XTQL Walkthrough

XTDB supports two query languages: SQL, and XTQL - our new data-oriented, cross-platform, composable query language inspired by Datalog and relational algebra. XTQL is described using JSON and is supported by various client language drivers that make writing XTQL ergonomic. Drivers are currently available for Java, Kotlin, and Clojure (with more to come soon!).

XTQL queries are structured pipeline of operations - e.g. scanning a table, joining multiple tables/queries, grouping/aggregations and sorting/limiting.

Let’s walk through a brief introduction to XTQL:

Basic operations

To read a table, we use from.

(from :users [{:xt/id user-id} first-name last-name])

{
  "from": "users",
  "bind": [ { "xt$id": { "xt:lvar": "userId" } }, "firstName", "lastName" ]
}

SELECT users.xt$id AS user_id, users.first_name, users.last_name FROM users

We can look up a single user-id by binding it in the from clause:

(from :users [{:xt/id "ivan"} first-name last-name])

{
  "from": "users",
  "bind": [ { "xt$id": "ivan" }, "firstName", "lastName" ]
}

SELECT users.first_name, users.last_name FROM users WHERE users.xt$id = 'ivan'

from is a valid query in isolation but, for anything more powerful, we’ll need a 'pipeline':

(-> (from :users [{:xt/id user-id} first-name last-name])
    (order-by last-name first-name)
    (limit 10))

[
  {
    "from": "users",
    "bind": [{ "xt$id": {"xt:lvar": "userId"}}, "firstName", "lastName"]
  },
  { "orderBy": ["lastName", "firstName"] },
  { "limit": 10 }
]

SELECT users.xt$id AS user_id, users.first_name, users.last_name
FROM users
ORDER BY last_name, first_name
LIMIT 10

XTQL operations are independently composable - unlike SQL/EDN Datalog, for example, you don’t need a whole 'SELECT, FROM, WHERE'/{:find …, :where …} frame every time.

Joins

Joins in XTQL are achieved using Datalog-style unification:

Within a unify, we use 'logic variables' (e.g. user-id, first-name etc in the above example) to specify how the inputs should be joined together.

In this case, we re-use the user-id logic variable to indicate that the :xt/id from the :users table should be matched with the :author-id of the :articles table.

(unify (from :users [{:xt/id user-id} first-name last-name])
       (from :articles [{:author-id user-id} title content]))

{
  "unify": [
    {
      "from": "users",
      "bind": [ { "xt$id": {"xt:lvar": "userId"}}, "firstName", "lastName" ]
    },
    {
      "from": "articles",
      "bind": [ { "authorId": {"xt:lvar": "userId"}}, "title", "content" ]
    }
  ]
}

SELECT u.xt$id AS user_id, u.first_name, u.last_name, a.title, a.content
FROM users u JOIN articles a ON u.xt$id = a.author_id

For non-equality cases, we can use a where clause (where we have a full SQL-inspired expression standard library at our disposal)

;; 'find me all the users who are the same age'
(unify (from :users [{:xt/id uid1} age])
       (from :users [{:xt/id uid2} age])
       (where (<> uid1 uid2)))

{
  "unify": [
    {
      "from": "users",
      "bind": [ { "xt$id": {"xt:lvar": "uid1"}}, "age" ]
    },
    {
      "from": "users",
      "bind": [ { "xt$id": {"xt:lvar": "uid2"}}, "age" ]
    },
    {
      "where": [ { "xt:call": "<>",
                   "args": [ {"xt:lvar": "uid1"}, {"xt:lvar": "uid2"} ] } ]
    }
  ]
}

SELECT u1.xt$id AS uid1, u2.xt$id AS uid2, u1.age
FROM users u1
  JOIN users u2 ON (u1.age = u2.age)
WHERE u1.xt$id <> u2.xt$id

We can specify that a certain match is optional using left-join:

(-> (unify (from :customers [{:xt/id cid}])
           (left-join (from :orders [{:xt/id oid, :customer-id cid} currency order-value])
                      [cid currency order-value]))
    (limit 100))

[
  {
    "unify": [
      {
        "from": "customers",
        "bind": [{ "xt$id": {"xt:lvar": "cid"}}]
      },
      {
        "leftJoin": {
          "from": "orders",
          "bind": [
            {"xt$id": {"xt:lvar": "oid"}},
            {"customerId": {"xt:lvar": "cid"}},
            "currency",
            "orderValue"
          ]
        },
        "bind": ["cid", "currency", "orderValue"]
      }
    ]
  },
  { "limit": 100 }
]

SELECT c.xt$id AS cid, o.currency, o.order_value
FROM customers c
  LEFT JOIN orders o ON (c.xt$id = o.customer_id)
LIMIT 100

Here, we’re asking to additionally return customers who haven’t yet any orders (for which the order-table columns will be absent in the results).

Or, we can specify that we only want to return customers who don’t have any orders, using not exists?:

(-> (unify (from :customers [{:xt/id cid}])
           (where (not (exists? (from :orders [{:customer-id $cid}])
                                {:args [cid]}))))
    (limit 100))

[
  {
    "unify": [
      { "from": "customers", "bind": [ { "xt$id": { "xt:lvar": "cid" } } ] },
      {
        "where": [
          {
            "xt:call": "not",
            "args": [ {
              "xt:exists": {
                "from": "orders",
                "bind": [ { "customerId": { "xt:param": "$cid" } } ]
              },
              "args": [ "cid" ]
            } ]
          }
        ]
      }
    ]
  },
  { "limit": 100 }
]

SELECT c.xt$id AS cid
FROM customers c
WHERE c.xt$id NOT IN (SELECT orders.customer_id FROM orders)
LIMIT 100

(naturally, IN is also available, but uses exists? instead)

Projections

We can create new columns from old ones using with:

(-> (from :users [first-name last-name])
    (with {:full-name (concat first-name " " last-name)}))

[
  {
    "from": "users",
    "bind": ["firstName", "lastName"]
  },
  {
    "with": [
      {
        "fullName": {
          "xt:call": "concat",
          "args": [{"xt:lvar": "firstName"}, " ", {"xt:lvar": "lastName"}]
        }
      }
    ]
  }
]

SELECT u.first_name, u.last_name, (u.first_name || ' ' || u.last_name) AS full_name
FROM users AS u

We can also use with within unify - this creates new logic variables which we can then unify in the same way.

Where with adds to the available columns, return only yields the specified columns to the next operation:

(-> (unify (from :users [{:xt/id user-id} first-name last-name])
           (from :articles [{:author-id user-id} title content]))
    (return {:full-name (concat first-name " " last-name)} title content))

[
  {
    "unify": [
      {
        "from": "users",
        "bind": [ { "xt$id": {"xt:lvar": "userId"}}, "firstName", "lastName" ]
      },
      {
        "from": "articles",
        "bind": [ { "authorId": {"xt:lvar": "userId"}}, "title", "content" ]
      }
    ]
  },
  {
    "return": [
      {
        "fullName": {
          "xt:call": "concat",
          "args": [{"xt:lvar": "firstName"}, " ", {"xt:lvar": "lastName"}]
        }
      },
      "title",
      "content"
    ]
  }
]

SELECT (u.first_name || ' ' || u.last_name) AS full_name, a.title, a.content
FROM users AS u
  JOIN articles a ON u.xt$id = a.author_id

Where we don’t need any additional projections, we can use without:

(-> (unify (from :users [{:xt/id user-id} first-name last-name])
           (from :articles [{:author-id user-id} title content]))
    (without :user-id))

[
  {
    "unify": [
      {
        "from": "users",
        "bind": [ { "xt$id": {"xt:lvar": "userId"}}, "firstName", "lastName" ]
      },
      {
        "from": "articles",
        "bind": [ { "authorId": {"xt:lvar": "userId"}}, "title", "content" ]
      }
    ]
  },
  { "without": [ "userId" ] }
]

SELECT u.first_name, u.last_name, a.title, a.content
FROM users AS u
  JOIN articles a ON u.xt$id = a.author_id

Aggregations

To count/sum/average values, we use aggregate:

(-> (unify (from :customers [{:xt/id cid}])
           (left-join (from :orders [{:xt/id oid :customer-id cid} currency order-value])
                      [oid cid currency order-value]))
    (aggregate cid currency
               {:order-count (count oid)
                :total-value (sum order-value)})
    (with {:total-value (coalesce total-value 0)})
    (order-by {:val total-value :dir :desc})
    (limit 100))

[
  {
    "unify": [
      {
        "from": "customers",
        "bind": [ { "xt$id": {"xt:lvar": "cid"} } ]
      },
      {
        "leftJoin": {
          "from": "orders",
          "bind": [ { "xt$id": {"xt:lvar": "oid"}}, {"customerId": {"xt:lvar": "cid"}}, "currency", "orderValue"]
        },
        "bind": [ "oid", "cid", "currency", "orderValue" ]
      }
    ]
  },
  {
    "aggregate": [
      "cid",
      "currency",
      { "orderCount": { "xt:call": "count", "args": [ {"xt:lvar": "oid"} ] } },
      { "totalValue": { "xt:call": "sum", "args": [ {"xt:lvar": "orderValue"} ] } } ]
  },
  {
     "with": [
       { "totalValue": { "xt:call": "coalesce", "args": [ { "xt:lvar": "totalValue" }, 0 ] } }
     ]
  },
  { "orderBy": [ { "val": {"xt:lvar": "totalValue"}, "dir": "desc" } ] },
  { "limit": 100 }
]

SELECT c.xt$id AS cid, o.currency, COUNT(o.xt$id) AS order_count, COALESCE(SUM(o.order_value), 0) AS total_value
FROM customers c
  LEFT JOIN orders o ON (c.xt$id = o.customer_id)
GROUP BY c.xt$id, o.currency
ORDER BY total_value DESC
LIMIT 100

'Pull'

When we’ve found the documents we’re interested in, it’s common to then want a tree of related information. For example, if a user is reading an article, we might also want to show them details about the author as well as any comments.

(Users of existing EDN Datalog databases may already be familiar with 'pull' - in XTQL, because subqueries are a first-class concept, we rely on extensively on these to express a more powerful/composable behaviour.)

(-> (from :articles [{:xt/id article-id} title content author-id])

    (with {:author (pull (from :authors [{:xt/id $author-id} first-name last-name])
                         {:args [author-id]})


           :comments (pull* (-> (from :comments [{:article-id $article-id} created-at comment])
                                (order-by {:val created-at :dir :desc})
                                (limit 10))
                            {:args [article-id]})}))

;; => [{:title "...", :content "...",
;;      :author {:first-name "...", :last-name "..."}
;;      :comments [{:comment "...", :name "..."}, ...]}]

;; SQL?
;; Yeah, so, erm, 'left as an exercise to the reader'? 😅
;; Lots of nested left-joins, array-aggs, and vendor-specific JSON functions required there to guarantee this same output.

[
  {
    "from": "articles",
    "bind": [{"xt$id": {"xt:lvar": "articleId"}}, "title", "content", "authorId"]
  },
  {
    "with": [
      {
        "author": {
          "xt:pull": {
            "from": "authors",
            "bind": [{"xt$id": {"xt:param": "$authorId"}}, "firstName", "lastName"]
          },
          "args": ["authorId"]
        }
      },
      {
        "comments": {
          "xt:pullMany": [
            {
              "from": "comments",
              "bind": [ { "articleId": { "xt:param": "$articleId" } }, "createdAt", "comment" ]
            },
            { "orderBy": [ { "val": { "xt:lvar": "createdAt" }, "dir": "desc" } ] },
            { "limit": 10 }
          ],

          "args": ["articleId"]
        }
      }
    ]
  }
]

In this example, we use pull to pull back a single map - we know that there’s only one author per article (in our system). When it’s a one-to-many relationship, we use pull* - this returns any matches in a vector.

Also note that, because we have the full power of subqueries, we can express requirements like 'only get me the most recent 10 comments' using ordinary query operations, without any support within pull itself.

Bitemporality

It wouldn’t be XTDB without bitemporality, of course - indeed, some may be wondering how I’ve gotten this far without mentioning it!

(I’ll assume you’re roughly familiar with bitemporality for this section. If not, forgive me - we’ll follow this up with more XTDB 2.x bitemporality content soon!)

In XTDB 1.x, queries had to be 'point-in-time' - you had to pick a single valid/transaction time for the whole query.

In XTQL, while there are sensible defaults set for the whole query, you can override this on a per-from basis by wrapping the table name in a vector and providing temporal parameters:

(from :users {:for-valid-time (at #inst "2020-01-01")
              :bind [first-name last-name]})

(from :users {:for-valid-time :all-time
              :bind [first-name last-name]})

{
  "from": "users",
  "forValidTime": { "at": { "@value": "2020-01-01", "@type": "xt:date" } },
  "bind": [ "firstName", "lastName" ]
}

{
  "from": "users",
  "forValidTime": "allTime",
  "bind": [ "firstName", "lastName" ]
}

SELECT users.first_name, users.last_name FROM users FOR VALID_TIME AS OF DATE '2020-01-01'

SELECT users.first_name, users.last_name FROM users FOR ALL VALID_TIME

You can also specify (from <time>), (to <time>) or (in <from-time> <to-time>), to give fine-grained, in-query control over the history returned for the given rows.
System time (formerly 'transaction time', renamed for consistency with SQL:2011) is filtered in the same map with :for-system-time.

This means that you can (for example) query the same table at two points-in-time in the same query - 'who worked here in both 2018 and 2023':

(unify (from :users {:for-valid-time (at #inst "2018")
                     :bind [{:xt/id user-id}]})

       (from :users {:for-valid-time (at #inst "2023")
                     :bind [{:xt/id user-id}]}))

{
  "unify": [
    {
      "from": "users",
      "forValidTime": { "at": { "@value": "2018-01-01", "@type": "xt:date" } },
      "bind": [ { "xt$id": {"xt:lvar": "userId"}} ]
    }, {
      "from": "users",
      "forValidTime": { "at": { "@value": "2023-01-01", "@type": "xt:date" } },
      "bind": [ { "xt$id": {"xt:lvar": "userId"}} ]
    }
  ]
}

DML

XTQL can also be used to write to XTDB using XTQL DML.

It uses the same query language as above, with a small wrapper for each of the operations. We’re hoping that a reasonable proportion of use-cases that previously required transaction functions to be installed and invoked can now submit DML operations instead.

Insert

We submit insert-into operations to xt/submit-tx. insert-into accepts a query that inserts every result into the given table:

(xt/submit-tx node
  [[:insert-into :users
    '(from :old-users [xt/id {:given-name first-name, :surname last-name}
                       xt/valid-from xt/valid-to])]])

{
  "insertInto": "users",
  "query": {
    "from": "oldUsers",
    "bind": [
      "xt$id",
      { "givenName": { "xt:lvar": "firstName" } },
      { "surname": { "xt:lvar": "lastName" } },
      "xt$validFrom",
      "xt$validTo"
    ]
  }
}

-- we omit the submission boilerplate in the SQL equivalents

INSERT INTO users
SELECT ou.xt$id, ou.given_name AS first_name, ou.surname AS last_name,
       ou.xt$valid_from, ou.xt$valid_to
FROM old_users AS ou

Here we’re preserving the valid-from and valid-to of the rows we’re copying, but we could just as easily specify/calculate a different value. Naturally, substitute your query of choice from above for the from (unify, for example) for an arbitrarily powerful XTQL command.

(As in XT2 SQL, XTQL insert-into behaves more like an upsert - if a document with that xt/id already exists, it will be overwritten for the specified period of valid-time.)

Delete

We can delete documents using queries as well.

For example, to delete all of the comments on a given post, we can submit the following XTQL:

(defn delete-a-post [node the-post-id]
  (xt/submit-tx node
    [[:delete {:from :comments, :bind '[{:post-id $post-id}]}
      {:post-id the-post-id}]]))

{
  "deleteFrom": "comments",
  "bind": [ { "postId": { "xt:param": "$postId" } } ],
  "argRows": [ { "postId": 1 } ]
}

DELETE FROM comments WHERE comments.post_id = ?

Here, we’re passing a dynamic parameter to specify the post to delete.

delete can also take a collection of extra unify clauses. Let’s say instead we wanted to delete all comments on posts by a certain author - we can do that as follows:
```
(xt/submit-tx node
  [[:delete '{:from :comments
              :bind [{:post-id pid}]
              :unify [(from :posts [{:xt/id pid, :author-id $author}])]}
    {:author "ivan"}]])
```
```
DELETE FROM comments
WHERE comments.post_id IN (SELECT posts.xt$id FROM posts WHERE posts.author_id = ?)
```
Additional clauses passed to delete in this way are unified, in the same way variables within a unify are. In this case, because we’ve re-used the pid logic variable, XT implicitly joins the two values - the post-id from the comments table with the xt/id on the posts table.

We can specify a valid-time range to delete using for-valid-time, in a similar manner to the from clause.

For example, if we want to take down all Christmas promotions on the 26th December (but we don’t want to be in the office to actually delete them), we could run the following:

(xt/submit-tx node
  [[:delete '{:from :promotions
              :for-valid-time (from #inst "2023-12-26")
              :bind [{:promotion-type "christmas"}]}]])

{
  "deleteFrom": "promotions",
  "forValidTime": { "from": { "@type": "xt:instant", "@value": "2023-12-26T00:00:00Z" } },
  "bind": [ { "promotionType": "christmas" } ]
}

DELETE FROM promotions
FOR PORTION OF VALID_TIME FROM DATE '2023-12-26' TO END_OF_TIME
WHERE promotions.promotion_type = 'christmas'

Finally, we can delete every entry in a table by omitting any :bind predicates:

(xt/submit-tx node
  [[:delete {:from :comments}]])

{ "deleteFrom": "comments" }

DELETE FROM comments

Update

Update operations are almost exactly the same as deletes, except - well, they update the document rather than deleting it, obviously - but they additionally take a set clause.

(xt/submit-tx node
  [[:update '{:table :documents
              :bind [{:xt/id $doc-id, :version v}]
              :set {:version (+ v 1)}}
    {:doc-id "doc-id"}]])

{
  "update": "documents",
  "bind": [ { "xt/id": { "xt:param": "$doc-id" } }, { "version": { "xt:lvar": "v" } } ],
  "set": [
    {
      "version": {
        "xt:call": "+",
        "args": [ { "xt:lvar": "v" }, 1 ]
      }
    }
  ],
  "argRows": [ { "doc-id": "doc-id" } ]
}

UPDATE documents
SET version = documents.version + 1
WHERE documents.xt$id = ?

Like deletes, updates also accept :for-valid-time and extra sub-queries to join from. You can, for example, copy a value from another related table, or even update a denormalised value:

(xt/submit-tx node
              [[:put-docs :comments {:xt/id (random-uuid), :post-id "my-post-id"}]
               [:update '{:table :posts
                          :bind [{:xt/id $post-id}]
                          :unify [(with {cc (q (-> (from :comments [{:post-id $post-id}])
                                                   (aggregate {:cc (row-count)})))})]
                          :set {:comment-count cc}}
                {:post-id "my-post-id"}]])

{
  "update": "posts",
  "set": [ { "commentCount": { "xt:lvar": "cc" } } ],
  "bind": [ { "xt$id": { "xt:param": "$postId" } } ],

  "unify": [ {
    "with": [
      {
        "cc": {
          "xt:q": [
            {
              "from": "comments",
              "bind": [ { "postId": { "xt:param": "$postId" } } ]
            },
            {
              "aggregate": [ { "cc": { "xt:call": "rowCount", "args": [] } } ]
            }
          ]
        }
      }
    ]
  } ],

  "argRows": [ { "postId": "my-post-id" } ]
}

INSERT INTO comments (xt$id, post_id) VALUES (?, ?)

UPDATE posts AS p
SET comment_count = (SELECT COUNT(*) FROM comments WHERE comments.post_id = ?)
WHERE p.post_id = ?

This is the operation that we envisage replacing a lot of XT1 transaction functions - it makes it trivial to patch a single value within a document, for example.

Erase

We can irretrievably erase a document using an erase query.

Erases also look the same as deletes, but these don’t support :for-valid-time:

(xt/submit-tx node
  [[:erase {:from :users, :bind '[{:email "jms@example.com"}]}]])

{
  "eraseFrom": "users",
  "bind": [ { "email": "jms@example.com" } ]
}

ERASE FROM users WHERE users.email = 'jms@example.com'

Assert

assert-exists and assert-not-exists can be used to assert the state of the database during a transaction. If an assertion fails, the whole transaction rolls back. This can be used to check pre-conditions, for example.

This query asserts that no user has the email james@example.com before inserting a user that does:

(xt/submit-tx node
              [[:assert-not-exists '(from :users [{:email $email}])
                {:email "james@example.com"}]

               [:put-docs :users {:xt/id :james, :email "james@example.com"}]])

{
  "assertNotExists": {
    "from": "users",
    "bind": [ { "email": { "xt:param": "$email" } } ]
  },

  "argRows": [ { "email": "james@example.com" } ]
}

-- not implemented yet
ASSERT NOT EXISTS (SELECT 1 FROM users WHERE email = 'james@juxt.pro')

You can check the xt/txs table to see whether and why a transaction failed:

(xt/q node '(from :xt/txs [{:xt/id $tx-id} xt/committed? xt/error])
      {:args {:tx-id my-tx-id}})

;; =>

[{:xt/committed? false
  :xt/error (err/runtime-err :xtdb/assert-failed
                             {::err/message "Precondition failed: assert-not-exists"
                              :row-count 1})}]

// The transaction payload
[
  {
    "assertNotExists": {
      "from": "users",
      "bind": [ { "email": { "xt:param": "$email" } } ]
    },

    "argRows": [ { "email": "james@example.com" } ]
  }
  {
    "into": "docs",
    "putDocs": [ { "xt/id": "james", "email": "james@example.com" } ]
  }
]

// querying the tx/txs table
{
  "from": "xt$txs",
  "bind": [ { "xt$id": { "xt:param": "$txId" } }, "xt$committed?", "xt$error" ]
}

// result
{
  "xt$committed?": false,
  "xt$error": {
    "@type": "xt:error",
    "@value": {
      "xtdb.error/message": "Precondition failed: assert-not-exists",
      "xtdb.error/class": "xtdb.RuntimeException",
      "xtdb.error/error-key": "xtdb/assert-failed",
      "xtdb.error/data": {
        "row-count": 1
      }
    }
  }
}

Those familiar with XT1 might recognise this as ::xt/match - albeit more powerful because you have the full query language available rather than just matching a whole document.