(from :users [first-name last-name])
Introducing XTQL
In Clojure, XTDB is queryable using two query languages: SQL and XTQL.
XTQL is our new, data-oriented, composable query language, inspired by the strong theoretical bases of both Datalog and relational algebra. These two combine to create a joyful, productive, interactive development experience, with the ability to build queries iteratively, testing and debugging smaller parts in isolation.
-
It is designed to be highly amenable to dynamic query generation - we believe that our industry has spent more than enough time trying to generate SQL strings (not to mention the concomitant security vulnerabilities).
-
It can be used for both queries and transactions.
Let’s start with XTQL queries:
'Operators' and 'relations'
XTQL is built up of small, composable 'operators', which combine together using 'pipelines' into larger queries.
-
'Source' operators (e.g. 'read from a table') each yield a 'relation' - an unordered bag of rows[1].
-
'Tail' operators (e.g. 'filter a relation', 'calculate extra fields') transform a relation into another relation.
From these simple operators, we can build arbitrarily complex queries.
Our first operator is from
:
from
The from
operator allows us to read from an XTDB table.
In this first example, we’re reading the first-name and last-name fields from the users
table - i.e. SELECT first_name, last_name FROM users
:
It’s in the from
operator that we specify the temporal filter for the table.
By default, this shows the table at the current time, but it can be overridden:
-
to view the table at another point in time
-
to view the changes to the table within a given range
-
to view the entire history of the table
(from :users {:bind [first-name last-name]
;; at another point in time
:for-valid-time (at #inst "2023-01-01")
;; within a given range
:for-valid-time (in #inst "2023-01-01", #inst "2024-01-01")
:for-valid-time (from #inst "2023-01-01")
:for-valid-time (to #inst "2024-01-01")
;; for all time
:for-valid-time :all-time
;; and all of the above :for-system-time too.
})
In the from
operator, we can also rename columns, and filter rows based on field values:
-
We rename a column using a binding map:
(from :users [{:xt/id user-id} first-name last-name])
SELECT _id AS user_id, first_name, last_name FROM users
-
We can look up a single user-id by specifying a literal in the binding map:
(from :users [{:xt/id "ivan"} first-name last-name])
SELECT first_name, last_name FROM users WHERE _id = 'ivan'
Another source operator is rel
, which allows you to specify an inline relation.
You can check out the source operators reference for more details.
Pipelines
We can then transform the rows in a table using tail operators, which we pass in an operator 'pipeline'. Pipelines consist of a single source operator, and then arbitrarily many tail operators.
Here, we demonstrate SELECT first_name, last_name FROM users ORDER BY last_name, first_name LIMIT 10
, introducing the 'order by' and 'limit' operators:
In Clojure, we use ->
to denote a pipeline - in a similar vein to the threading macro in Clojure 'core' [2], we take one source operator and then pass it through a series of transformations.
(-> (from :users [first-name last-name])
(order-by last-name first-name)
(limit 10))
By building queries using pipelines, we are now free to build these up incrementally, trivially re-use parts of pipelines in different queries, or temporarily disable some operators to test parts of the pipeline in isolation.
Other tail operators include where
(to filter rows), return
(to specify the columns to output), with
(to add additional columns based on the existing ones), and aggregate
(grouping rows - counts, sums, etc).
For a full list, see the tail operators reference.
Multiple tables - introducing unify
Joining multiple tables in XTQL is achieved using Datalog-based 'unification'.
We introduce the unify
source operator, which takes an unordered bag of input relations and joins them together using 'unification constraints' (similar to join conditions).
Each input relation (e.g. from
) defines a set of 'logic variables' in its bindings.
If a logic variable appears more than once within a single unify clause, the results are constrained such that the logic variable has the same value everywhere it’s used.
This has the effect of imposing 'join conditions' over the inputs.
-
In this case, we re-use the
user-id
logic variable to indicate that the:xt/id
from the:users
table should be matched with the:author-id
of the:articles
table.(unify (from :users [{:xt/id user-id} first-name last-name]) (from :articles [{:author-id user-id} title content]))
SELECT u._id AS user_id, u.first_name, u.last_name, a.title, a.content FROM users u JOIN articles a ON u._id = a.author_id
-
For non-equality cases, we can use a
where
clause (where we have a full SQL-inspired expression standard library at our disposal);; 'find me all the users who are the same age' (unify (from :users [{:xt/id uid1} age]) (from :users [{:xt/id uid2} age]) (where (<> uid1 uid2)))
SELECT u1._id AS uid1, u2._id AS uid2, u1.age FROM users u1 JOIN users u2 ON (u1.age = u2.age) WHERE u1._id <> u2._id
-
We can specify that a certain match is optional using
left-join
:(-> (unify (from :customers [{:xt/id cid}]) (left-join (from :orders [{:xt/id oid, :customer-id cid} currency order-value]) [cid currency order-value])) (limit 100))
SELECT c._id AS cid, o.currency, o.order_value FROM customers c LEFT JOIN orders o ON (c._id = o.customer_id) LIMIT 100
Here, we’re asking to additionally return customers who haven’t yet any orders (for which the order-table columns will be absent in the results).
-
Or, we can specify that we only want to return customers who don’t have any orders, using
not
exists?
:(-> (unify (from :customers [{:xt/id cid}]) (where (not (exists? (from :orders [{:customer-id $cid}]) {:args [cid]})))) (limit 100))
SELECT _id AS cid FROM customers c WHERE _id NOT IN (SELECT orders.customer_id FROM orders) LIMIT 100
The unify
operator accepts 'unify clauses' - e.g. from
, where
, with
, join
, left-join
- a full list of which can be found in the unify clause reference guide.
Projections
-
We can create new columns from old ones using
with
:(-> (from :users [first-name last-name]) (with {:full-name (concat first-name " " last-name)}))
SELECT first_name, last_name, (first_name || ' ' || last_name) AS full_name FROM users AS u
-
Where
with
adds to the available columns,return
only yields the specified columns to the next operation:(-> (unify (from :users [{:xt/id user-id} first-name last-name]) (from :articles [{:author-id user-id} title content])) (return {:full-name (concat first-name " " last-name)} title content))
SELECT (u.first_name || ' ' || u.last_name) AS full_name, a.title, a.content FROM users AS u JOIN articles a ON u._id = a.author_id
-
Where we don’t need any additional projections, we can use
without
:(-> (unify (from :users [{:xt/id user-id} first-name last-name]) (from :articles [{:author-id user-id} title content])) (without :user-id))
SELECT u.first_name, u.last_name, a.title, a.content FROM users AS u JOIN articles a ON u._id = a.author_id
Aggregations
To count/sum/average values, we use aggregate
:
(-> (unify (from :customers [{:xt/id cid}])
(left-join (from :orders [{:xt/id oid :customer-id cid} currency order-value])
[oid cid currency order-value]))
(aggregate cid currency
{:order-count (count oid)
:total-value (sum order-value)})
(with {:total-value (coalesce total-value 0)})
(order-by {:val total-value :dir :desc})
(limit 100))
SELECT c._id AS cid, o.currency, COUNT(o._id) AS order_count, COALESCE(SUM(o.order_value), 0) AS total_value
FROM customers c
LEFT JOIN orders o ON (c._id = o.customer_id)
GROUP BY c._id, o.currency
ORDER BY total_value DESC
LIMIT 100
'Pull'
When we’ve found the documents we’re interested in, it’s common to then want a tree of related information. For example, if a user is reading an article, we might also want to show them details about the author as well as any comments.
(Users of existing EDN Datalog databases may already be familiar with 'pull' - in XTQL, because subqueries are a first-class concept, we rely on extensively on these to express a more powerful/composable behaviour.)
(-> (from :articles [{:xt/id article-id} title content author-id])
(with {:author (pull (from :authors [{:xt/id $author-id} first-name last-name])
{:args [author-id]})
:comments (pull* (-> (from :comments [{:article-id $article-id} created-at comment])
(order-by {:val created-at :dir :desc})
(limit 10))
{:args [article-id]})}))
;; => [{:title "...", :content "...",
;; :author {:first-name "...", :last-name "..."}
;; :comments [{:comment "...", :name "..."}, ...]}]
-- using XTDB's 'NEST_ONE'/'NEST_MANY'
FROM articles AS a
SELECT _id AS article_id, title, content, author_id,
NEST_ONE(FROM authors WHERE _id = a.author_id
SELECT first_name, last_name)
AS author,
NEST_MANY(FROM comments WHERE article_id = a._id
SELECT created_at, comment
ORDER BY created_at DESC
LIMIT 10)
AS comments
In this example, we use pull
to pull back a single map - we know that there’s only one author per article (in our system).
When it’s a one-to-many relationship, we use pull*
- this returns any matches in a vector.
Also note that, because we have the full power of subqueries, we can express requirements like 'only get me the most recent 10 comments' using ordinary query operations, without any support within pull
itself.
Bitemporality
It wouldn’t be XTDB without bitemporality, of course - indeed, some may be wondering how I’ve gotten this far without mentioning it!
(I’ll assume you’re roughly familiar with bitemporality for this section. If not, forgive me - we’ll follow this up with more XTDB 2.x bitemporality content soon!)
-
In XTDB 1.x, queries had to be 'point-in-time' - you had to pick a single valid/transaction time for the whole query.
In XTQL, while there are sensible defaults set for the whole query, you can override this on a per-
from
basis by wrapping the table name in a vector and providing temporal parameters:(from :users {:for-valid-time (at #inst "2020-01-01") :bind [first-name last-name]}) (from :users {:for-valid-time :all-time :bind [first-name last-name]})
SELECT first_name, last_name FROM users FOR VALID_TIME AS OF DATE '2020-01-01' SELECT first_name, last_name FROM users FOR ALL VALID_TIME
-
You can also specify
(from <time>)
,(to <time>)
or(in <from-time> <to-time>)
, to give fine-grained, in-query control over the history returned for the given rows. -
System time (formerly 'transaction time', renamed for consistency with SQL:2011) is filtered in the same map with
:for-system-time
.
-
-
This means that you can (for example) query the same table at two points-in-time in the same query - 'who worked here in both 2018 and 2023':
(unify (from :users {:for-valid-time (at #inst "2018") :bind [{:xt/id user-id}]}) (from :users {:for-valid-time (at #inst "2023") :bind [{:xt/id user-id}]}))
DML
XTQL can also be used to write to XTDB using XTQL DML.
It uses the same query language as above, with a small wrapper for each of the operations. We’re hoping that a reasonable proportion of use-cases that previously required transaction functions to be installed and invoked can now submit DML operations instead.
Insert
We submit insert-into
operations to xt/submit-tx
.
insert-into
accepts a query that inserts every result into the given table:
(xt/submit-tx node
[[:insert-into :users
'(from :old-users [xt/id {:given-name first-name, :surname last-name}
xt/valid-from xt/valid-to])]])
-- we omit the submission boilerplate in the SQL equivalents
INSERT INTO users
SELECT _id, given_name AS first_name, surname AS last_name,
_valid_from, _valid_to
FROM old_users
Here we’re preserving the valid-from and valid-to of the rows we’re copying, but we could just as easily specify/calculate a different value.
Naturally, substitute your query of choice from above for the from
(unify
, for example) for an arbitrarily powerful XTQL command.
(As in XT2 SQL, XTQL insert-into
behaves more like an upsert - if a document with that xt/id
already exists, it will be overwritten for the specified period of valid-time.)
Delete
We can delete documents using queries as well.
-
For example, to delete all of the comments on a given post, we can submit the following XTQL:
(defn delete-a-post [node the-post-id] (xt/submit-tx node [[:delete {:from :comments, :bind '[{:post-id $post-id}]} {:post-id the-post-id}]]))
DELETE FROM comments WHERE post_id = ?
Here, we’re passing a dynamic parameter to specify the post to delete.
-
delete
can also take a collection of extraunify
clauses. Let’s say instead we wanted to delete all comments on posts by a certain author - we can do that as follows:(xt/submit-tx node [[:delete '{:from :comments :bind [{:post-id pid}] :unify [(from :posts [{:xt/id pid, :author-id $author}])]} {:author "ivan"}]])
DELETE FROM comments WHERE post_id IN (SELECT _id FROM posts WHERE author_id = ?)
-
We can specify a valid-time range to delete using
for-valid-time
, in a similar manner to thefrom
clause.For example, if we want to take down all Christmas promotions on the 26th December (but we don’t want to be in the office to actually delete them), we could run the following:
(xt/submit-tx node [[:delete '{:from :promotions :for-valid-time (from #inst "2023-12-26") :bind [{:promotion-type "christmas"}]}]])
DELETE FROM promotions FOR PORTION OF VALID_TIME FROM DATE '2023-12-26' TO END_OF_TIME WHERE promotion_type = 'christmas'
-
Finally, we can delete every entry in a table by omitting any
:bind
predicates:(xt/submit-tx node [[:delete {:from :comments}]])
DELETE FROM comments
Update
-
Update operations are almost exactly the same as deletes, except - well, they update the document rather than deleting it, obviously - but they additionally take a
set
clause.(xt/submit-tx node [[:update '{:table :documents :bind [{:xt/id $doc-id, :version v}] :set {:version (+ v 1)}} {:doc-id "doc-id"}]])
UPDATE documents SET version = version + 1 WHERE _id = ?
-
Like deletes, updates also accept
:for-valid-time
and extra sub-queries to join from. You can, for example, copy a value from another related table, or even update a denormalised value:(xt/submit-tx node [[:put-docs :comments {:xt/id (random-uuid), :post-id "my-post-id"}] [:update '{:table :posts :bind [{:xt/id $post-id}] :unify [(with {cc (q (-> (from :comments [{:post-id $post-id}]) (aggregate {:cc (row-count)})))})] :set {:comment-count cc}} {:post-id "my-post-id"}]])
INSERT INTO comments (_id, post_id) VALUES (?, ?) UPDATE posts AS p SET comment_count = (SELECT COUNT(*) FROM comments WHERE post_id = ?) WHERE post_id = ?
This is the operation that we envisage replacing a lot of XT1 transaction functions - it makes it trivial to patch a single value within a document, for example.
Erase
We can irretrievably erase a document using an erase
query.
-
Erases also look the same as deletes, but these don’t support
:for-valid-time
:(xt/submit-tx node [[:erase {:from :users, :bind '[{:email "jms@example.com"}]}]])
ERASE FROM users WHERE email = 'jms@example.com'
Assert
assert-exists
and assert-not-exists
can be used to assert the state of the database during a transaction.
If an assertion fails, the whole transaction rolls back.
This can be used to check pre-conditions, for example.
This query asserts that no user has the email james@example.com
before inserting a user that does:
(xt/submit-tx node
[[:assert-not-exists '(from :users [{:email $email}])
{:email "james@example.com"}]
[:put-docs :users {:xt/id :james, :email "james@example.com"}]])
-- not implemented yet
ASSERT NOT EXISTS (SELECT 1 FROM users WHERE email = 'james@example.com')
You can check the xt/txs
table to see whether and why a transaction failed:
(xt/q node '(from :xt/txs [{:xt/id $tx-id, :committed committed?} error])
{:args {:tx-id my-tx-id}})
;; =>
[{:committed? false
:error (err/runtime-err :xtdb/assert-failed {::err/message "Assert failed"})}]
Those familiar with XT1 might recognise this as ::xt/match
- albeit more powerful because you have the full query language available rather than just matching a whole document.
For more information
Congratulations - this is the majority of the theory behind XTQL! You now understand the fundamentals behind how to construct XTQL queries from its simple building blocks - from here, it’s much more about incrementally learning what each individual operator does, and what it looks like in your client language.
You can:
-
check out the reference guides for XTQL queries and transactions.
We’re very much in listening mode right now - as a keen early adopter, we’d love to hear your first impressions, thoughts and opinions on where we’re headed with XTQL. Please do get in touch via the usual channels!