Skip to content

SQL Quickstart

The Basics

Insert a row into a new table

2024-01-01
Run
Open in xt-play

Things to note:

  • Tables in XTDB may be created dynamically during INSERT, where all columns and types are inferred automatically.
  • The only schema requirement is that every table in XTDB requires a user-provided _id primary key column, but all other columns are optional and dynamically typed by default. This means each row in XTDB can offer the flexibility of a document in a document database.
  • The _ prefix convention (e.g. _id) is used for reserved columns and tables that XTDB handles automatically. For full details see How XTDB works.

Query for that same row

Querying this data back again is a simple matter of:

Run
Open in xt-play

Evolve the table

If we now INSERT another row with a slightly different shape, we can see that XTDB automatically handles the implicitly extended schema and allows us to return both rows:

2024-01-02
Run
Open in xt-play

Handling ‘documents’

XTDB is designed to work with JSON-like nested data as a first-class concept (i.e. not restricted to JSON or JSONB types). This means you can easily handle complex document-like nested data. Here the RECORDS syntax, combined with the ‘upsert’ behavior of the INSERT, avoids the chore of having to be explicit about all the individual columns involved:

2024-01-03
Run
Open in xt-play

You can then query this nested data intuitively:

Run
Open in xt-play

You can also observe the inferred schema using the SQL standard’s INFORMATION_SCHEMA facilities:

Run
Open in xt-play

Capturing fast-changing data may be powerful, but what if we made a mistake somewhere and wanted to undo a change? The thing that makes XTDB most interesting is the approach to immutability and time-travel…

A log of transactions

XTDB provides a Postgres wire protocol compatibility endpoint that enables developers to re-use many existing tools and drivers that have been built for connecting to real Postgres servers.

A key distinction between interacting with Postgres and interacting with XTDB (e.g. using psql or otherwise) is that all clients connected to XTDB operate in a ‘stateless’ manner that:

  1. forces all writes to be fully serialized into a single, system-wide log of durably-recorded transactions and, consequently,
  2. precludes the use of ‘interactive transactions’ (i.e. clients can’t run queries in the middle of multi-statement transactions)

The complete transaction log history is permanently stored within the system-maintained xt.txs table:

Run
Open in xt-play

XTDB relies on the immutable, append-only nature of the transaction log and its timestamps to automatically record all changes across the database. This approach means that queries can use the reliable ordering of the log to query previous states of the database using nothing more than a timestamp. More on that next!

Query the past

Basis: re-run queries against past states without explicit snapshots

Unlike in a typical SQL database, UPDATE and DELETE operations in XTDB are non-destructive, meaning previous versions of records are retained automatically and previous states of the entire database can be readily accessed.

A query like SELECT * FROM people will only show the ‘current’ version of everything by default.

Let’s try deleting fred from the database…

2024-01-04
Run
Open in xt-play

Despite having seemingly deleted the latest version of the fred record, the prior two versions are not lost and can be retrieved using a couple of methods.

The simplest way to observe the prior version of the fred record is to re-run the exact same query against an earlier ‘basis’.

Run
Open in xt-play

Here we set the basis for the whole query by adjusting DEFAULT SYSTEM_TIME. This allows us to view exactly what the database looked like at the beginning of 2020. The mechanics of SYSTEM_TIME are discussed in the next section, but they underpin the notion of basis.

As it happens, the database had no records in 2020 so no results are returned.

Now try setting the basis to a point in time shortly after our initial transaction:

Run
Open in xt-play

We can see that on the 1st (our first INSERT) we have our original fred row, but try changing the date to the 2nd and the bob row appears!

The concept of basis makes querying consistently across a scaled-out cluster of read replicas very simple. Independent applications can use the same basis to observe the same database state, regardless of which XTDB node they are connected to. For more information on the implications of the log-oriented design, see How XTDB works or take a look at this blog post by LinkedIn Engineering.

System-Time Columns: automatic time-versioning of rows without audit tables

The mechanism underpinning the basis concept is called ‘System Time’. This is what ensures that changes to data in XTDB are immutable, so you always have access to prior states.

The SQL:2011 model of System Time and ‘temporal tables’ is baked into the core design of XTDB, and has been simplified such that you don’t need to learn new syntax or clutter your SQL to take advantage of the immutability benefits.

This means that all INSERT, UPDATE and DELETEs are automatically versioned - you can write SQL intuitively and never lose data again!

You can avoid ever needing to reach for backups or ETL integrations with data warehousing systems in order to recover or make use of previous data. It also helps avoid complicating application schemas with things like “soft delete” columns, audit tables and append-only tables.

The built-in system-time columns _system_from and _system_to are hidden from view by default but, when specified, can be accessed on every table using regular SQL:

Run
Open in xt-play

_system_from can take the place of the modified_at columns found across many application schemas. More details about these columns and how they are maintained can be found in How XTDB Works.

The full system-time history for a set of records in a table can be retrieved by specifying FOR SYSTEM_TIME ALL after the table reference:

Run
Open in xt-play

You can also run queries against individual tables at specific timestamps using FOR SYSTEM_TIME AS_OF <timestamp>, use temporal period operators (OVERLAPS, PRECEDES etc.) to understanding how data has change over time, and much more - see the SQL reference documentation.

Here are some useful capabilities these temporal eatures enable…

A delta of changes to a table since a given system-time

Run
Open in xt-play

Restore a deleted row

Because XTDB retains history, the regular SQL DELETE statement is essentially performing a ‘soft delete’ (“An operation in which a flag is used to mark data as unusable, without erasing the data itself from the database” …but here it’s first-class and ubiquitous).

Here’s how we can bring fred back to being visible and active in our database:

2024-01-05
Run
Open in xt-play

ERASE as ‘hard’ delete

Sometimes you do really want to forget the past though, and for circumstances where data does need to be erased (“hard deleted”), an ERASE operation is provided:

2024-01-06
Run
Open in xt-play

The ERASE is effective as soon as the transaction is committed - no longer accessible to an application - and under the hood the relevant data is guaranteed to be fully erased only once all background index processing has completed and the changes have been written to the remote object storage.

Your basic training is almost complete!

With everything covered so far, you are already well-versed in the main benefits of XTDB.

Really there is only one more topic left to examine before you are familiar with all the novel SQL functionality XTDB has to offer…

Control the timeline

Everything demonstrated so far only scratches the surface of what XTDB can do, given that XTDB is a full SQL implementation with all the implications that has, however there is one further aspect where XTDB is very different to most databases: ubiquitous ‘Valid-Time’ versioning.

Valid-Time is for advanced time-travel

In addition to system-time versioning, SQL:2011 also defines ‘application-time’ versioning. XTDB applies this versioning to all tables and refers to it as valid-time.

Valid-time is a key tool for developers who need to offer time-travel functionality within their applications. It is a rigourously defined model that can help avoid cluttering schemas and queries with bespoke updated_at, deleted_at and effective_from columns (…and all the various TRIGGERs that typically live alongside those).

Developers who try to build useful functionality on top of system-time directly will likely encounter issues with migrations, backfill, and out-of-order ingestion. Valid-time solves these challenges head-on whilst also enabling other advanced usage scenarios:

  • corrections - curate a timeline of versions with an ability to correct data - an essential capability for applications where recording the full context behind critical decisions is needed
  • global scheduling - control exactly when data is visible to as-of-now queries by loading data with future valid-time timestamps, without needing to complicate your schema or queries - data can be orchestrated to ‘appear’ and ‘disappear’ automatically as wall-clock time progresses

Note that valid-time as provided by XTDB is specifically about the validity (or “effective from” time) of a given row in the table, and not necessarily some other domain conception of time (unless you carefully model it 1:1).

Let’s have a glimpse of what can you do with SQL to make use of valid-time…

INSERT into the past

We can specify the _valid_from column during an INSERT statement to record when the organization (i.e. thinking beyond this particular database!) first became aware of the person carol:

2024-01-07
Run
Open in xt-play

What did you know?

With backdated information now correctly loaded into XTDB, we can easily verify that we knew carol existed in the company records at a time before our current database was even created:

Run
Open in xt-play

When did you know it?

The ‘bitemporal’ combination of valid-time and system-time columns allows us to readily produce an auditable history about what we claimed to have known in the past:

2024-01-08
Run
Open in xt-play

“Please re-run yesterday’s report using today’s data”

Perhaps most importantly for many applications, we can easily produce and later re-produce correct reports against business-relevant timestamps without having to assemble wildly complex queries or maintain unnecessary ETL infrastructure:

Run
Open in xt-play

Summary

XTDB implements a SQL API that closely follows the ISO standard specifications and draws inspiration from Postgres where needed, however unlike most SQL systems XTDB:

  • does not require an explicit schema to be declared before inserting data (i.e. there’s no explicit CREATE TABLE DDL statement) - tables may be created dynamically via an initial INSERT statement, along with any supplied columns and inferrable type information - schema automatically evolves over time as data changes
  • handles semi-structured ‘document’ data natively, with deeply nested union types (‘objects’) and arrays
  • operates on top of a durable transaction log to help underpin scalable and reliable information systems
  • maintains various ‘bitemporal’ columns globally across all tables to preserve history
  • offers powerful temporal query syntax for rich analysis of historical data

Next steps!

You have now learned the essentials of using XTDB!

Looking for more? Please have a browse around, try building something, and feel very welcome to say hello 👋