What is XTDB?

XTDB is a ‘bitemporal’ and ‘dynamic’ relational database for handling regulated data. XTDB is a transactional system designed for powering applications while also being amenable to analytical querying thanks to its internal columnar architecture built on Apache Arrow. XTDB is open source and runs on the JVM.

Bitemporal versioning made easy

XTDB tracks both the system time at which data is inserted into (or updated in) the database and the valid time periods that define exactly when a given row, record or document is considered effective in your application. This combination of system time and valid time is called “bitemporality”. In XTDB all data is bitemporal, and developers do not have to think about storing or updating additional columns: all data is time-versioned automatically.

This system-maintained time-versioning lets application queries access the correct state of the entire application history “as-at” any given moment, and makes it straightforward to audit every change to the database. In other words, it unlocks the complete history of data for rich analysis and allows applications to cope with out-of-order arrival of information, including corrections to past data, while maintaining a general sense of immutability.

XTDB’s approach to temporality is inspired by SQL:2011, but makes it ubiquitous, practical and transparent during day-to-day development. All tables include four temporal columns by default, which are maintained automatically. However, queries are assumed to ask about ‘now’ unless otherwise specified. Historical data that is no longer valid is filtered out during low-level processing at the heart of the internal design.
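To make the two time dimensions concrete, here is a minimal sketch in Python of a toy bitemporal store. It is an illustration of the idea only, not XTDB's implementation or API: the names ToyBitemporalStore, Assertion, put and get are hypothetical, time is a simple integer counter, and only three of the temporal fields are stored (the end of the system-time period is left implicit, because the most recently recorded assertion wins).

```python
from dataclasses import dataclass
from typing import Optional

FOREVER = float("inf")  # open-ended upper bound for a valid-time period


@dataclass(frozen=True)
class Assertion:
    """One append-only fact: `doc` holds over valid time [valid_from, valid_to)."""
    entity_id: str
    doc: dict
    valid_from: float
    valid_to: float
    system_from: int  # when the store recorded this assertion


class ToyBitemporalStore:
    """Hypothetical toy model; nothing here is XTDB's real API."""

    def __init__(self) -> None:
        self._log = []   # assertions are only ever appended
        self._clock = 0  # system time is assigned by the store, never by callers

    def put(self, entity_id, doc, valid_from=None, valid_to=FOREVER) -> int:
        self._clock += 1
        # Valid time defaults to "from now on", which is the common case.
        start = self._clock if valid_from is None else valid_from
        self._log.append(Assertion(entity_id, doc, start, valid_to, self._clock))
        return self._clock

    def get(self, entity_id, valid_at=None, system_at=None) -> Optional[dict]:
        # Both dimensions default to "now" when the caller says nothing.
        valid_at = self._clock if valid_at is None else valid_at
        system_at = self._clock if system_at is None else system_at
        visible = [
            a for a in self._log
            if a.entity_id == entity_id
            and a.system_from <= system_at
            and a.valid_from <= valid_at < a.valid_to
        ]
        # The most recently recorded assertion covering valid_at wins.
        return max(visible, key=lambda a: a.system_from).doc if visible else None


store = ToyBitemporalStore()
store.put("acct-1", {"balance": 100})  # system time 1, valid from 1
store.put("acct-1", {"balance": 250})  # system time 2, valid from 2
# A late correction: the balance was really 120 during valid time [1, 2).
store.put("acct-1", {"balance": 120}, valid_from=1, valid_to=2)

print(store.get("acct-1"))                           # {'balance': 250}
print(store.get("acct-1", valid_at=1))               # {'balance': 120}
print(store.get("acct-1", valid_at=1, system_at=2))  # {'balance': 100}
```

The last two lookups ask the same valid-time question at different system times: the first returns the corrected history, the second returns what the store believed before the correction arrived. Nothing was overwritten to make that possible.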

Transactional columnar architecture

Unlike most transactional database systems, XTDB implements a columnar data architecture that “separates storage and compute”. This modern, Big-Data-inspired architecture is built around Apache Arrow and commodity object storage (e.g. S3). Most importantly, this design reduces operational costs when retaining large volumes of historical data.

Transaction processing is strictly serial and strongly consistent (ACID), based on deterministic ordering of non-interactive transactions. Within a cluster, transactions for each database are processed by a single leader node for that database, which produces an indexed output that the other nodes follow. Writes for a given database therefore happen once, on a single thread, while reads scale out across the cluster. This design implies a hard upper limit on transaction throughput, but the key advantage is a set of concrete guarantees about exactly when, how and why data across the database has changed.
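The value of deterministic, serial processing can be sketched with a toy transaction log in Python. This is a simplified illustration under stated assumptions, not XTDB's design: TxLog, apply_tx and replay are hypothetical names, transactions are plain lists of operations known in full at submission time (which is what "non-interactive" means here), and every node replays the log itself rather than following a leader's indexed output.

```python
import copy


class TxLog:
    """Hypothetical totally ordered, append-only list of transactions."""

    def __init__(self):
        self.entries = []

    def submit(self, ops):
        # The whole transaction is plain data, fixed at submission time.
        self.entries.append(tuple(ops))
        return len(self.entries) - 1  # position in the log acts as the tx id


def apply_tx(state, ops):
    """Apply one transaction atomically: every op takes effect, or none do."""
    draft = copy.deepcopy(state)
    for op, key, amount in ops:
        if op == "put":
            draft[key] = amount
        elif op == "add":
            draft[key] = draft.get(key, 0) + amount
            if draft[key] < 0:
                return state  # constraint violated: abort, keep the prior state
        else:
            raise ValueError(f"unknown op: {op}")
    return draft


def replay(log, upto=None):
    """Fold the log one transaction at a time, strictly in log order."""
    state = {}
    for ops in log.entries[:upto]:
        state = apply_tx(state, ops)
    return state


log = TxLog()
log.submit([("put", "alice", 100), ("put", "bob", 0)])
log.submit([("add", "alice", -30), ("add", "bob", 30)])
log.submit([("add", "alice", -500), ("add", "bob", 500)])  # aborts: overdraft

print(replay(log))          # {'alice': 70, 'bob': 30}
print(replay(log, upto=1))  # {'alice': 100, 'bob': 0}
```

Because the outcome depends only on the log and its order, any node that processes the same log reaches the same state, and the state after any earlier transaction can be reconstructed exactly. The single ordered stream is also where the throughput ceiling comes from.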

Dynamic relational engine

The columnar engine within XTDB is able to handle “documents” as wide rows in sparse tables, where any given value in a column may contain arbitrarily nested data without any need for upfront schema design. The full range of built-in types is supported within these nested structures (unlike JSONB). This enables developers to easily use XTDB as a store of loosely structured documents, as a more traditional normalized database, or as both at the same time.

Unlike typical SQL tables with row-oriented storage, XTDB’s columnar tables are always ‘sparse’ (storing NULLs is cheap) and ‘wide’ (storing lots of columns is efficient).
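As a rough illustration of why sparse, wide tables suit a columnar layout, the Python sketch below stores each column separately and records only the cells that actually have a value. SparseColumnarTable and its methods are hypothetical names for this example, and the dictionary-per-column layout is an assumption chosen for brevity: it is not how Apache Arrow or XTDB lay out memory.

```python
from collections import defaultdict


class SparseColumnarTable:
    """Hypothetical toy table: one mapping per column, no upfront schema."""

    def __init__(self):
        self.row_count = 0
        # column name -> {row index -> value}; a missing entry means NULL
        self.columns = defaultdict(dict)

    def insert(self, doc):
        row = self.row_count
        for name, value in doc.items():
            # New column names appear on first use; values may be nested.
            self.columns[name][row] = value
        self.row_count += 1
        return row

    def column(self, name):
        """Materialise one column, filling the gaps with None (NULL)."""
        cells = self.columns.get(name, {})
        return [cells.get(i) for i in range(self.row_count)]

    def row(self, index):
        """Rebuild one document from the columns that have a value for it."""
        return {n: c[index] for n, c in self.columns.items() if index in c}

    def stored_cells(self):
        return sum(len(cells) for cells in self.columns.values())


table = SparseColumnarTable()
table.insert({"_id": 1, "name": "Ada"})
table.insert({"_id": 2, "address": {"city": "London", "tags": ["home", "billing"]}})
table.insert({"_id": 3, "name": "Grace", "joined": 1952})

print(table.column("name"))   # ['Ada', None, 'Grace']
print(table.stored_cells())   # 7
print(table.row(1))
```

Three rows across four columns would occupy twelve cells in a dense grid, while this layout stores seven. Reading a single column also touches only that column's data, which is the property analytical queries benefit from.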

Both SQL and ‘XTQL’

XTDB offers two interoperable query languages: one for reach (SQL) and one for developer productivity (XTQL). SQL in XTDB is a first-class citizen: it is built to reflect the SQL:2011 standard (which first introduced bitemporal capabilities to the SQL standard) and conforms to a broad suite of SQLite Logic Tests.

XTQL is a novel relational database language that brings the power of SQL and its standard library to a more composable format, one that can be written by hand or generated by client libraries using a JSON API.

The two languages are able to interoperate with 100% parity, meaning application developers can use the APIs as they see fit without sacrificing analytical requirements or compromising on functionality.

Feature Highlights

Supports the full spectrum between normalized relational modeling and dynamic document-like storage without compromising data type fidelity (unlike JSONB).
The combination of a native SQL implementation alongside XTQL offers a more productive application development experience without sacrificing rich data analysis (and without ETL to another system).
Strong data consistency built around linearized, single-writer transaction processing.
Accurate and immutable temporal record versioning to mitigate the complexities of application logic and handle out-of-order data ingestion.
Apache Arrow unlocks data for external integration.
Advanced temporal querying allows you to analyze the evolution of your data.
Deploy across your choice of cloud database services or on-premises to meet reliability and redundancy requirements.

Through each of these interconnected principles and features, XTDB solves the motivating problems in a single, coherent system.