Configuration
Changelog (last updated v2.2)
- v2.2: remotes replaces logClusters
Named connections to external systems are now configured under remotes.
Previously these lived under logClusters, which was scoped to transaction-log clusters. remotes generalises it to any external connection: Kafka clusters, cloud identities, Postgres databases, etc.
logClusters is deprecated but still honoured: entries under both names are merged, so existing config keeps working. Rename to remotes when convenient.
The pre-existing !Kafka can simply be moved under remotes with no other changes.
- v2.1: multi-database support
The log and storage configurations were changed as part of 2.1’s multi-db support.
For more details on those changes, see the Transaction Logs and Object Storage documentation.
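As a sketch of the v2.2 rename described above, assuming an existing config that declares a Kafka cluster under logClusters. The alias my-kafka and the bootstrap address are illustrative, and the two forms are shown side by side only for comparison; a real file would normally contain just one of them.

```yaml
## before: still honoured, but deprecated
logClusters:
  my-kafka: !Kafka
    bootstrapServers: "localhost:9092"

## after: the same entry moved under remotes, with no other changes
remotes:
  my-kafka: !Kafka
    bootstrapServers: "localhost:9092"
```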
XTDB nodes are configured using YAML files.
All config options have default values, so it is valid to omit the config file entirely, or to omit any part of the top-level config.
Log & Storage
The two main pluggable components of XTDB are transaction logs and object storage.

## transaction log configuration
log: !Local
  path: /path/to/log-file

## object store configuration
storage: !Local
  path: /path/to/storage-dir

By default XTDB will use an in-memory transaction log and an in-memory object store.
Monitoring & Observability
XTDB provides a suite of tools & templates to facilitate Monitoring & Observability.
By default healthz, monitoring & tracing are disabled.
Authentication
The Postgres wire-compatible server supports authentication, which can be configured via authentication rules.
By default a single root user named xtdb accepting any password is configured.
Caching
XTDB has two caches for object store data:

# By default configured to use half the JVM's maximum direct memory
memoryCache:
  # Maximum size of the cache, in bytes.
  # Defaults to being calculated via `maxSizeRatio`
  maxSizeBytes: 1073741824

  # Maximum size of the cache, as a proportion of the JVM's maximum direct memory.
  # Ignored when `maxSizeBytes` is set.
  maxSizeRatio: 0.5 # default

# Required when using a remote object store, otherwise unused
# Defaults to being disabled
diskCache:
  # Directory in which to store cached files. Required.
  path: /path/to/disk-cache

  # Maximum size of the cache, in bytes.
  # Defaults to being calculated via `maxSizeRatio`
  maxSizeBytes: 10737418240

  # Maximum size of the cache, as a proportion of the filesystem's total space.
  # Ignored when `maxSizeBytes` is set.
  maxSizeRatio: 0.75 # default

Remotes
remotes is a registry of named connections to the external systems XTDB authenticates against: Postgres instances, Kafka clusters, cloud identities (AWS/Azure/GCP), etc.
You define a connection once here under an alias, then reference it by that alias from the parts of the config that use it — the transaction log, object storage, and external sources.
Each entry maps an alias to a connection. The tag (!Postgres, !Kafka, …) selects the remote type, and the available fields depend on that type:
remotes:
  my-kafka: !Kafka
    bootstrapServers: "localhost:9092"

# ...then referenced by alias, e.g. by a Kafka transaction log:
log: !Kafka
  cluster: my-kafka
  topic: xtdb_topic

For the fields each remote type takes, and how a given component references its alias, see that component’s documentation, e.g. transaction log, object storage.
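Putting the pieces together, a node with a Kafka transaction log and remote object storage might be configured roughly as follows. This is a sketch assembled only from fragments that appear elsewhere on this page; the bucket name is illustrative, and the S3 object store may take further fields (possibly including a reference to a cloud-identity remote) that are documented on the object storage page rather than here.

```yaml
## one named connection, declared once
remotes:
  my-kafka: !Kafka
    bootstrapServers: "localhost:9092"

## the transaction log refers to the connection by its alias
log: !Kafka
  cluster: my-kafka
  topic: xtdb_topic

## remote object storage (shape taken from the export-snapshot example on this page)
storage: !Remote
  objectStore: !S3
    bucket: my-bucket

## a disk cache is required when using a remote object store
diskCache:
  path: /path/to/disk-cache
```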
Troubleshooting configuration
Config options to help an operator troubleshoot XTDB.
Read-only Databases
Set all databases in the cluster to be in read-only mode.
Defaults to false, so that databases run in their configured modes.

## Set to true to have *all* databases run in read-only mode
readOnlyDatabases: true

Skip Databases
Set the configured databases as dormant (queryable but not accepting transactions).
Defaults to the result of splitting !Env XTDB_SKIP_DBS on commas if it is set, otherwise an empty list.

skipDbs:
  - my_db
  - my_other_db

Other configuration
Postgres wire-compatible server
By default, a read-write Postgres wire-compatible server is started on localhost: on port 5432 in the XTDB Docker images, and on an available port chosen by the server otherwise.

server:
  # Host on which to start a read-write Postgres wire-compatible server.
  #
  # Default is "localhost", which means the server will only accept connections on the loopback interface.
  # Set to '*' to accept connections on all interfaces.
  host: localhost

  # Port on which to start a read-write Postgres wire-compatible server.
  #
  # Default is 0, to have the server choose an available port.
  # (In the XTDB Docker images, this is defaulted to 5432.)
  # Set to -1 to not start a read-write server.
  port: 0

  # Port on which to start a read-only Postgres wire-compatible server.
  #
  # The server on this port will reject any attempted DML/DDL,
  # regardless of whether the user would otherwise have the permission to do so.
  #
  # Default is -1, to not start a read-only server.
  # Set to 0 to have the server choose an available port.
  readOnlyPort: -1

Flight SQL Server
By default, a Flight SQL server is started on the loopback interface: on port 9832 in the XTDB Docker images, and on an available port chosen by the server otherwise.

flightSql:
  # Host on which to start the Flight SQL server.
  #
  # Default is "127.0.0.1", which means the server will only accept connections on the loopback interface.
  # Set to '*' to accept connections on all interfaces.
  host: 127.0.0.1

  # Port on which to start the Flight SQL server.
  #
  # Default is 0, to have the server choose an available port.
  # (In the XTDB Docker images, this is defaulted to 9832.)
  # Set to -1 to not start a Flight SQL server.
  port: 0

Compactor
By default, the compactor runs on roughly half as many threads as the system has processor cores.

compactor:
  # Number of threads to use for compaction.
  #
  # Defaults to !Env XTDB_COMPACTOR_THREADS
  # or, if that's not specified, max(availableProcessors / 2, 1).
  # Set to 0 to disable the compactor.
  threads: 4

Indexer
Responsible for indexing transactions from the transaction log.
Please consider reaching out at hello@xtdb.com if you feel the need to change any of these!

indexer:
  # Set to false to disable indexing on the primary database (xtdb).
  #
  # Transactions are still accepted onto the log but are never processed,
  # so synchronous submits will hang waiting for a result that never arrives.
  # Submit with `async=true` to not hang.
  enabled: true # default

  # Number of operations the in-memory live-index buffers before reorganising them.
  # Low-level tuning, most deployments leave this alone.
  logLimit: 64 # default

  # The maximum size of a page in the in-memory live-index.
  # Low-level tuning, most deployments leave this alone.
  pageLimit: 1024 # default

  # Number of operations the in-memory live-index buffers before flushing to the object store.
  # Low-level tuning, most deployments leave this alone.
  rowsPerBlock: 102400 # default

  # ISO-8601 duration after which the current block is finished even if it
  # hasn't reached `rowsPerBlock`
  flushDuration: PT4H # default

  # Transaction ids to skip during indexing
  # Useful to work around a transaction that crashes the indexer.
  #
  # Applies to *all* databases on the node.
  #
  # Defaults to the `XTDB_SKIP_TXS` environment variable (a comma-separated list
  # of transaction ids, e.g. "12,15,16") if set, otherwise empty.
  skipTxs: []

Garbage Collector
Reclaims object-store space by deleting files left behind once compaction has superseded them.
Disabled by default.

garbageCollector:
  # Set to true to enable garbage collection
  enabled: false # default

  # Number of recent blocks to retain
  blocksToKeep: 10 # default

  # ISO-8601 duration for which superseded trie files are retained
  garbageLifetime: PT24H # default

Node ID
An identifier for the node, used for example in metrics and crash logging.
Defaults to !Env XTDB_NODE_ID if set, otherwise to a short random string.
Default TZ
Defaults to UTC.
Additional Concepts
Using !Env
For certain keys, we allow the use of environment variables. Typically, these are values that may change location across environments; generally, they are either “paths” or “strings”.
When specifying a key, you can use the !Env tag to reference an environment variable.
As an example:
storage: !Local
  path: !Env XTDB_STORAGE_PATH

Any key for which we allow the use of !Env will be documented as such.
CLI tools/flags
Changelog (last updated v2.1)
- v2.1: top-level commands
In v2.1, we changed the CLI to use top-level commands (not dissimilar to Git, for example).
Previously, the playground and compact-only nodes were activated using optional flags: --playground-port and --compact-only respectively.
reset-compactor and export-snapshot were also added in v2.1.
You can run various tools by passing arguments - either directly to the CLI or via Docker’s arguments:
node (default, can be omitted)
  -f <file>, --file <file>: specifies the configuration file to use.
playground
  Starts a playground: an in-memory server that will accept any database name, creating it if required.
  -p <port>, --port <port> (default 5432): specifies the port to run the playground server on.
compactor
  Starts a compactor-only node, useful for giving the compaction process more compute resources.
  -f <file>, --file <file>: specifies the configuration file to use.
reset-compactor <db-name>
  Resets the compaction back to L0, deleting any L1+ files. Use this if you’ve encountered a compaction bug and need to reset its state.
  - Spin down all of your XT nodes.
  - Using your container orchestration tool (e.g. Kubernetes), run a one-shot task with an overridden command: ["reset-compactor"]. Optionally, specify --dry-run to list all of the files to be removed.
  - When the tool has finished, spin up your nodes again.
    You may want to also spin up a compactor-only node to help out with the re-compaction.
  At the moment, this can only reset all the way back to L0; finer-grained reset will be added in a later release.
export-snapshot <db-name>
  -f <file>, --file <file>: specifies the configuration file to use.
  This exports a snapshot of the object-store into a sibling directory within the object store. e.g. if your storage is at s3://my-bucket/, this will export to a directory under s3://my-bucket/exports/... (the exact directory will be given in the logs).
  You can then start another node against this storage directory. You will need to start a new log, and increase the log epoch in your configuration:

    log: !Kafka
      ...
      topic: new-topic
      epoch: 1
    storage: !Remote
      objectStore: !S3
        bucket: my-bucket
        prefix: exports/...

read-arrow-file <file>
  Reads an Arrow file and emits it as EDN.
read-arrow-stream-file <file>
  Reads an Arrow ‘stream IPC format’ file and emits it as EDN.
e.g.
- Dockerfile: CMD ["playground", "--port", "5439"]
- docker-compose: command: ["playground", "--port", "5439"]
- Java uberjar: java -jar xtdb.jar playground --port 5439
- Clojure (with xtdb-core in your deps.edn): clj -M xtdb.main playground --port 5439
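For context, the docker-compose form might sit inside a service definition like the one below. This is only a sketch: the service name and the image reference are assumptions to be replaced with whatever you actually deploy, and publishing the port only helps if the playground listens on an interface that is reachable from outside the container.

```yaml
# docker-compose.yml (illustrative, not an official XTDB compose file)
services:
  xtdb-playground:               # hypothetical service name
    # Assumption: substitute the XTDB image and tag you actually use
    image: ghcr.io/xtdb/xtdb:latest
    # Arguments passed through to the XTDB CLI, as in the list above
    command: ["playground", "--port", "5439"]
    ports:
      # Publish the playground port on the host
      - "5439:5439"
```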
You can also pass --help to any of the commands to get command-specific help.
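The one-shot task described under reset-compactor could, in Kubernetes, be expressed as a Job along these lines. This is a minimal sketch under several assumptions: the Job name, image reference and database name (my_db) are placeholders; it assumes the image's entrypoint is the XTDB CLI, so that args replaces the default command arguments; the position of --dry-run after the database name is also an assumption. Whatever configuration and credentials your nodes need to reach the object store are omitted here and would have to be supplied in the same way as for those nodes.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: xtdb-reset-compactor       # hypothetical name
spec:
  backoffLimit: 0                  # do not retry automatically on failure
  template:
    spec:
      restartPolicy: Never         # run once, then stop
      containers:
        - name: reset-compactor
          # Assumption: use the same image and tag as your XTDB nodes
          image: ghcr.io/xtdb/xtdb:latest
          # `args` overrides the image's default arguments (its CMD).
          # --dry-run only lists the files that would be removed.
          args: ["reset-compactor", "my_db", "--dry-run"]
```

Once the dry-run output looks right, the same Job can be re-created without --dry-run to perform the reset, with all nodes still spun down.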