AWS

XTDB provides modular support for AWS environments, including a pre-built Docker image, integrations for S3 storage and CloudWatch metrics, and configuration options for deploying onto AWS infrastructure.

Note	For more information on setting up an XTDB cluster on AWS, see the "Getting Started with AWS" guide.

Required infrastructure

In order to run an AWS based XTDB cluster, the following infrastructure is required:

An S3 bucket for remote storage.
A Kafka cluster for the message log.
- For more information on setting up Kafka for usage with XTDB, see the Kafka configuration docs.
IAM policies which grant XTDB permission to the S3 bucket
XTDB nodes configured to communicate with the Kafka cluster and S3 bucket.

Terraform Templates

To set up a basic version of the required infrastructure, we provide a set of Terraform templates specifically designed for AWS.

These can be fetched from the XTDB repository using the following command:

terraform init -from-module github.com/xtdb/xtdb.git//aws/terraform

Resources

By default, running the templates will deploy the following infrastructure:

Amazon S3 Storage Bucket for remote storage.
- Configured with associated resources using the terraform-aws-modules/s3-bucket Terraform module.
- Enables object ownership control and applies necessary permissions for XTDB.
IAM Policy for granting access to the S3 storage bucket.
- Configured with associated resources using the terraform-aws-modules/iam-policy Terraform module.
- Grants permissions for XTDB to read, write, and manage objects within the specified S3 bucket.
Virtual Private Cloud (VPC) for the XTDB EKS cluster.
- Configured with associated resources using the terraform-aws-modules/vpc Terraform module.
- Enables DNS resolution, assigns public subnets, and configures networking for the cluster.
Amazon Elastic Kubernetes Service (EKS) Cluster for running XTDB resources.
- Configured with associated resources using the terraform-aws-modules/eks Terraform module.
- Provisions a managed node group dedicated to XTDB workloads.

Configuration

In order to customize the deployment, we provide a number of pre-defined variables within the terraform.tfvars file. These variables can be modified to tailor the infrastructure to your specific needs.

The following variables are required to be set:

s3_bucket_name: The (globally unique) name of the S3 bucket used by XTDB.

For more advanced usage, the Terraform templates themselves can be modified to suit your specific requirements.

Outputs

The Terraform templates will return several outputs:

Output Description

Output	Description
`aws_region`	The AWS region in which the resources were created.
`eks_cluster_name`	The name of the EKS cluster created for the XTDB deployment.
`s3_bucket_name`	The name of the S3 bucket created for the XTDB cluster.
`s3_access_policy_arn`	The ARN of the S3 bucket created for the XTDB cluster.
`oidc_provider`	OpenID Connect identity provider for the EKS cluster.
`oidc_provider_arn`	The ARN of the OpenID Connect identity provider for the EKS cluster.

aws_region

The AWS region in which the resources were created.

eks_cluster_name

The name of the EKS cluster created for the XTDB deployment.

s3_bucket_name

The name of the S3 bucket created for the XTDB cluster.

s3_access_policy_arn

The ARN of the S3 bucket created for the XTDB cluster.

oidc_provider

OpenID Connect identity provider for the EKS cluster.

oidc_provider_arn

The ARN of the OpenID Connect identity provider for the EKS cluster.

`xtdb-aws` Helm Charts

For setting up a production-ready XTDB cluster on AWS, we provide a Helm chart built specifically for AWS environments.

Pre-requisites

To allow the XTDB nodes to access AWS resources, a Kubernetes Service Account (KSA) must be setup and linked with an IAM role that has any necessary permissions, using IAM Roles for Service Accounts (IRSA).

Setting Up the Kubernetes Service Account:

Create the Kubernetes Service Account in the target namespace:

kubectl create serviceaccount xtdb-service-account --namespace xtdb-deployment

Setting up the IAM Service Account

Fetch the ARN of a policy granting access to s3 (s3_access_policy_arn), the OpenID Connect identity provider of the EKS cluster (oidc_provider) and ARN for the OIDC provider (oidc_provider_arn).

Create a file eks_policy_document.json for the trust policy, replacing values as appropriate:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "<oidc_provider_arn>"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "<oidc_provider>:aud": "sts.amazonaws.com",
          "<oidc_provider>:sub": "system:serviceaccount:xtdb-deployment:xtdb-service-account"
        }
      }
    }
  ]
}

Create the IAM role and attach the trust policy created above:

aws iam create-role --role-name xtdb-eks-role --assume-role-policy-document file://eks_policy_document.json --description "XTDB EKS Role"

Attach the S3 bucket role:

aws iam attach-role-policy --role-name xtdb-eks-role --policy-arn=<s3_access_policy_arn>

Annotating the Kubernetes Service Account

Fetch the ARN of the IAM role:

xtdb_eks_role_arn=$(aws iam get-role --role-name xtdb-eks-role --query Role.Arn --output text)

Annotate the Kubernetes Service Account with the IAM role to establish the link between the two:

kubectl annotate serviceaccount xtdb-service-account --namespace xtdb-deployment eks.amazonaws.com/role-arn=$xtdb_eks_role_arn

Installation

The Helm chart can be installed directly from the Github Container Registry releases.

This will use the default configuration for the deployment, setting any required values as needed:

helm install xtdb-aws oci://ghcr.io/xtdb/helm-xtdb-aws \
  --version 2.0.0-snapshot \
  --namespace xtdb-deployment \
  --set xtdbConfig.serviceAccount="xtdb-service-account" \
  --set xtdbConfig.s3Bucket=<s3_bucket>

We provide a number of parameters for configuring numerous parts of the deployment, see the values.yaml file or call helm show values:

helm show values oci://ghcr.io/xtdb/helm-xtdb-aws \
  --version 2.0.0-snapshot

Resources

By default, the following resources are deployed by the Helm chart:

A ConfigMap containing the XTDB YAML configuration.
A StatefulSet containing a configurable number of XTDB nodes, using the xtdb-aws docker image
A LoadBalancer Kubernetes service to expose the XTDB cluster to the internet.

Pulling the Chart Locally

The chart can also be pulled from the Github Container Registry, allowing further configuration of the templates within:

helm pull oci://ghcr.io/xtdb/helm-xtdb-aws \
  --version 2.0.0-snapshot \
  --untar

`xtdb-aws` Docker Image

The xtdb-aws image is optimized for running XTDB in AWS environments, and is deployed on every release to XTDB.

By default, it will use S3 for storage and Kafka for the message log, including dependencies for both.

Configuration

The following environment variables are used to configure the xtdb-aws image:

Variable Description

Variable	Description
`KAFKA_BOOTSTRAP_SERVERS`	Kafka bootstrap server containing the XTDB topics.
`XTDB_LOG_TOPIC`	Kafka topic to be used as the XTDB log.
`XTDB_S3_BUCKET`	Name of the S3 bucket used for remote storage.
`XTDB_NODE_ID`	Persistent node id for labelling Prometheus metrics.

KAFKA_BOOTSTRAP_SERVERS

Kafka bootstrap server containing the XTDB topics.

XTDB_LOG_TOPIC

Kafka topic to be used as the XTDB log.

XTDB_S3_BUCKET

Name of the S3 bucket used for remote storage.

XTDB_NODE_ID

Persistent node id for labelling Prometheus metrics.

You can also set the XTDB log level using environment variables.

Using a Custom Node Configuration

For advanced usage, XTDB allows the above YAML configuration to be overridden to customize the running node’s system/modules.

In order to override the default configuration:

Mount a custom YAML configuration file to the container.
Override the COMMAND of the docker container to use the custom configuration file, ie:
```
CMD ["-f", "/path/to/custom-config.yaml"]
```

S3 Storage

Amazon S3 can be used as a shared object-store for XTDB’s remote storage module.

Infrastructure Requirements

To use S3 as the object store, the following infrastructure is required:

An S3 bucket.

IAM policies which grant XTDB permission to the S3 bucket:

Statement:
- Effect: Allow
  Action:
    - 's3:GetObject'
    - 's3:PutObject'
    - 's3:DeleteObject'
    - 's3:ListBucket'
    - 's3:AbortMultipartUpload'
    - 's3:ListBucketMultipartUploads'
  Resource:
    - !Ref S3BucketArn
    - !Join [ '', [ !Ref S3BucketArn, '/*'] ]

If you are using an S3 compatible object storage you might need to pass the environment variable AWS_S3_FORCE_PATH_STYLE=true, because alternative S3 solutions often still use the older S3 path style.

Authentication

XTDB uses AWS SDK for Authentication, relying on the default AWS credential provider chain. See the AWS documentation for setup instructions.

Configuration

To use the S3 module, include the following in your node configuration:

databases:
  xtdb:
    storage: !Remote
      objectStore: !S3
        ## -- required

        ## The name of the S3 bucket to use for the object store
        ## (Can be set as an !Env value)
        bucket: "my-s3-bucket"

        ## -- optional

        ## A file path to prefix all of your files with
        ## - for example, if "foo" is provided, all XTDB files will be located under a "foo" sub-directory
        ## (Can be set as an !Env value)
        # prefix: my-xtdb-node

        ## Basic credentials for AWS.
        ## If not provided, will default to AWS's standard credential resolution.
        ## see: https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/credentials-chain.html
        # credentials:
        #   accessKey: "..."
        #   secretKey: "..."

        ## Endpoint URI
        ## If not provided, will default to the standard S3 endpoint for the resolved region.
        # endpoint: "https://..."

# -- required
# A local disk path where XTDB can cache files from the remote storage
diskCache:
  path: /var/cache/xtdb/object-store

If configured as an in-process node, you can also specify an S3Configurator instance - this is used to modify the requests sent to S3.

Protecting XTDB Data

Amazon S3 provides strong durability guarantees (11 9s), but does not protect against operator error or access misconfiguration.

To minimize risk:

Enable S3 Versioning — allows recovery of deleted or overwritten objects
- Will use delete markers or retention policies for soft delete
Use Cross-Region Replication for disaster recovery scenarios
Apply S3 bucket lifecycle and retention policies with care
Lock down IAM access to prevent destructive operations from untrusted sources

For shared guidance on storage backup strategies, see the Backup Overview.

Backing Up XTDB Data

XTDB storage files in S3 are immutable and ideally suited for snapshot-based backup strategies.

To perform a full backup:

Back up the entire S3 prefix (or bucket) used by XTDB
Ensure all files associated with the latest flushed block are present
Avoid copying in-progress files — only finalized storage files are valid for recovery

You can use AWS Backup for scheduled, versioning-aware backups of entire buckets.

CloudWatch Monitoring

XTDB supports reporting metrics to AWS Cloudwatch for performance and health monitoring.

Configuration

To report XTDB node metrics to CloudWatch, include the following in your node configuration:

modules:
  - !CloudWatch

Authentication is handled via the AWS SDK, using the default AWS credential provider chain. See the AWS documentation for setup instructions.

The associated credentials must have permissions to write metrics to a pre-configured CloudWatch namespace.

AWS

Required infrastructure

Terraform Templates

Resources

Configuration

Outputs

xtdb-aws Helm Charts

Pre-requisites

Setting Up the Kubernetes Service Account:

Setting up the IAM Service Account

Annotating the Kubernetes Service Account

Installation

Resources

Pulling the Chart Locally

xtdb-aws Docker Image

Configuration

Using a Custom Node Configuration

S3 Storage

Infrastructure Requirements

Authentication

Configuration

Protecting XTDB Data

Backing Up XTDB Data

CloudWatch Monitoring

Configuration

`xtdb-aws` Helm Charts

`xtdb-aws` Docker Image