Setting up a cluster on AWS
This guide will walk you through the process of configuring and running an XTDB Cluster on AWS. This setup includes:
- Using AWS S3 as the remote storage implementation.
- Utilizing Apache Kafka as the shared message log implementation.
- Exposing the cluster to the internet via a Postgres wire-compatible server and HTTP.
The required AWS infrastructure is provisioned using Terraform, and the XTDB cluster and its resources are deployed on Amazon Elastic Kubernetes Service (EKS) using Helm.
Although we provide numerous parameters to configure the templates, you are encouraged to edit them, use them as a foundation for more advanced use cases, and reuse existing infrastructure when suitable. These templates serve as a simple starting point for running XTDB on AWS and Kubernetes, and should be adapted to meet your specific needs, especially in production environments.
This guide assumes that you are using the default templates.
Requirements
Before starting, ensure you have the following installed:
- The AWS CLI - See the Installation Instructions.
- Terraform - See the Installation Instructions.
- kubectl - The Kubernetes CLI, used to interact with the EKS cluster. See the Installation Instructions.
- Helm - The Kubernetes package manager, used to deploy numerous components to EKS. See the Installation Instructions.
On AWS itself, you will need:
- An AWS account and associated credentials that allow you to create resources.
Authenticating the AWS CLI
Before running the Terraform templates, you need to authenticate the AWS CLI with your AWS account.
See AWS CLI Sign-In Instructions for more information on how to authenticate the AWS CLI. For this guide, we will use aws sso and profiles:
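# Setup SSO profile
aws configure sso
# Authenticate with AWS SSO
aws sso login --profile <profile_name>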
This allows you to perform necessary operations on AWS via Terraform using the User Principal on the AWS CLI.
Getting started with Terraform
The following assumes that you are authenticated on the AWS CLI, have Terraform installed on your machine, and are located in a directory that you wish to use as the root of the Terraform configuration.
First, make the following terraform init call:
terraform init -from-module github.com/xtdb/xtdb.git//aws/terraform
This will download the Terraform files from the XTDB repository, and initialize the working directory.
Note: For the sake of this guide, we store Terraform state locally. However, to persist the state onto AWS, you will need to configure a remote backend using AWS S3. This allows you to share the state file across teams, maintain versioning, and ensure consistency during deployments. For more info, see the Terraform S3 backend documentation.
What is being deployed on AWS?
The sample Terraform directory sets up a few key components of the infrastructure required by XTDB. If using the default configuration, the following resources will be created:
- Amazon S3 Storage Bucket for remote storage.
  - Configured with associated resources using the terraform-aws-modules/s3-bucket Terraform module.
  - Enables object ownership control and applies necessary permissions for XTDB.
- IAM Policy for granting access to the S3 storage bucket.
  - Configured with associated resources using the terraform-aws-modules/iam-policy Terraform module.
  - Grants permissions for XTDB to read, write, and manage objects within the specified S3 bucket.
- Virtual Private Cloud (VPC) for the XTDB EKS cluster.
  - Configured with associated resources using the terraform-aws-modules/vpc Terraform module.
  - Enables DNS resolution, assigns public subnets, and configures networking for the cluster.
- Amazon Elastic Kubernetes Service (EKS) Cluster for running XTDB resources.
  - Configured with associated resources using the terraform-aws-modules/eks Terraform module.
  - Provisions a managed node group dedicated to XTDB workloads.
This infrastructure provides a solid foundation for deploying XTDB on AWS with Kubernetes. Resource sizes, IAM permissions, and networking configurations can be customized to fit your specific requirements and cost constraints.
Note: Later within this guide we shall create an IAM role using the command line that can be assumed by a Kubernetes Service Account on the EKS cluster to access the S3 bucket. If you set up similar resources on Kubernetes, you may want to manage these using Terraform as well.
Deploying the AWS Infrastructure
Before creating the Terraform resources, review and update the terraform.tfvars
file to ensure the parameters are correctly set for your environment:
- You are required to set a unique and valid s3_bucket_name for your environment.
- You may also wish to change resource tiers, the region the resources are deployed in, or the instance types used by the EKS cluster.
Note: In this guide, we use AWS Named Profiles to authenticate with AWS, and need to pass the profile to use to our Terraform commands. Though we do this via the CLI, you can also add it directly to the Terraform provider config or authenticate using other methods. See the AWS Provider Documentation for more information.
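As an alternative to prefixing each command, you can export the profile once for the current shell session; both the AWS CLI and the Terraform AWS provider honour this environment variable:
# Use the named profile for all subsequent AWS CLI and Terraform commands in this shell
export AWS_PROFILE=<profile_name>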
To get a full list of the resources that will be deployed by the templates, run:
AWS_PROFILE=<profile_name> terraform plan
Finally, to create the resources, run:
AWS_PROFILE=<profile_name> terraform apply
This will create the necessary infrastructure on the AWS account.
Fetching the Terraform Outputs
The Terraform templates will generate several outputs required for setting up the XTDB nodes on the EKS cluster.
To retrieve these outputs, execute the following command:
terraform output
This will return the following outputs:
- aws_region - The AWS region in which the resources were created.
- eks_cluster_name - The name of the EKS cluster created for the XTDB deployment.
- s3_bucket_name - The name of the S3 bucket created for the XTDB cluster.
- s3_access_policy_arn - The ARN of the IAM policy granting access to the S3 bucket.
- oidc_provider - The OpenID Connect identity provider for the EKS cluster.
- oidc_provider_arn - The ARN of the OpenID Connect identity provider for the EKS cluster.
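Several of these values are used in later commands. As an optional convenience (assuming a POSIX shell), you can capture them into shell variables:
# Capture the Terraform outputs for use in later steps
aws_region=$(terraform output -raw aws_region)
eks_cluster_name=$(terraform output -raw eks_cluster_name)
s3_bucket_name=$(terraform output -raw s3_bucket_name)
s3_access_policy_arn=$(terraform output -raw s3_access_policy_arn)
oidc_provider=$(terraform output -raw oidc_provider)
oidc_provider_arn=$(terraform output -raw oidc_provider_arn)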
Deploying on Kubernetes
With the infrastructure created on AWS, we can now deploy the XTDB nodes and a simple Kafka instance on the EKS cluster.
Prior to deploying the Kubernetes resources, ensure that the kubectl
CLI is installed and configured to interact with the EKS cluster. Run the following command:
aws eks --profile <profile_name> --region <region> update-kubeconfig --name <eks_cluster_name>
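You can confirm that kubectl is now pointing at the new cluster and can reach its nodes:
# Check the active context and list the EKS worker nodes
kubectl config current-context
kubectl get nodes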
Now that kubectl
is authenticated with the EKS cluster, you can set up the namespace for the XTDB deployment:
kubectl create namespace xtdb-deployment
The EKS cluster is now ready for the XTDB deployment.
Deploying an example Kafka
To deploy a basic set of Kafka resources within EKS, you can make use of the bitnami/kafka Helm chart. Run the following command:
helm install kafka oci://registry-1.docker.io/bitnamicharts/kafka \
--version 31.3.1 \
--namespace xtdb-deployment \
--set listeners.client.protocol=PLAINTEXT \
--set listeners.controller.protocol=PLAINTEXT \
--set controller.resourcesPreset=medium \
--set global.defaultStorageClass=gp2 \
--set controller.nodeSelector.node_pool=xtdbpool
This command will create:
- A simple, unauthenticated Kafka deployment on the EKS cluster, which XTDB will use as its message log, along with its dependent infrastructure and persistent storage.
  - Uses gp2-backed storage for the Persistent Volume Claims (see the storage class check below).
- A Kubernetes service to expose the Kafka instance to the XTDB cluster.
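EKS clusters are usually provisioned with a default gp2 StorageClass; you can confirm that it exists on your cluster before relying on it:
# List the storage classes available on the cluster; a gp2 class should be present
kubectl get storageclass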
Considerations of the Kafka Deployment
The Kafka instance set up above is for demonstration purposes and is not recommended for production use. This example lacks authentication for the Kafka cluster and allows XTDB to manage Kafka topic creation and configuration itself.
For production environments, consider the following:
- Use a more robust Kafka deployment.
- Pre-create the required Kafka topics.
- Configure XTDB appropriately to interact with the production Kafka setup.
Additional resources:
- For further configuration options for the Helm chart, refer to the Bitnami Kafka Chart Documentation.
- For detailed configuration guidance when using Kafka with XTDB, see the XTDB Kafka Setup Documentation.
Verifying the Kafka Deployment
After deployment, verify that the Kafka instance is running properly by checking its status and logs.
To check the status of the Kafka deployment, run the following command:
kubectl get pods --namespace xtdb-deployment
To view the logs of the Kafka deployment, use the command:
kubectl logs -f statefulset/kafka-controller --namespace xtdb-deployment
By verifying the status and reviewing the logs, you can ensure the Kafka instance is correctly deployed and ready for use by XTDB.
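If you want to block until the Kafka pods report ready, for example in a script, kubectl wait can be used. The selector below assumes the chart's standard app.kubernetes.io/instance label for the kafka release:
# Wait up to five minutes for the Kafka pods to become ready
kubectl wait pod \
  --selector=app.kubernetes.io/instance=kafka \
  --for=condition=Ready \
  --timeout=300s \
  --namespace xtdb-deployment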
Creating an IAM Role for the XTDB nodes
To allow the XTDB nodes to access the S3 bucket created earlier, a Kubernetes Service Account (KSA) must be set up and linked with an IAM role that has the necessary permissions, using IAM Roles for Service Accounts (IRSA).
To set up the Kubernetes Service Account, run the following command:
kubectl create serviceaccount xtdb-service-account --namespace xtdb-deployment
Fetch the S3 access policy ARN (s3_access_policy_arn), the OpenID Connect identity provider of the EKS cluster (oidc_provider), and the ARN of the provider (oidc_provider_arn) from the Terraform outputs.
Create a file eks_policy_document.json
for the trust policy, replacing values as appropriate:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Federated": "<oidc_provider_arn>"
},
"Action": "sts:AssumeRoleWithWebIdentity",
"Condition": {
"StringEquals": {
"<oidc_provider>:aud": "sts.amazonaws.com",
"<oidc_provider>:sub": "system:serviceaccount:xtdb-deployment:xtdb-service-account"
}
}
}
]
}
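Alternatively, if you captured the Terraform outputs into shell variables earlier, the following sketch generates the file for you; it assumes the oidc_provider and oidc_provider_arn variables from the previous section:
# Generate the trust policy document from the captured Terraform outputs
cat > eks_policy_document.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Federated": "${oidc_provider_arn}" },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${oidc_provider}:aud": "sts.amazonaws.com",
          "${oidc_provider}:sub": "system:serviceaccount:xtdb-deployment:xtdb-service-account"
        }
      }
    }
  ]
}
EOF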
Create the IAM role and attach the trust policy:
aws iam --profile <profile_name> create-role --role-name xtdb-eks-role --assume-role-policy-document file://eks_policy_document.json --description "XTDB EKS Role"
Now attach the S3 access policy to the role:
aws iam --profile <profile_name> attach-role-policy --role-name xtdb-eks-role --policy-arn=<s3_access_policy_arn>
Fetch the ARN of the IAM role:
xtdb_eks_role_arn=$(aws iam --profile <profile_name> get-role --role-name xtdb-eks-role --query Role.Arn --output text)
Finally, annotate the Kubernetes Service Account with the IAM role:
kubectl annotate serviceaccount xtdb-service-account --namespace xtdb-deployment eks.amazonaws.com/role-arn=$xtdb_eks_role_arn
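You can verify that the annotation has been applied to the service account:
# The eks.amazonaws.com/role-arn annotation should reference the xtdb-eks-role ARN
kubectl describe serviceaccount xtdb-service-account --namespace xtdb-deployment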
With the XTDB service account set up, we can now deploy the XTDB cluster to the EKS cluster.
Deploying the XTDB cluster
In order to deploy the XTDB cluster and its constituent parts into the EKS cluster, we provide an xtdb-aws Helm chart.
This can be found on the XTDB Github Container Registry, and can be used directly with helm commands.
With the values from the Terraform outputs, you can now deploy the XTDB cluster. Run the following command, substituting the values as appropriate:
helm install xtdb-aws oci://ghcr.io/xtdb/helm-xtdb-aws \
--version 2.0.0-snapshot \
--namespace xtdb-deployment \
--set xtdbConfig.serviceAccount="xtdb-service-account" \
--set xtdbConfig.s3Bucket=<s3_bucket>
The following are created by the templates:
- A ConfigMap containing the XTDB YAML configuration.
- A StatefulSet containing the XTDB nodes.
- A LoadBalancer Kubernetes service to expose the XTDB cluster to the internet.
To check the status of the XTDB statefulset, run:
kubectl get statefulset --namespace xtdb-deployment
To view the logs of each individual StatefulSet member, run:
kubectl logs -f xtdb-statefulset-<n> --namespace xtdb-deployment
where <n> is the index of the StatefulSet member (for example, xtdb-statefulset-0).
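To block until all members of the StatefulSet are ready, assuming the StatefulSet is named xtdb-statefulset as above, you can also run:
# Wait for the XTDB StatefulSet rollout to complete
kubectl rollout status statefulset/xtdb-statefulset --namespace xtdb-deployment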
Customizing the XTDB Deployment
The above deployment uses the xtdb-aws chart defaults, individually setting the Terraform outputs as xtdbConfig settings on the command line.
For more information on the available configuration options and fetching the charts locally for customization, see the xtdb-aws Helm documentation.
Accessing the XTDB Cluster
Note: It will take some time for the XTDB nodes to be marked as ready (they need to pass their initial startup checks), and AWS will need to provision a Load Balancer for the service, so it may take a few minutes for the XTDB cluster to be accessible.
Once the XTDB cluster is up and running, you can access it via the LoadBalancer service that was created.
To get the external IP of the LoadBalancer service, run:
kubectl get svc xtdb-service --namespace xtdb-deployment
This will return the external address of the LoadBalancer service (on AWS this is typically a DNS hostname rather than an IP). You can use this address to access the XTDB cluster via:
- Postgres Wire Server (on port 5432)
- Healthz Server (on port 8080)
- HTTP Server (on port 3000)
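The health check below uses an $ExternalIP variable; one way to set it, assuming the load balancer is exposed as a hostname as is typical on AWS, is:
# Capture the external address of the xtdb-service load balancer
ExternalIP=$(kubectl get svc xtdb-service --namespace xtdb-deployment \
  -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')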
To check the status of the XTDB cluster using the HTTP server, run:
curl http://$ExternalIP:8080/healthz/alive
# alternatively `/healthz/started`, `/healthz/ready`
If the above command succeeds, you now have a load-balanced XTDB cluster accessible over the internet.
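As a further check, you can connect over the Postgres wire protocol with psql; the user and database names below are illustrative only, so adjust them to match your XTDB configuration:
# Connect to the Postgres wire-compatible server exposed on port 5432
psql -h "$ExternalIP" -p 5432 -U xtdb xtdb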