
Setting up a cluster on Azure

This guide will walk you through the process of configuring and running an XTDB Cluster on Azure. This setup includes:

  • Using Azure Blob Storage as the remote storage implementation.

  • Utilizing Apache Kafka as the shared message log implementation.

  • Exposing the cluster to the internet via a Postgres wire-compatible server and HTTP.

The required Azure infrastructure is provisioned using Terraform, and the XTDB cluster and its resources are deployed on Azure Kubernetes Service (AKS) using Helm.

Although we provide numerous parameters to configure the templates, you are encouraged to edit them, use them as a foundation for more advanced use cases, and reuse existing infrastructure when suitable. These templates serve as a simple starting point for running XTDB on Azure and Kubernetes, and should be adapted to meet your specific needs, especially in production environments.

This guide assumes that you are using the default templates.

Requirements

Before starting, ensure you have the following installed:

  • The Azure CLI (az)

  • Terraform

  • kubectl

  • Helm

Authenticating the Azure CLI

Within Azure, ensure that you have an existing Subscription, and that you are authenticated with the Azure CLI.

Ensure that your existing Subscription has the necessary resource providers - see this article for more information. This guide requires the following providers:

  • Microsoft.ContainerService - for the AKS resources.

  • Microsoft.ManagedIdentity - for the user assigned managed identity resources.

  • Microsoft.Storage - for the storage account resources.
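
If any of these providers are not yet registered on your subscription, you can register them with the Azure CLI (a one-off operation per subscription; registration can take a few minutes to complete):

az provider register --namespace Microsoft.ContainerService
az provider register --namespace Microsoft.ManagedIdentity
az provider register --namespace Microsoft.Storage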

To login to Azure using the command line, run the following:

az login --scope https://management.azure.com//.default

To ensure that CLI commands run against the correct subscription, set it explicitly:

az account set --subscription "Subscription Name"

This allows you to perform necessary operations on Azure via Terraform using the User Principal on the Azure CLI.
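
To confirm which subscription the CLI is currently using, you can run:

az account show --query "{name:name, id:id}" --output table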

Note
There are other ways to authenticate Terraform with Azure besides using the User Principal available via the Azure CLI. For other authentication scenarios, see the azurerm backend authentication docs.

Getting started with Terraform

The following assumes that you are authenticated on the Azure CLI, have Terraform installed on your machine, and are located in a directory that you wish to use as the root of the Terraform configuration.

First, make the following terraform init call:

terraform init -from-module github.com/xtdb/xtdb.git//azure/terraform

This will download the Terraform files from the XTDB repository, and initialize the working directory.

Note
For the sake of this guide, we store Terraform state locally. However, to persist the state onto Azure, you will need to configure a remote backend using Azure Blob Storage. This allows you to share the state file across teams, maintain versioning, and ensure consistency during deployments. For more info, see the Terraform azurerm backend documentation.
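
As a sketch of what such a backend configuration might look like - the resource group, storage account, and container names below are placeholders that you must create beforehand - you could add a file like the following to the root of your Terraform configuration:

cat > backend.tf <<'EOF'
terraform {
  backend "azurerm" {
    resource_group_name  = "tfstate-rg"             # placeholder
    storage_account_name = "tfstatestorage123"      # placeholder - must be globally unique
    container_name       = "tfstate"                # placeholder
    key                  = "xtdb.terraform.tfstate"
  }
}
EOF

After adding a backend, re-run terraform init (passing -migrate-state if you already have local state) to move the state into the container.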

What is being deployed on Azure?

The sample Terraform directory sets up a few distinct parts of the infrastructure required by XTDB. If using the default configuration, the following will be created:

  • XTDB Resource Group and User Assigned Managed Identity

  • Azure Storage Account (with a container for object storage)

  • AKS Cluster

    • Configured with associated resources using the Azure/aks Terraform module.

  • Federated Identity Credential

    • Configures the AKS cluster to access the Azure Storage Account.

    • The credential is set up for the kubernetes_service_account_name within the kubernetes_namespace, both of which are defined in the terraform.tfvars file.

Note
The above infrastructure is designed for creating a simple starting point for running XTDB on Azure & Kubernetes. The VM sizes and resource tiers can & should be adjusted to suit your specific requirements and cost constraints, and the templates should be configured with any desired changes to security or networking configuration.

Deploying the Azure Infrastructure

Before creating the Terraform resources, review and update the terraform.tfvars file to ensure the parameters are correctly set for your environment:

  • You are required to set a unique and valid storage_account_name for your environment.

  • You may also wish to change resource tiers, the location of the resource group, or the VM sizes used by the AKS cluster.

    • The VM sizes used within the examples may not always be available in your subscription - if this is the case, see the Azure VM Sizes document for alternative/equivalent VM sizes that you can use.

    • Ensure that the quota for the VM size and region is set appropriately in Subscription > Settings > Usage + Quotas - a quick way to check this from the CLI is shown below.
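
For example, to check current vCPU usage and quotas in a given region before deploying (eastus here is a placeholder location), you can run:

az vm list-usage --location eastus --output table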

To get a full list of the resources that will be deployed by the templates, run:

terraform plan

Finally, to create the resources, run:

terraform apply

This will create all of the resources within the Azure subscription. Unless you have configured a remote backend, the Terraform state for these resources is stored locally.

Fetching the Terraform Outputs

The Terraform templates will generate several outputs required for setting up the XTDB nodes on the AKS cluster.

To retrieve these outputs, execute the following command:

terraform output

This will return the following outputs:

  • storage_account_container

  • storage_account_name

  • user_managed_identity_client_id
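
To avoid copying these values around by hand, you can capture them into shell variables (the variable names here are arbitrary):

STORAGE_ACCOUNT_CONTAINER=$(terraform output -raw storage_account_container)
STORAGE_ACCOUNT_NAME=$(terraform output -raw storage_account_name)
USER_MANAGED_IDENTITY_CLIENT_ID=$(terraform output -raw user_managed_identity_client_id)

These variables are substituted into the Helm install command later in this guide.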

Deploying on Kubernetes

With the infrastructure created on Azure, you can now deploy the XTDB nodes and a simple Kafka instance on the AKS cluster.

Prior to deploying the Kubernetes resources, ensure that the kubectl CLI is installed and configured to deploy and connect to the AKS cluster. Run the following command:

az aks get-credentials --resource-group xtdb-resource-group --name xtdb-aks-cluster
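
To confirm that kubectl is now pointed at the AKS cluster, you can list its nodes:

kubectl get nodes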

Now that kubectl is authenticated with the AKS cluster, you can set up the namespace for the XTDB deployment:

kubectl create namespace xtdb-deployment

Note
Within the deployed terraform infrastructure, we create a Federated Identity Credential for use by the XTDB statefulset, which by default will expect xtdb-deployment as the namespace and xtdb-service-account as the service account name. If you wish to change these, you will need to update the terraform.tfvars values accordingly.

The AKS cluster is now ready for deployment.


Deploying an example Kafka

To deploy a basic set of Kafka resources within AKS, you can make use of the bitnami/kafka Helm chart. Run the following command:

helm install kafka oci://registry-1.docker.io/bitnamicharts/kafka \
  --namespace xtdb-deployment \
  --set listeners.client.protocol=PLAINTEXT \
  --set listeners.controller.protocol=PLAINTEXT

This command will create:

  • A simple, unauthenticated Kafka deployment on the AKS cluster, which XTDB will use as its message log, along with its dependent infrastructure and persistent storage.

  • A Kubernetes service to expose the Kafka instance to the XTDB cluster.

Considerations of the Kafka Deployment

The Kafka instance set up above is for demonstration purposes only and is not recommended for production use. This example lacks authentication for the Kafka cluster and allows XTDB to manage Kafka topic creation and configuration itself.

For production environments, consider the following:

  • Use a more robust Kafka deployment.

  • Pre-create the required Kafka topics (a sketch follows this list).

  • Configure XTDB appropriately to interact with the production Kafka setup.
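
As a sketch of pre-creating a topic against the example deployment above, the following runs a throwaway client pod using the same bitnami/kafka image - note that the topic name xtdb-log, partition count, and replication factor are illustrative assumptions; check the XTDB Kafka documentation for the topic settings your version requires:

kubectl run kafka-client --rm -it --restart=Never \
  --image docker.io/bitnami/kafka \
  --namespace xtdb-deployment \
  -- kafka-topics.sh --create \
  --topic xtdb-log \
  --bootstrap-server kafka:9092 \
  --partitions 1 \
  --replication-factor 1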


Verifying the Kafka Deployment

After deployment, verify that the Kafka instance is running properly by checking its status and logs.

To check the status of the Kafka deployment, run the following command:

kubectl get pods --namespace xtdb-deployment

To view the logs of the Kafka deployment, use the command:

kubectl logs -f statefulset/kafka-controller --namespace xtdb-deployment

By verifying the status and reviewing the logs, you can ensure the Kafka instance is correctly deployed and ready for use by XTDB.


Deploying the XTDB cluster

In order to deploy the XTDB cluster and its constituent parts into the AKS cluster, we provide an xtdb-azure Helm chart.

This can be found on the XTDB GitHub Container Registry, and can be used directly with helm commands.

With the values from the Terraform outputs, you can now deploy the XTDB cluster. Run the following command, substituting the values as appropriate:

helm install xtdb-azure oci://ghcr.io/xtdb/helm-xtdb-azure \
  --version 2.0.0-snapshot \
  --namespace xtdb-deployment \
  --set xtdbConfig.storageContainerName=<storage_account_container> \
  --set xtdbConfig.storageAccountName=<storage_account_name> \
  --set xtdbConfig.userManagedIdentityClientId=<user_managed_identity_client_id>
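
If you captured the Terraform outputs into shell variables earlier, the same command can be run without manual substitution:

helm install xtdb-azure oci://ghcr.io/xtdb/helm-xtdb-azure \
  --version 2.0.0-snapshot \
  --namespace xtdb-deployment \
  --set xtdbConfig.storageContainerName="$STORAGE_ACCOUNT_CONTAINER" \
  --set xtdbConfig.storageAccountName="$STORAGE_ACCOUNT_NAME" \
  --set xtdbConfig.userManagedIdentityClientId="$USER_MANAGED_IDENTITY_CLIENT_ID"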

The following are created by the templates:

  • A StatefulSet containing the XTDB nodes.

  • A PersistentVolumeClaim for each member of the StatefulSet (default size of 50 GiB, default storage class of managed-csi).

  • A LoadBalancer Kubernetes service to expose the XTDB cluster to the internet.

  • A ClusterIP service for exposing the Prometheus metrics from the nodes.

  • A ServiceAccount used for authenticating the XTDB nodes with the Azure Storage Account (set up with the Federated Identity Credential in the Terraform deployment step).

To check the status of the XTDB statefulset, run:

kubectl get statefulset --namespace xtdb-deployment

To view the logs of an individual StatefulSet member, run the following, replacing <n> with the member's ordinal (for example, xtdb-statefulset-0):

kubectl logs -f xtdb-statefulset-<n> --namespace xtdb-deployment

Customizing the XTDB Deployment

The above deployment uses the xtdb-azure chart defaults, setting the Terraform outputs individually as xtdbConfig values on the command line.

For more information on the available configuration options and fetching the charts locally for customization, see the xtdb-azure Helm documentation.
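
For example, to pull the chart locally and inspect its default values before customizing, you can use the standard helm pull and helm show values commands:

helm pull oci://ghcr.io/xtdb/helm-xtdb-azure --version 2.0.0-snapshot --untar
helm show values oci://ghcr.io/xtdb/helm-xtdb-azure --version 2.0.0-snapshot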


Accessing the XTDB Cluster

Once the XTDB cluster is up and running, you can access it via the LoadBalancer service that was created.

To get the external IP of the LoadBalancer service, run:

kubectl get svc xtdb-service --namespace xtdb-deployment

This will return the external IP of the LoadBalancer service. You can use this IP to access the XTDB cluster via the Postgres Wire Server (on port 5432), or over the HTTP Server (on port 3000).
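
The external IP can take a minute or two to be assigned. To capture it into a shell variable for the commands below (matching the $ExternalIP reference used there):

ExternalIP=$(kubectl get svc xtdb-service --namespace xtdb-deployment \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}')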

To check the status of the XTDB cluster using the HTTP server, run:

curl http://$ExternalIP:3000/status

If the above command succeeds, you now have a load-balanced XTDB cluster accessible over the internet.
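
With psql installed, you can also connect over the Postgres wire protocol - note that the user and database names here (xtdb) are illustrative assumptions; adjust them to match your configuration:

psql -h "$ExternalIP" -p 5432 -U xtdb xtdb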