Setting up a cluster on AWS
This guide will walk you through the process of configuring an XTDB cluster on AWS, featuring a customizable number of nodes operating on ECS, all accessible via a load balancer.
We’ll utilize a set of provided CloudFormation templates available in the XTDB repository. These will set up a cluster of XTDB nodes that use S3 as the XTDB object store, and MSK-hosted Kafka as the XTDB log.
While we provide numerous parameters to configure these templates, you’re encouraged to edit them for more advanced use-cases or to reuse existing infrastructure where appropriate.
The guide assumes that you are using the default templates to set everything up.
The provided templates:
- xtdb-vpc: Sets up a sample VPC to be used by the other infrastructure
- xtdb-s3: Sets up all S3 infrastructure & permissions required for an XTDB Object Store
- xtdb-msk: Sets up an MSK Kafka cluster for use as the XTDB log
  - Depends on xtdb-vpc
- xtdb-alb: Sets up an Application Load Balancer and all relevant infrastructure for it to route traffic to the ECS service
  - Depends on xtdb-vpc
- xtdb-ecs: Sets up XTDB on ECS, running on EC2 instances
  - Depends on xtdb-vpc, xtdb-s3, xtdb-msk and xtdb-alb
Setting up the stacks
Access the AWS CloudFormation dashboard (or use the CLI) and follow the instructions to set up each stack in sequence.
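If you prefer the CLI, each stack can be created with aws cloudformation create-stack. Below is a minimal sketch for the VPC stack - the stack name and the local template file path (xtdb-vpc.yml here) are assumptions, so adjust them to match your copy of the templates:

```bash
# Sketch: create the VPC stack from a local copy of the template.
# "xtdb-vpc.yml" is a placeholder file name - adjust to match your checkout.
aws cloudformation create-stack \
  --stack-name xtdb-vpc \
  --template-body file://xtdb-vpc.yml

# Block until the stack has finished creating before starting dependent stacks.
aws cloudformation wait stack-create-complete --stack-name xtdb-vpc
```

The same pattern applies to the other stacks, adding --parameters and --capabilities CAPABILITY_IAM where a template takes inputs or creates IAM resources.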
Setting up the VPC
The xtdb-vpc template creates a sample VPC with networking components to be used by the other infrastructure. This can be uploaded as is - no additional parameters are needed.
It will output the following:
- VpcId: ID of the created VPC
- PublicSubnets: IDs of the VPC’s public subnets
- PrivateSubnets: IDs of the VPC’s private subnets
- SecurityGroupId: ID of the VPC’s security group
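These outputs are needed as inputs to the later stacks. If you are working from the CLI, a sketch of listing them (assuming the stack was named xtdb-vpc) is:

```bash
# List the outputs (VpcId, PublicSubnets, PrivateSubnets, SecurityGroupId) of the VPC stack.
aws cloudformation describe-stacks \
  --stack-name xtdb-vpc \
  --query "Stacks[0].Outputs"
```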
Note: If you wish to group all of the resources created from all the templates under a single tag, you can apply tags at the stack level when uploading the templates. This is optional, but may be useful for finding resources later.
Setting up the S3 resources
The xtdb-s3 template sets up the infrastructure & permissions necessary to use S3 as an XTDB Object Store.
It doesn’t depend on xtdb-vpc, so you can upload both simultaneously. It requires the following input parameter:
- S3BucketName: Desired name of the bucket which will contain the XTDB Object Store
It will output the following:
- BucketName: Name of the created S3 bucket (identical to S3BucketName)
- S3AccessPolicyArn: ARN of the managed policy granting all of the relevant S3 permissions
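As a sketch, creating this stack from the CLI with its single parameter might look like the following - the stack name, template file name and bucket name are placeholders:

```bash
# Sketch: create the S3 stack, passing the desired bucket name.
# "my-xtdb-object-store" is a placeholder - S3 bucket names must be globally unique.
# CAPABILITY_IAM is included since the template creates an IAM access policy.
aws cloudformation create-stack \
  --stack-name xtdb-s3 \
  --template-body file://xtdb-s3.yml \
  --capabilities CAPABILITY_IAM \
  --parameters ParameterKey=S3BucketName,ParameterValue=my-xtdb-object-store
```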
Setting up the MSK cluster
The xtdb-msk template establishes an MSK Kafka cluster to be used as the XTDB log, including necessary permissions. It requires xtdb-vpc to be created first.
It takes the following input parameters:
- VpcId: ID of the VPC to host the MSK cluster on (from xtdb-vpc)
- SecurityGroupId: ID of the security group to be used by the ECS nodes (from xtdb-vpc)
- PrivateSubnets: List of private subnets to host the MSK brokers on, at least two (from xtdb-vpc)
- MSKClusterName: Desired name for the MSK Kafka cluster - defaults to xtdb-cluster-kafka
- MSKVolumeSize: The size in GiB of the EBS volume for the data drive on each broker node of the Kafka cluster - defaults to 100
It will output the following:
- MSKClusterArn: ARN of the created MSK cluster
- MSKClusterName: Name of the created MSK cluster
- MSKAccessPolicyArn: ARN of the managed policy granting MSK permissions
The xtdb-msk stack will take longer to set up - usually around thirty minutes. This is because MSK will need to provision and set up Kafka. In the interim, you can set up xtdb-alb (see below).
Following the creation of the xtdb-msk stack, it’s crucial to perform an additional step to enable configuration of the XTDB nodes: retrieving the bootstrap servers from MSK.
Fetching the bootstrap servers
After the xtdb-msk stack has been successfully created, you will need to get the list of Kafka bootstrap servers that the nodes will connect to. As these cannot be returned directly from CloudFormation itself, you will need to fetch them manually from the cluster on AWS.
The steps to do so (in the AWS console) are as follows:
- Go to the MSK dashboard in the console.
- This should lead you to a list of clusters - select whichever one has the same name as MSKClusterName (this defaults to xtdb-cluster-kafka if left unaltered).
- You should see an overview of the created MSK cluster - at the top, there will be a "Cluster summary" with a link to "View client information". Click on this.
- This should lead to a list of bootstrap servers - there should be a copy button on the "PLAINTEXT" ones.
- These should be in a comma-separated list - there should be two of them in total.
- The bootstrap server URLs should look similar to the following: b-2.<MSKClusterName>.<ID>.c5.kafka.<Region>.amazonaws.com:9092.
- Ensure they end with port 9092, as these are the PLAINTEXT URLs.
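Alternatively, the same information can be fetched with the AWS CLI rather than the console. A minimal sketch, using the MSKClusterArn output from the xtdb-msk stack:

```bash
# Fetch the bootstrap broker URLs for the cluster.
# The PLAINTEXT (port 9092) servers are returned under "BootstrapBrokerString".
aws kafka get-bootstrap-brokers --cluster-arn <MSKClusterArn>
```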
Setting up the load balancer
The xtdb-alb template configures an Application Load Balancer to route traffic across the ECS service. It requires xtdb-vpc to have been created first.
It takes the following input parameters:
- VpcId: ID of the VPC to host the load balancer on (from xtdb-vpc)
- SecurityGroupId: ID of the security group to be used by the ECS nodes (from xtdb-vpc)
- PublicSubnets: List of public subnets to host the load balancer on (from xtdb-vpc)
It will output the following:
- TargetGroupArn: ARN of the created target group
- LoadBalancerArn: ARN of the created Application Load Balancer
- LoadBalancerUrl: The load-balanced XTDB node URL - 'http://${ECSALB.DNSName}'
Setting up the nodes on ECS
The xtdb-ecs template will set up our XTDB cluster running as an ECS service, and will require all of the prior stacks to be created.
It splits its inputs into two distinct sections - parameters/resources from other stacks, and the desired ECS configuration.
Expected input parameters from other resources/stacks:
- SecurityGroupId: ID of the security group to be used by the ECS nodes (from xtdb-vpc)
- PublicSubnets: List of public subnets to host the load balancer on (from xtdb-vpc)
- TargetGroupArn: ARN of the target group created for the nodes (from xtdb-alb)
- LoadBalancerArn: ARN of the Application Load Balancer created for the nodes (from xtdb-alb)
- S3BucketName: Name of the S3 bucket to use as the XTDB object store (from xtdb-s3)
- S3AccessPolicyArn: ARN of the managed policy granting all of the S3 permissions necessary for the object store (from xtdb-s3)
- MSKBootstrapServers: Comma-separated list containing all Kafka bootstrap server URLs from MSK (needs to be fetched manually from the MSK cluster info - see "Fetching the bootstrap servers")
- MSKAccessPolicyArn: ARN of the managed policy granting the MSK permissions (from xtdb-msk)

Expected input parameters for the configuration of ECS:
- ClusterName: Name of the desired ECS cluster - defaults to xtdb-cluster
- EC2InstanceType: EC2 instance type used for the ECS service - defaults to i3.large (storage optimized)
- DesiredCapacity: Number of EC2 instances to launch in your ECS cluster / XTDB node tasks to run - defaults to 1
- ImageId: Used to grab an 'ECS optimized' image from the SSM Parameter Store (we recommend leaving this as the default)
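As with the earlier stacks, this one can be created from the CLI. The sketch below wires in the outputs of the other stacks - every value is a placeholder to be replaced with your own outputs, the ECS configuration parameters are left at their defaults, and values that themselves contain commas (the subnet list and the bootstrap servers) have their commas escaped for the CLI's shorthand parameter syntax:

```bash
# Sketch: create the ECS stack, wiring in outputs from the other stacks.
# All <...> values are placeholders - substitute your own stack outputs.
aws cloudformation create-stack \
  --stack-name xtdb-ecs \
  --template-body file://xtdb-ecs.yml \
  --capabilities CAPABILITY_IAM \
  --parameters \
    'ParameterKey=SecurityGroupId,ParameterValue=<SecurityGroupId>' \
    'ParameterKey=PublicSubnets,ParameterValue=<subnet-1>\,<subnet-2>' \
    'ParameterKey=TargetGroupArn,ParameterValue=<TargetGroupArn>' \
    'ParameterKey=LoadBalancerArn,ParameterValue=<LoadBalancerArn>' \
    'ParameterKey=S3BucketName,ParameterValue=<S3BucketName>' \
    'ParameterKey=S3AccessPolicyArn,ParameterValue=<S3AccessPolicyArn>' \
    'ParameterKey=MSKBootstrapServers,ParameterValue=<broker-1>:9092\,<broker-2>:9092' \
    'ParameterKey=MSKAccessPolicyArn,ParameterValue=<MSKAccessPolicyArn>'
```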
After creation, there will be a cluster of XTDB nodes running on ECS with the desired configuration, accessible via the LoadBalancerUrl from xtdb-alb.
Accessing the node
With the stacks set up in AWS, you should now be able to make calls to the nodes over HTTP using the LoadBalancerUrl from the Application Load Balancer. For example, you can make a GET request to fetch the status of one of the nodes.
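A minimal sketch, assuming the load balancer stack was created as xtdb-alb (otherwise substitute the LoadBalancerUrl output directly):

```bash
# Grab the LoadBalancerUrl output from the xtdb-alb stack, then hit the status endpoint.
LoadBalancerUrl=$(aws cloudformation describe-stacks \
  --stack-name xtdb-alb \
  --query "Stacks[0].Outputs[?OutputKey=='LoadBalancerUrl'].OutputValue" \
  --output text)

curl $LoadBalancerUrl/status
```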
Note: As our nodes are behind an Application Load Balancer, be aware that messages sent over HTTP will be spread across the nodes, so you may see some differing values coming back from the status as each node in the cluster processes new transactions.
Should the above be successful, you should be ready to go with an XTDB cluster!