Skip to content

S3

S3 can be used as a shared object-store for XTDB’s remote storage module.

Note

The S3 module uses an SNS topic and SQS queues to maintain a local copy of the file listings on S3, saving on expensive/lengthy operations to list objects on S3.

If not using the CloudFormation stack, ensure that you have a similar setup with the SNS topic setup to receive notifications from your bucket, and that XTDB has all the relevant permissions it needs with S3, SNS, and SQS.

Setup

First, ensure the com.xtdb/xtdb-s3 Maven dependency is added to your XTDB node.

Infrastructure

Note
CloudFormation stack

We provide a parameterized CloudFormation stack to help set up everything that you need.

The stack accepts the (globally unique) name of an S3 bucket as an input - this will be created, and referenced in associated resources - and outputs the SNS topic ARN, to be used in your XTDB configuration.

To use S3 as the object store, the following infrastructure is required:

  1. An S3 bucket.

  2. An SNS topic to receive notifications from the S3 bucket around object creation/deletion.

  3. The bucket should be granted permissions to publish to the SNS topic:

    SNSTopicPolicy:
      Type: AWS::SNS::TopicPolicy
      Properties:
        PolicyDocument:
          Statement:
            - Effect: Allow
              Principal:
                Service: 's3.amazonaws.com'
              Action: sns:Publish
              Resource: !Ref SNSTopic
              Condition:
                ArnEquals:
                  aws:SourceArn: !Ref S3BucketArn
                StringEquals:
                  aws:SourceAccount: !Ref 'AWS::AccountId'
        Topics:
          - !Ref SNSTopic
  4. IAM policies which grant XTDB permission to:

    • the S3 bucket:

      Statement:
      - Effect: Allow
        Action:
          - 's3:GetObject'
          - 's3:PutObject'
          - 's3:DeleteObject'
          - 's3:ListBucket'
          - 's3:AbortMultipartUpload'
          - 's3:ListBucketMultipartUploads'
        Resource:
          - !Ref S3BucketArn
          - !Join [ '', [ !Ref S3BucketArn, '/*'] ]
    • subscribe to the SNS topic:

      Statement:
      - Effect: Allow
        Action:
          - 'sns:Subscribe'
          - 'sns:Unsubscribe'
        Resource:
          - !Ref SNSTopic
    • create and operate on SQS queues:

      Statement:
      - Effect: Allow
        Action:
          - 'sqs:CreateQueue'
          - 'sqs:GetQueueUrl'
          - 'sqs:GetQueueAttributes'
          - 'sqs:DeleteQueue'
          - 'sqs:DeleteMessage'
          - 'sqs:ReceiveMessage'
          - 'sqs:SetQueueAttributes'
        Resource:
          - '*'

Authentication

Authentication is done via the AWS SDK, using the default AWS credential provider chain. See the AWS documentation for setup instructions.

Configuration

To use the S3 module, include the following in your node configuration:

storage: !Remote
  objectStore: !S3
    # -- required

    # The name of the S3 bucket to use for the object store
    # (Can be set as an !Env value)
    bucket: "my-s3-bucket"

    # The ARN of the SNS topic which is collecting notifications from the S3 bucket.
    # (Can be set as an !Env value)
    snsTopicArn: "arn:aws:sns:region:account-id:my-sns-topic"

    # -- optional
    # A file path to prefix all of your files with
    # - for example, if "foo" is provided, all XTDB files will be located under a "foo" sub-directory
    # (Can be set as an !Env value)
    # prefix: my-xtdb-node

  localDiskCache: /var/cache/xtdb/object-store

If configured as an in-process node, you can also specify an S3Configurator instance - this is used to modify the requests sent to S3.

Examples

Clojure Kotlin

For examples on how to enable/configure the S3 module as part of your node, for each client library, see the individual driver documentation: