AWS Event Bridge Event Store using DynamoDB

In this post, I would like to show how you can create an Event Store for AWS Event Bridge using Serverless technologies such as AWS Lambda and DynamoDB. First up, let’s discuss the technologies and concepts which we are building.

What is an Event Store?

An event store is a database or storage system designed to capture and persist a sequence of events. It is often used in event sourcing, a software design pattern where the state of an application is determined by a sequence of events rather than the current state alone.

In event sourcing, instead of storing the current state of an object or entity, the event store maintains a log of events that have occurred over time. Each event represents a discrete action or fact that has taken place within the system. These events are appended to the event store in the order they occur, creating an immutable and chronological event stream.

Event stores typically provide the following features:

Event Persistence: Events are stored persistently and durably, ensuring that they are not lost even in the event of system failures.
Append-only: Events are appended to the end of the event store and cannot be modified or deleted. This preserves the integrity and historical accuracy of the event log.
Event Retrieval: Event stores support querying and retrieving events based on different criteria such as time range, entity, or event type. This allows for reconstructing the state of an entity at any point in time by replaying the relevant events.
Scalability: Event stores are designed to handle large volumes of events efficiently, allowing for high-throughput event ingestion and retrieval.

Event stores are commonly used in systems that require auditability, traceability, or the ability to reconstruct the past state of an application. They are often utilized in domains such as financial systems, logistics, distributed systems, and event-driven architectures.

By replaying the events from an event store, applications can rebuild their state or generate projections for various purposes such as generating reports, analytics, or implementing temporal queries.

AWS Event Bridge

AWS EventBridge is a fully managed event bus service provided by Amazon Web Services (AWS). It enables event-driven architectures by allowing applications, services, and AWS resources to communicate and interact with each other through events. These events represent changes, actions, or updates in the system which can target other services and endpoints. Event Rules can be based on event patterns, event content, or time schedules. EventBridge is designed to be highly scalable and reliable, capable of handling a large volume of events with low latency and high throughput. It automatically scales to accommodate event bursts and ensures reliable event delivery.

AWS EventBridge simplifies event-driven architectures by providing a centralized event bus, flexible event routing, event transformation capabilities, and seamless integration with AWS services and third-party applications. It enables decoupled and scalable application architectures, allowing systems to react to events in real-time and build loosely coupled, modular, and extensible applications.

AWS Lambda

AWS Lambda is a serverless compute service provided by Amazon Web Services (AWS). It allows developers to run code without provisioning or managing servers. With Lambda, you can execute your code in response to events, such as changes to data in an Amazon S3 bucket, updates to a DynamoDB table, or HTTP requests via Amazon API Gateway.

Lambda follows a pay-as-you-go pricing model. You are billed only for the actual execution time of your code, measured in milliseconds. This makes Lambda highly cost-effective, as you don’t pay for idle resources. Lambda automatically scales your code in response to incoming requests. It provisions the necessary compute resources to handle the workload and ensures that your code runs reliably, even under heavy traffic.

Developing and deploying Lambda functions is straightforward. AWS provides a comprehensive set of tools and SDKs to help you write, test, and package your code. Deployment is as simple as uploading your code and configuring the event sources. AWS takes care of the rest, including scaling, monitoring, and error handling. AWS Lambda offers built-in logging and monitoring capabilities. You can view logs in Amazon CloudWatch, which provides detailed information about the execution of your functions.

DynamoDB

DynamoDB is a fully managed NoSQL database service designed for scalability, high availability, and low-latency performance. DynamoDB is a fully managed database service, which means AWS takes care of infrastructure provisioning, setup, configuration, and maintenance tasks. DynamoDB’s architecture enables seamless scalability to handle massive workloads. It automatically scales up or down to accommodate the volume of requests, ensuring consistent performance as your application grows.

DynamoDB integrates well with other AWS services and tools. It can be easily combined with AWS Lambda, Amazon S3, Amazon Redshift, Amazon EMR, and more, to create comprehensive and scalable architectures. DynamoDB also provides SDKs for various programming languages, making it accessible to developers using different technologies.

The problem with EventBridge….

AWS EventBridge does offer an archive feature which will store events which can be replayed. You can specify a date range for when you would like to replay these events back to the original event bus so that the consumers are re-triggered. This can be useful when a system had failed or data may have been wrongly interpreted due to a bug in the code.

One of the issues which I have faced during my time with Event Driven Architecture is that it is hard to follow the trail of events. Knowing what events were triggered and where potential errors have occurred can be difficult to determine. Rather than placing logs in different microservices and jumping around the log groups, I have used the Event Store pattern to improve the observability of event driven systems. It can be really useful to check that events are happening and in what order they occurred. By combining AWS Lamba and DynamoDB you can add a highly available serverless solution to this problem with a few simple lines of code. The following snippet has been simplified and can be extended for your use cases.

"use strict";

const REGION = process.env.AWS_REGION;
const ENVIRONMENT = process.env.ENVIRONMENT;
const STORE_TABLE = process.env.STORE_TABLE;
const dynamoDb = require("aws-sdk/clients/dynamodb");
const dynamoDbClient = new dynamoDb.DocumentClient({ region: REGION });

exports.lambdaHandler = async (event) => {
  try {
    await storeEvent(event);
  } catch (error) {
    if (error.code === "ConditionalCheckFailedException") {
      console.warn(`EventId: ${event.id} already exists in the dynamoDB`);
      console.log(error, JSON.stringify(event));
      return;
    }
    console.error(error, JSON.stringify(event));
    throw error;
  }
};

async function storeEvent(event) {
  let params = await generateParams(event);
  return putItem(params);
}

function generateParams(event) {
  const { detail, time, id, source, "detail-type": detailType } = event;
  const eventDetail = JSON.stringify(detail);
  const metaDataTime = extractMetaDataTime(detail);
  const primaryKey = extractPrimaryKey(detail);
  const partitionKeys = generatePartitionKeys(
    primaryKey,
    metaDataTime,
    detailType
  );
  return {
    TableName: STORE_TABLE,
    Item: {
      ...partitionKeys,
      eventDetail,
      primaryKey,
      eventId: id,
      eventTime: time,
      source,
      detailType,
      expireRecord: generateTTL(),
    },
    ConditionExpression: "attribute_not_exists(sk)",
  };
}

async function putItem(params) {
  return dynamoDbClient.put(params).promise();
}

function generateTTL() {
  const expireDate = new Date();
  expireDate.setFullYear(expireDate.getFullYear() + 1);
  return expireDate.getTime() / 1000;
}

function generatePartitionKeys(primaryKey, metaDataTime, detailType) {
  const pk = primaryKey;
  const sk = `${metaDataTime}#${detailType}`;
  return { pk, sk };
}

function extractPrimaryKey(detail) {
  if (Object.prototype.hasOwnProperty.call(detail, "data")) {
    return detail.data.primaryKey;
  } else {
    return detail.primaryKey;
  }
}

function extractMetaDataTime(detail) {
  if (Object.prototype.hasOwnProperty.call(detail, "metaData")) {
    return detail.metaData.time;
  } else {
    return new Date().toISOString();
  }
}

This code is an AWS Lambda function written in JavaScript. The Lambda will be triggered by events coming from EventBridge, which will be transformed into a DynamoDB item. The code will extract a primary key from the event detail’s data object, this is used as the partition key. A good primary key for the event store would be the domain of the event. The sort key is a composite key made up of the metadata date time and detail type. This event struture follows the metadata data event structure pattern. There is a great article here by Sheen Brisals from Lego on EventBridge and how they structure their events. The metadata object can contain information about the event, such as the producer, date time and correlation ID. Here is an example of what an event may look like:

{
  "id": "123456789",
  "detail": {
    "data": {
      "primaryKey": "examplePrimaryKey"
    },
    "metaData": {
      "time": "2023-06-08T04:58:04.873Z"
    }
  },
  "time": "2023-06-07T10:01:23Z",
  "source": "exampleSource",
  "detail-type": "exampleDetailType"
}

One thing to note here is that AWS EventBridge does have a root level date time when the event occurred. I have faced some issues when trying to use that date time as it does not use milliseconds, so any events which have the same detail type and second would overwrite each other. Another issue with this is that it will not help with idempotency and could mean any data which was replayed will create duplicated data in the DynamoDB.

Let’s summarize the functionality of the Lambda handler code from above.

The code begins with “use strict” to enforce strict mode, which helps catch common coding mistakes and enhances code quality. It retrieves environment variables such as AWS_REGION, ENVIRONMENT, and STORE_TABLE, which are used throughout the code.

The code imports the necessary AWS SDK clients, namely DynamoDB and Secrets Manager, using the aws-sdk library. It creates instances of the DynamoDB DocumentClient and SecretsManager clients, configured with the specified region.

The lambdaHandler function is the entry point for the Lambda function. It is an async function that takes an event object as input. Inside the lambdaHandler, the storeEvent function is called to store the event in DynamoDB. It awaits the completion of this function. If an error occurs during the storage process, the code handles specific error cases. If the error is a ConditionalCheckFailedException, it logs a warning indicating that the event already exists in DynamoDB and returns early. Otherwise, it logs an error message and throws the error.

The storeEvent function generates the parameters required to store the event in DynamoDB by calling the generateParams function. The generateParams function extracts relevant data from the event object and constructs the necessary DynamoDB putItem parameters. It creates a JSON document representing the event and includes attributes such as partition keys, event details, event ID, event time, source, detail type, and an expiration timestamp.

The putItem function calls the DynamoDB DocumentClient’s put method to store the item in DynamoDB. It returns a promise that resolves when the put operation completes.

The generateTTL function generates a time-to-live (TTL) value by adding one year to the current date. The generatePartitionKeys function constructs the partition keys for the DynamoDB item using the provided primary key, metadata time, and detail type. The extractPrimaryKey function extracts the primary key from the event’s detail object, depending on its structure. The extractMetaDataTime function extracts the metadata time from the event’s detail object, or if not available, it generates the current timestamp.

Deploying the code using AWS Sam

AWS SAM (Serverless Application Model) is an open-source framework provided by Amazon Web Services (AWS) that simplifies the development, deployment, and management of serverless applications. It extends the capabilities of AWS CloudFormation and provides a higher-level abstraction specifically tailored for serverless architectures.

The following template would deploy a DynamoDB table and a Lambda which will listen to all the events from the custom Orders event bus. Obviously, you could target any event bus in your system here by supplying an ARN of the event bus. The Lambda follows the principal of the least privilege, as it is only allowed the PutItem action on the DynamoDB table.

AWSTemplateFormatVersion: "2010-09-09"
Transform: AWS::Serverless-2016-10-31

Parameters:
  Environment:
    Type: String
    Default: dev

Resources:
  ### Event Sourcing / Event Store
  EventStoreTable:
    Type: AWS::DynamoDB::Table
    Properties:
      BillingMode: PAY_PER_REQUEST
      AttributeDefinitions:
        - AttributeName: "pk"
          AttributeType: "S"
        - AttributeName: "sk"
          AttributeType: "S"
      KeySchema:
        - AttributeName: "pk" # {domain}
          KeyType: "HASH"
        - AttributeName: "sk" # {Timestamp}#{eventType}
          KeyType: "RANGE"
      TimeToLiveSpecification:
        AttributeName: expireRecord
        Enabled: true

  EventStoreFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: src/
      Handler: app.lambdaHandler
      Runtime: nodejs18.x
      Environment:
        Variables:
          STORE_TABLE: !Ref EventStoreTable
          ENVIRONMENT: !Ref Environment
      Policies:
        - Statement:
            - Effect: Allow
              Action:
                - dynamodb:PutItem
              Resource: !GetAtt EventStoreTable.Arn
      Events:
        StoreOrdersEvents:
          Type: EventBridgeRule
          Properties:
            EventBusName: Orders
            Pattern: { "source": [{ "prefix": "" }] } # To match all events

Outputs:
  EventStoreDynamoDBArn:
    Description: "Event Store DynamoDB Arn"
    Value: !Ref EventStoreTable
    Export:
      Name: EventStoreDynamoDBArn

To deploy an AWS SAM template, you can follow these general steps:

Install and Configure the AWS SAM CLI. You can install it by following the instructions provided in the AWS SAM CLI documentation. Once installed, configure it by setting up your AWS credentials using the AWS CLI.
Use the sam deploy -- guided command to deploy the packaged SAM template to your AWS account. This command creates or updates the AWS resources defined in your template stack and will create a config file to speed up further deployments.

Summary

We have discussed a lot of different topics in this post. Hopefully, you can take away how important having an Event Store is and how easily it can be created using Serverless technologies. The example code can be extended to extract key data from your events and store data in a way which you can use to improve your observability of event driven architecture.

If you have any questions then please drop me a message here or connect with me on LinkedIn.