Important: Streaming Data Solution for Amazon MSK will retire on January 18, 2025. After that time, all existing deployments will continue to work and existing customers will retain full control of their environments and data; however, the Solution will no longer be supported or maintained.
Overview
Streaming Data Solution for Amazon MSK allows you to capture streaming data using Amazon Managed Streaming for Apache Kafka (Amazon MSK), a scalable streaming storage service that can handle high data volumes from data producers. A producer can consist of thousands of data sources, each continuously generating streaming data. These sources typically send small records (on the order of kilobytes) simultaneously.
Streaming data comes from a wide variety of sources, such as log files generated by customers using mobile or web applications, ecommerce purchases, in-game player activity, information from social networks, financial trading floors, or geospatial services, and telemetry from connected devices or instrumentation in data centers.
This AWS Solution provides four AWS CloudFormation templates in which data flows through producers, streaming storage, consumers, and destinations. Similar to Streaming Data Solution for Amazon Kinesis, the templates are configured to apply best practices for securing data and for monitoring functionality with dashboards and alarms.
Technical details
You can automatically deploy this architecture using the implementation guide and the accompanying AWS CloudFormation templates.
Option 1
AWS CloudFormation template using Amazon Managed Streaming for Apache Kafka (Amazon MSK)
Step 1
This AWS CloudFormation template deploys an Amazon Managed Streaming for Apache Kafka (MSK) cluster.
Step 2
An Amazon Cognito user pool is used to control who can invoke REST API methods.
Option 2
AWS CloudFormation template using Amazon MSK and AWS Lambda
Step 1
This CloudFormation template deploys an AWS Lambda function that processes records in an Apache Kafka topic. The default function is a Node.js application that logs the received messages, but it can be customized to meet your business needs.
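As an illustration of what such a function does, the following is a minimal sketch of a handler that decodes and logs the messages it receives. It is written in Python rather than Node.js and is not the code shipped with the solution. It assumes the event layout that Lambda uses for Amazon MSK event sources, where `records` maps a topic-partition key to a list of records whose `value` field is base64-encoded. The names `decode_records` and `handler` are hypothetical.

```python
import base64
import json
import logging

logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)


def decode_records(event):
    """Flatten an Amazon MSK event into a list of decoded messages."""
    messages = []
    # "records" maps "<topic>-<partition>" to a list of Kafka records.
    for batch in event.get("records", {}).values():
        for record in batch:
            # Kafka values arrive base64-encoded in the Lambda event.
            raw = base64.b64decode(record["value"])
            messages.append({
                "topic": record["topic"],
                "partition": record["partition"],
                "offset": record["offset"],
                # Assumes UTF-8 text; undecodable bytes are replaced.
                "value": raw.decode("utf-8", errors="replace"),
            })
    return messages


def handler(event, context):
    """Log each received message and report how many were processed."""
    messages = decode_records(event)
    for message in messages:
        logger.info(json.dumps(message))
    return {"processed": len(messages)}
```

Customizing the function for a business need would typically mean replacing the logging call with the processing logic, while keeping the decoding step unchanged.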
Option 3
AWS CloudFormation template using Amazon MSK, AWS Lambda, and Amazon Kinesis Data Firehose
Step 1
A Lambda function processes records in an Apache Kafka topic.
Step 2
An Amazon Data Firehose delivery stream buffers data before delivering it to the destination.
Step 3
An Amazon Simple Storage Service (Amazon S3) bucket stores all original events from the Amazon MSK cluster.
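The flow in these steps can be sketched as follows. This is an illustrative Python example under stated assumptions, not the solution's own function: it decodes the base64-encoded Kafka values from the Lambda event and forwards them to a delivery stream with the Firehose `PutRecordBatch` API, which accepts at most 500 records per call. The helper names, the `DELIVERY_STREAM_NAME` environment variable, and the newline delimiter are hypothetical choices. The sketch does not handle the per-call size limit or retry failed records.

```python
import base64
import os

MAX_BATCH = 500  # PutRecordBatch accepts at most 500 records per call


def to_firehose_records(event):
    """Convert an Amazon MSK Lambda event into Firehose record dicts."""
    records = []
    for batch in event.get("records", {}).values():
        for record in batch:
            payload = base64.b64decode(record["value"])
            # Assumption: append a newline so delivered objects are line-delimited.
            records.append({"Data": payload + b"\n"})
    return records


def forward(event, firehose, stream_name):
    """Send all records in chunks and count how many Firehose rejected."""
    records = to_firehose_records(event)
    failed = 0
    for start in range(0, len(records), MAX_BATCH):
        chunk = records[start:start + MAX_BATCH]
        response = firehose.put_record_batch(
            DeliveryStreamName=stream_name, Records=chunk
        )
        failed += response.get("FailedPutCount", 0)
    return {"sent": len(records), "failed": failed}


def handler(event, context):
    # boto3 is imported here so the helpers above can be exercised without AWS.
    import boto3

    firehose = boto3.client("firehose")
    # DELIVERY_STREAM_NAME is a hypothetical environment variable.
    return forward(event, firehose, os.environ["DELIVERY_STREAM_NAME"])
```

Passing the client into `forward` keeps the batching logic separate from the AWS call, so it can be checked locally with a stand-in object before deployment.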
Option 4
AWS CloudFormation template using Amazon MSK, Amazon Managed Service for Apache Flink, and Amazon S3
Step 1
An Amazon Managed Service for Apache Flink Studio notebook reads events from an existing topic in an Amazon MSK cluster.
Step 2
An S3 bucket stores the output.
Related content
This post covers patterns and solutions for backing up MSK topics to S3, which lets customers shorten long-term data retention settings in MSK. Some customers store long-term data in MSK for data analytics and machine learning workloads. The post shares a pattern that simplifies this architecture by offloading topic data to S3 and using S3 for analytics and ML.
In this self-paced course, you learn how to plan data analysis solutions and about the various data analytics processes involved.