Amazon Kinesis Data Streams FAQs

Q: What is an Amazon Kinesis Application?

An Amazon Kinesis Application is a data consumer that reads and processes data from an Amazon Kinesis data stream. You can build your applications using Amazon Kinesis Data Analytics, the Amazon Kinesis API, or the Amazon Kinesis Client Library (KCL).

Q: What is Amazon Kinesis Client Library (KCL)?

Amazon Kinesis Client Library (KCL) for Java | Python | Ruby | Node.js | .NET is a pre-built library that helps you easily build Amazon Kinesis Applications for reading and processing data from an Amazon Kinesis data stream.

KCL handles complex issues such as adapting to changes in data stream volume, load-balancing streaming data, coordinating distributed services, and processing data with fault-tolerance. KCL enables you to focus on business logic while building applications. KCL 2.x supports both the HTTP/1 GetRecords and HTTP/2 SubscribeToShard APIs with enhanced fan-out for retrieving data from a stream. KCL 1.x does not support the SubscribeToShard API or enhanced fan-out.

Q: How do I upgrade from KCL 1.x to 2.x to use SubscribeToShard and enhanced fan-out?

Visit the Kinesis Data Streams user documentation to learn how to upgrade from KCL 1.x to KCL 2.x.

Q: What is the SubscribeToShard API?

The SubscribeToShard API is a high-performance streaming API that pushes data from shards to consumers over a persistent connection without a request cycle from the client. The SubscribeToShard API uses the HTTP/2 protocol to deliver data to registered consumers whenever new data arrives on the shard, typically within 70 ms, offering ~65% faster delivery compared to the GetRecords API. Delivery stays fast even when multiple registered consumers are reading from the same shard.

Q: Can I use SubscribeToShard without using enhanced fan-out?

No, SubscribeToShard requires the use of enhanced fan-out, which means you also need to register your consumer with the Kinesis Data Streams service before you can use SubscribeToShard.

Q: How long does the SubscribeToShard persistent connection last?

The persistent connection can last up to 5 minutes.

Q: Does the Kinesis Client Library (KCL) support SubscribeToShard?

Yes, version 2.x of the KCL uses SubscribeToShard and enhanced fan-out to retrieve data with high performance from a Kinesis data stream.

Q: Is there a cost associated with using SubscribeToShard?

No, there is no additional cost associated with SubscribeToShard itself. However, you must use SubscribeToShard with enhanced fan-out, which does have an additional hourly cost for each consumer-shard combination and a cost per GB of data delivered by enhanced fan-out.

Q: Do I need to use enhanced fan-out if I want to use SubscribeToShard?

Yes, to use SubscribeToShard you need to register your consumers, and registration activates enhanced fan-out. By default, your consumer will utilize enhanced fan-out automatically when data is retrieved via SubscribeToShard.

Q: What is Amazon Kinesis Connector Library?

Amazon Kinesis Connector Library is a pre-built library that helps you easily integrate Amazon Kinesis Data Streams with other AWS services and third-party tools. Amazon Kinesis Client Library (KCL) for Java | Python | Ruby | Node.js | .NET is required for using Amazon Kinesis Connector Library. The current version of this library provides connectors to Amazon DynamoDB, Amazon Redshift, Amazon S3, and Elasticsearch. The library also includes sample connectors of each type, plus Apache Ant build files for running the samples.

Q: What is Amazon Kinesis Storm Spout?

Amazon Kinesis Storm Spout is a pre-built library that helps you easily integrate Amazon Kinesis Data Streams with Apache Storm. The current version of Amazon Kinesis Storm Spout fetches data from an Amazon Kinesis data stream and emits it as tuples. You add the spout to your Storm topology to use Amazon Kinesis Data Streams as a reliable, scalable service for stream capture, storage, and replay.

Q: Which programming languages are Amazon Kinesis Client Library (KCL), Amazon Kinesis Connector Library, and Amazon Kinesis Storm Spout available in?

Amazon Kinesis Client Library (KCL) is currently available in Java, Python, Ruby, Node.js, and .NET. Amazon Kinesis Connector Library and Amazon Kinesis Storm Spout are currently available in Java. We are looking to add support for other programming languages.

Q: Do I have to use Amazon Kinesis Client Library (KCL) for my Amazon Kinesis Application?

No, you can also use Amazon Kinesis API to build your Amazon Kinesis Application. However, we recommend using Amazon Kinesis Client Library (KCL) for Java | Python | Ruby | Node.js | .NET if applicable because it performs heavy-lifting tasks associated with distributed stream processing, making it more productive to develop applications.

Q: How does Amazon Kinesis Client Library (KCL) interact with an Amazon Kinesis Application?

Amazon Kinesis Client Library (KCL) for Java | Python | Ruby | Node.js | .NET acts as an intermediary between Amazon Kinesis Data Streams and your Amazon Kinesis Application. KCL uses the IRecordProcessor interface to communicate with your application. Your application implements this interface, and KCL calls into your application code using the methods in this interface.

Q: What is a worker and a record processor generated by Amazon Kinesis Client Library (KCL)?

An Amazon Kinesis Application can have multiple application instances and a worker is the processing unit that maps to each application instance. A record processor is the processing unit that processes data from a shard of an Amazon Kinesis data stream. One worker maps to one or more record processors. One record processor maps to one shard and processes records from that shard.

At startup, an application calls into Amazon Kinesis Client Library (KCL) for Java | Python | Ruby | Node.js | .NET to instantiate a worker. This call provides KCL with configuration information for the application, such as the data stream name and AWS credentials. This call also passes a reference to an IRecordProcessorFactory implementation. KCL uses this factory to create new record processors as needed to process data from the data stream. KCL communicates with these record processors using the IRecordProcessor interface.
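The relationship between a worker, a record processor factory, and record processors can be sketched in plain Python. This is a minimal illustration, not KCL code: the class names `RecordProcessor`, `CountingProcessor`, and `Worker`, and the method names `initialize`, `process_records`, `shutdown`, and `dispatch`, are assumptions chosen for the example.

```python
from abc import ABC, abstractmethod


class RecordProcessor(ABC):
    """Hypothetical stand-in for the IRecordProcessor interface."""

    @abstractmethod
    def initialize(self, shard_id):
        """Called once, before any records from the shard are delivered."""

    @abstractmethod
    def process_records(self, records):
        """Called with each batch of records read from the shard."""

    @abstractmethod
    def shutdown(self, reason):
        """Called when the processor stops handling the shard."""


class CountingProcessor(RecordProcessor):
    """Example business logic: count the records seen on one shard."""

    def __init__(self):
        self.shard_id = None
        self.count = 0
        self.closed = False

    def initialize(self, shard_id):
        self.shard_id = shard_id

    def process_records(self, records):
        self.count += len(records)

    def shutdown(self, reason):
        self.closed = True


class Worker:
    """Toy worker: creates one record processor per shard through a factory."""

    def __init__(self, stream_name, processor_factory):
        self.stream_name = stream_name
        self.processor_factory = processor_factory
        self.processors = {}  # shard_id -> RecordProcessor

    def dispatch(self, shard_id, records):
        # Create a processor the first time a shard is seen, then reuse it.
        if shard_id not in self.processors:
            processor = self.processor_factory()
            processor.initialize(shard_id)
            self.processors[shard_id] = processor
        self.processors[shard_id].process_records(records)


worker = Worker("example-stream", CountingProcessor)
worker.dispatch("shardId-000000000000", [b"a", b"b"])
worker.dispatch("shardId-000000000001", [b"c"])
worker.dispatch("shardId-000000000000", [b"d"])
```

Here the worker ends up with two record processors, one per shard, which mirrors the "one record processor maps to one shard" rule described above.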

Q: How does Amazon Kinesis Client Library (KCL) keep track of data records being processed by an Amazon Kinesis Application?

Amazon Kinesis Client Library (KCL) for Java | Python | Ruby | Node.js | .NET automatically creates an Amazon DynamoDB table for each Amazon Kinesis Application to track and maintain state information such as resharding events and sequence number checkpoints. The DynamoDB table has the same name as the application, so you need to make sure your application name doesn’t conflict with any existing DynamoDB tables under the same account within the same region.

All workers associated with the same application name are assumed to be working together on the same Amazon Kinesis data stream. If you run an additional instance of the same application code, but with a different application name, KCL treats the second instance as an entirely separate application also operating on the same data stream.

Please note that your account will be charged for the costs associated with the Amazon DynamoDB table in addition to the costs associated with Amazon Kinesis Data Streams.

For more information about how KCL tracks application state, see Tracking Amazon Kinesis Application state.
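The idea that state is keyed by application name can be illustrated with an in-memory stand-in for the tracking table. This is only a sketch: `CheckpointStore` and its methods are hypothetical, and it does not reproduce the real table's schema.

```python
class CheckpointStore:
    """In-memory stand-in for the per-application state tables."""

    def __init__(self):
        # application name -> {shard_id: last checkpointed sequence number}
        self._tables = {}

    def checkpoint(self, application_name, shard_id, sequence_number):
        table = self._tables.setdefault(application_name, {})
        table[shard_id] = sequence_number

    def last_checkpoint(self, application_name, shard_id):
        return self._tables.get(application_name, {}).get(shard_id)


store = CheckpointStore()
# Two workers sharing one application name update the same state.
store.checkpoint("orders-app", "shardId-000000000000", "100")
store.checkpoint("orders-app", "shardId-000000000000", "250")
# The same code run under another application name is tracked separately.
store.checkpoint("audit-app", "shardId-000000000000", "42")
```

Because the two application names map to separate state, `audit-app` reads the same shard from its own position without affecting `orders-app`.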

Q: How can I automatically scale up the processing capacity of my Amazon Kinesis Application using Amazon Kinesis Client Library (KCL)?

You can create multiple instances of your Amazon Kinesis Application and have these application instances run across a set of Amazon EC2 instances that are part of an Auto Scaling group. When processing demand increases, a new Amazon EC2 instance running your application instance is launched automatically. Amazon Kinesis Client Library (KCL) for Java | Python | Ruby | Node.js | .NET will generate a worker for this new instance and automatically move record processors from overloaded existing instances to this new instance.
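The effect of adding an instance can be pictured with a simple round-robin assignment of shards to workers. This is an illustration only and is not how KCL balances work internally; the function `assign_shards` and the worker and shard names are made up for the example.

```python
def assign_shards(shard_ids, worker_ids):
    """Spread shards over workers round-robin; returns {worker_id: [shard_id, ...]}."""
    if not worker_ids:
        raise ValueError("at least one worker is required")
    assignment = {worker_id: [] for worker_id in worker_ids}
    for index, shard_id in enumerate(sorted(shard_ids)):
        owner = worker_ids[index % len(worker_ids)]
        assignment[owner].append(shard_id)
    return assignment


shards = [f"shard-{n}" for n in range(6)]
# Two instances share six shards, three record processors each.
before = assign_shards(shards, ["worker-a", "worker-b"])
# A third instance joins and each worker now handles two shards.
after = assign_shards(shards, ["worker-a", "worker-b", "worker-c"])
```

With six shards, each of the two original workers drops from three record processors to two once the third worker joins.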

Q: Why does a GetRecords call return an empty result while there is data within my Amazon Kinesis data stream?

One possible reason is that there is no record at the position specified by the current shard iterator. This can happen even if you are using TRIM_HORIZON as the shard iterator type. An Amazon Kinesis data stream represents a continuous stream of data. You should call the GetRecords operation in a loop; records are returned once the shard iterator advances to the position where they are stored.
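The loop described above might look like the following sketch. The call names and response keys (`get_shard_iterator`, `get_records`, `ShardIterator`, `NextShardIterator`, `Records`) follow the shape of the boto3 Kinesis client, while `read_until_records` and `FakeKinesisClient` are hypothetical helpers so the example runs without an AWS account.

```python
def read_until_records(client, stream_name, shard_id, max_calls=10):
    """Call GetRecords in a loop until a call returns records or calls run out.

    Returns (records, number_of_get_records_calls).
    """
    iterator = client.get_shard_iterator(
        StreamName=stream_name,
        ShardId=shard_id,
        ShardIteratorType="TRIM_HORIZON",
    )["ShardIterator"]
    calls = 0
    while calls < max_calls and iterator is not None:
        response = client.get_records(ShardIterator=iterator)
        calls += 1
        if response["Records"]:
            return response["Records"], calls
        # An empty result is not the end of the data: follow the next iterator.
        iterator = response.get("NextShardIterator")
    return [], calls


class FakeKinesisClient:
    """Test double that returns empty results for the first few calls."""

    def __init__(self, empty_calls, records):
        self._empty_calls = empty_calls
        self._records = records
        self._calls = 0

    def get_shard_iterator(self, StreamName, ShardId, ShardIteratorType):
        return {"ShardIterator": "iterator-0"}

    def get_records(self, ShardIterator):
        self._calls += 1
        next_iterator = f"iterator-{self._calls}"
        if self._calls <= self._empty_calls:
            return {"Records": [], "NextShardIterator": next_iterator}
        return {"Records": self._records, "NextShardIterator": next_iterator}


client = FakeKinesisClient(empty_calls=2, records=[{"Data": b"hello"}])
records, calls = read_until_records(client, "example-stream", "shardId-000000000000")
```

In this run the first two calls come back empty and the third returns the record. A real consumer would typically also pause between calls rather than looping as fast as possible.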

Q: What is the ApproximateArrivalTimestamp returned by the GetRecords operation?

Each record includes a value called ApproximateArrivalTimestamp. It is set when the record is successfully received and stored by Amazon Kinesis. This timestamp has millisecond precision, but there are no guarantees about its accuracy. For example, records in a shard or across a data stream might have timestamps that are out of order.
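Because timestamps may be out of order, consumer code should not assume they increase from one record to the next. The sketch below flags such cases in a batch. The sample records and the helper `out_of_order_positions` are invented for illustration; only the `ApproximateArrivalTimestamp` field name comes from the API.

```python
from datetime import datetime, timezone


def out_of_order_positions(records):
    """Return indexes of records whose timestamp is earlier than the previous one."""
    positions = []
    for index in range(1, len(records)):
        current = records[index]["ApproximateArrivalTimestamp"]
        previous = records[index - 1]["ApproximateArrivalTimestamp"]
        if current < previous:
            positions.append(index)
    return positions


# Hypothetical batch, listed in the order the records were returned.
batch = [
    {"SequenceNumber": "1",
     "ApproximateArrivalTimestamp": datetime(2024, 1, 1, 0, 0, 0, 500000, tzinfo=timezone.utc)},
    {"SequenceNumber": "2",
     "ApproximateArrivalTimestamp": datetime(2024, 1, 1, 0, 0, 0, 200000, tzinfo=timezone.utc)},
    {"SequenceNumber": "3",
     "ApproximateArrivalTimestamp": datetime(2024, 1, 1, 0, 0, 1, 0, tzinfo=timezone.utc)},
]
suspect = out_of_order_positions(batch)
```

In this batch the second record carries an earlier timestamp than the first, so it is the only one flagged.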

Q: What happens if the capacity limits of an Amazon Kinesis data stream are exceeded while an Amazon Kinesis Application reads data from the data stream?

The capacity limits of an Amazon Kinesis data stream are defined by the number of shards within the data stream. The limits can be exceeded by either data throughput or the number of read data calls. While the capacity limits are exceeded, read data calls are rejected with a ProvisionedThroughputExceeded exception. If this is due to a temporary rise in the data stream’s output data rate, retries by the Amazon Kinesis Application will eventually succeed. If this is due to a sustained rise in the data stream’s output data rate, you should increase the number of shards within your data stream to provide enough capacity for the read data calls to consistently succeed. In both cases, Amazon CloudWatch metrics let you track changes in the data stream’s output data rate and the occurrence of ProvisionedThroughputExceeded exceptions.
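For the temporary case, retrying can be sketched as exponential backoff with jitter. This is one common approach and not something the FAQ prescribes. The exception class `ThroughputExceeded`, the function `call_with_backoff`, and the delay values are assumptions for the example; the `sleep` parameter is injectable so the demo does not actually wait.

```python
import random
import time


class ThroughputExceeded(Exception):
    """Hypothetical stand-in for the ProvisionedThroughputExceeded error."""


def call_with_backoff(read_call, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Retry read_call on ThroughputExceeded, doubling the base wait each time."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return read_call()
        except ThroughputExceeded:
            if attempt == max_attempts - 1:
                raise  # sustained throttling: give up and surface the error
            delay = base_delay * (2 ** attempt)
            # Jitter spreads out retries from several consumers.
            sleep(delay + random.uniform(0, delay))


attempts = {"count": 0}


def flaky_read():
    """Rejects the first two calls, then succeeds."""
    attempts["count"] += 1
    if attempts["count"] <= 2:
        raise ThroughputExceeded("read limit reached")
    return ["record-1"]


delays = []
result = call_with_backoff(flaky_read, sleep=delays.append)
```

If every attempt is rejected, the error is re-raised. That corresponds to the sustained case, where adding shards is the fix and retrying is not.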
