1. 程式人生 > >Kafka 0.11.0.0 實現 producer的Exactly-once 語義(官方DEMO)

Kafka 0.11.0.0 實現 producer的Exactly-once 語義(官方DEMO)

A Kafka client that publishes records to the Kafka cluster.

The producer is thread safe and sharing a single producer instance across threads will generally be faster than having multiple instances.

Here is a simple example of using the producer to send records with strings containing sequential numbers as the key/value pairs.

 Properties props = new Properties();
 props.put(
"bootstrap.servers", "localhost:9092"); props.put("acks", "all"); props.put("retries", 0); props.put("batch.size", 16384); props.put("linger.ms", 1); props.put("buffer.memory", 33554432); props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer"); props.put("value.serializer
", "org.apache.kafka.common.serialization.StringSerializer"); Producer<String, String> producer = new KafkaProducer<>(props); for (int i = 0; i < 100; i++) producer.send(new ProducerRecord<String, String>("my-topic", Integer.toString(i), Integer.toString(i))); producer.close();

The producer consists of a pool of buffer space that holds records that haven't yet been transmitted to the server as well as a background I/O thread that is responsible for turning these records into requests and transmitting them to the cluster. Failure to close the producer after use will leak these resources.

The send() method is asynchronous. When called it adds the record to a buffer of pending record sends and immediately returns. This allows the producer to batch together individual records for efficiency.

The acks config controls the criteria under which requests are considered complete. The "all" setting we have specified will result in blocking on the full commit of the record, the slowest but most durable setting.

If the request fails, the producer can automatically retry, though since we have specified retries as 0 it won't. Enabling retries also opens up the possibility of duplicates (see the documentation on message delivery semantics for details).

The producer maintains buffers of unsent records for each partition. These buffers are of a size specified by the batch.size config. Making this larger can result in more batching, but requires more memory (since we will generally have one of these buffers for each active partition).

By default a buffer is available to send immediately even if there is additional unused space in the buffer. However if you want to reduce the number of requests you can set linger.ms to something greater than 0. This will instruct the producer to wait up to that number of milliseconds before sending a request in hope that more records will arrive to fill up the same batch. This is analogous to Nagle's algorithm in TCP. For example, in the code snippet above, likely all 100 records would be sent in a single request since we set our linger time to 1 millisecond. However this setting would add 1 millisecond of latency to our request waiting for more records to arrive if we didn't fill up the buffer. Note that records that arrive close together in time will generally batch together even with linger.ms=0 so under heavy load batching will occur regardless of the linger configuration; however setting this to something larger than 0 can lead to fewer, more efficient requests when not under maximal load at the cost of a small amount of latency.

The buffer.memory controls the total amount of memory available to the producer for buffering. If records are sent faster than they can be transmitted to the server then this buffer space will be exhausted. When the buffer space is exhausted additional send calls will block. The threshold for time to block is determined by max.block.ms after which it throws a TimeoutException.

The key.serializer and value.serializer instruct how to turn the key and value objects the user provides with their ProducerRecord into bytes. You can use the included ByteArraySerializer or StringSerializer for simple string or byte types.

From Kafka 0.11, the KafkaProducer supports two additional modes: the idempotent producer and the transactional producer. The idempotent producer strengthens Kafka's delivery semantics from at least once to exactly once delivery. In particular producer retries will no longer introduce duplicates. The transactional producer allows an application to send messages to multiple partitions (and topics!) atomically.

To enable idempotence, the enable.idempotence configuration must be set to true. If set, the retries config will be defaulted to Integer.MAX_VALUE, the max.in.flight.requests.per.connection config will be defaulted to 1, and acks config will be defaulted to all. There are no API changes for the idempotent producer, so existing applications will not need to be modified to take advantage of this feature.

To take advantage of the idempotent producer, it is imperative to avoid application level re-sends since these cannot be de-duplicated. As such, if an application enables idempotence, it is recommended to leave the retries config unset, as it will be defaulted to Integer.MAX_VALUE. Additionally, if a send(ProducerRecord) returns an error even with infinite retries (for instance if the message expires in the buffer before being sent), then it is recommended to shut down the producer and check the contents of the last produced message to ensure that it is not duplicated. Finally, the producer can only guarantee idempotence for messages sent within a single session.

To use the transactional producer and the attendant APIs, you must set the transactional.id configuration property. If the transactional.id is set, idempotence is automatically enabled along with the producer configs which idempotence depends on. Further, topics which are included in transactions should be configured for durability. In particular, the replication.factor should be at least 3, and the min.insync.replicas for these topics should be set to 2. Finally, in order for transactional guarantees to be realized from end-to-end, the consumers must be configured to read only committed messages as well.

The purpose of the transactional.id is to enable transaction recovery across multiple sessions of a single producer instance. It would typically be derived from the shard identifier in a partitioned, stateful, application. As such, it should be unique to each producer instance running within a partitioned application.

All the new transactional APIs are blocking and will throw exceptions on failure. The example below illustrates how the new APIs are meant to be used. It is similar to the example above, except that all 100 messages are part of a single transaction.

 Properties props = new Properties();
 props.put("bootstrap.servers", "localhost:9092");
 props.put("transactional.id", "my-transactional-id");
 Producer<String, String> producer = new KafkaProducer<>(props, new StringSerializer(), new StringSerializer());

 producer.initTransactions();

 try {
     producer.beginTransaction();
     for (int i = 0; i < 100; i++)
         producer.send(new ProducerRecord<>("my-topic", Integer.toString(i), Integer.toString(i)));
     producer.commitTransaction();
 } catch (ProducerFencedException | OutOfOrderSequenceException | AuthorizationException e) {
     // We can't recover from these exceptions, so our only option is to close the producer and exit.
     producer.close();
 } catch (KafkaException e) {
     // For all other exceptions, just abort the transaction and try again.
     producer.abortTransaction();
 }
 producer.close();

As is hinted at in the example, there can be only one open transaction per producer. All messages sent between the beginTransaction() and commitTransaction() calls will be part of a single transaction. When the transactional.id is specified, all messages sent by the producer must be part of a transaction.

The transactional producer uses exceptions to communicate error states. In particular, it is not required to specify callbacks for producer.send() or to call .get() on the returned Future: a KafkaException would be thrown if any of the producer.send() or transactional calls hit an irrecoverable error during a transaction. See the send(ProducerRecord) documentation for more details about detecting errors from a transactional send.

By calling producer.abortTransaction() upon receiving a KafkaException we can ensure that any successful writes are marked as aborted, hence keeping the transactional guarantees.

This client can communicate with brokers that are version 0.10.0 or newer. Older or newer brokers may not support certain client features. For instance, the transactional APIs need broker versions 0.11.0 or later. You will receive an UnsupportedVersionException when invoking an API that is not available in the running broker version.

相關推薦

Kafka 0.11.0.0 實現 producer的Exactly-once 語義官方DEMO

A Kafka client that publishes records to the Kafka cluster. The producer is thread safe and sharing a single producer instance across threads will generall

Kafka 0.11.0.0 是如何實現 Exactly-once 語義

很高興地告訴大家,具備新的里程碑意義的功能的Kafka 0.11.x版本(對應 Confluent Platform 3.3)已經release,該版本引入了exactly-once語義,本文闡述的內容包括:Apache Kafka的exactly-once語義;為什麼exactly-once是一個很難解決的

Kafka 0.11.0.0 實現 producer的Exactly-once 語義中文

很高興地告訴大家,具備新的里程碑意義的功能的Kafka 0.11.x版本(對應 Confluent Platform 3.3)已經release,該版本引入了exactly-once語義,本文闡述的內容包括: Apache Kafka的exactly-once語義; 為什麼exactly-once是一個很

Kafka 0.11.0.0 實現 producer的Exactly-once 語義英文

Exactly-once Semantics are Possible: Here’s How Kafka Does it I’m thrilled that we have hit an exciting milestone the Kafka community has long been waiting

thinkphp 5.0如何實現自定義404異常處理頁面

錯誤頁 自定義異常 異常錯誤 錯誤 load php 錯誤信息 art 正常 404頁面是客戶端在瀏覽網頁時,由於服務器無法正常提供信息,或是服務器無法回應,且不知道原因所返回的頁面。404承載著用戶體驗與SEO優化的重任。404頁面通常為用戶訪問了網站上不存在或已刪除的

易學筆記-0:Java語言總結/0.11 Java中輸出的流表示都是針對位元組陣列byte[ ]操作

Java中輸出的流表示 針對快取的: ByteArrayOutputStream StringBufferOutputStream 針對檔案的:FileOutputStream 針對物件:ObjectOutputStream

CentOS 7.0 使用MariaDB數據庫管理系統唐傑

數據庫 mariadb 管理系統 http://note.youdao.com/noteshare?id=94f07019c1e4af1a5a077591690e5c96本文出自 “新網學會博客” 博客,請務必保留此出處http://xwxhvip.blog.51cto.com/13020757/

MySQL 8.0實驗室---MySQL中的倒敘索引Descending Indexes

mysql 重新 .cn 創建表 https 正序 tro 一個 刪除 譯者註:MySQL 8.0之前,不管是否指定索引建的排序方式,都會忽略創建索引時候指定的排序方式(語法上不會報錯),最終都會創建為ASC方式的索引,在執行查詢的時候,只存在forwarded(正向

Aspnetcore2.0中Entityframeworkcore及Autofac的使用Demo

desc *** 結果 get rtu configure ogg netcore rri 一,通過Entityframeworkcore中DbFirst模式創建模型 這裏只說一下Entityframeworkcore中DbFirst模式創建模型,想了解CodeFirst的

Aspnetcore2.0中Entityframeworkcore及Autofac的使用Demo

-a new 自己 col 用途 spl isp aspnet ide 一,新建Aspnetcore項目 Aspnetcore是微軟家族中年紀較輕的新成員,但他的功能用途是其他前輩們望塵莫及的。想要知道他的功能特性大家可以問度娘也可以去官網查找一些資料。這裏主要給大家說一下

Aspnetcore2.0中Entityframeworkcore及Autofac的使用Demo2018-12-04 10:08

三,使用Autofac替換原有Ioc 首先安裝Autofac兩個外掛類庫: Autofac Autofac.Extensions.DependencyInjection 修改Startup.cs替換框架自帶IOC: // This method gets called by the runti

MySQL 8.0實驗室---MySQL中的倒序索引Descending Indexes

譯者注:MySQL 8.0之前,不管是否指定索引建的排序方式,都會忽略建立索引時候指定的排序方式(語法上不會報錯),最終都會建立為ASC方式的索引,在執行查詢的時候,只存在forwarded(正向)方式對索引進行掃描。關於正向索引和反向索引,邏輯上很容易理解,這裡有兩個相關的概念:正向索引或者反向(倒序)

RDIFramework.NET平臺程式碼生成器V3.0版本全新發布-更新於20160518提供下載

  RDIFramework.NET程式碼生成器V3.0版本修改了針對3.0版本的框架部分做了大量的調整,同時支援生成Web部分的UI程式碼(WebForm,MVC),基礎的工作交給工具,助力企業快速開發,真正提升了開發速度。   RDIFramework.NET框架做為資訊化系統快速開發、整合的框架,

串列埠除錯助手上輸入資料0-9,然後再數碼管顯示組合語言版本

//實驗目的:串列埠除錯助手上輸入資料0-9,然後再數碼管顯示 org 00H ljmp start org 23H //中斷入口地址 ljmp uart_interrupt  org 30H start:               mov P0,#0xff//設定

2017最新在swift3.0下整合iOS內購全流程附程式碼

最新寫的專案需要iOS內購功能所以就整理了這篇記錄,以便自己翻閱或者希望對讀者有所幫助。 因為之前一直沒做過內購這個模組,所以有所不足,請多多指教,謝謝啦~下面進入正題: 然後就沒然後了。。。下面進行詳細步驟,請仔細看圖片註釋: 1. 第一步

Python 0基礎開發遊戲:打地鼠詳細教程VS code版本

如果你沒有任何程式設計經驗,而且想嘗試一下學習程式設計開發,這個系列教程一定適合你,它將帶你學習最基本的Python語法,並讓你掌握小遊戲的開發技巧。你所需要的,就是付出一些時間和耐心來嘗試這些程式碼和操作。 @[top] 一、準備工作 1 下載安裝 python 2 下載安裝VS code編輯器 安裝時

Spark Streaming 中如何實現 Exactly-Once 語義

Exactly-once 語義是實時計算的難點之一。要做到每一條記錄只會被處理一次,即使伺服器或網路發生故障時也能保證沒有遺漏,這不僅需要實時計算框架本身的支援,還對上游的訊息系統、下游的資料儲存有所要求。此外,我們在編寫計算流程時也需要遵循一定規範,才能真正實

ztree實現權限功能橫向顯示

lose false 標記 console 多人 性能優化 發現 測試 func 最近在做權限功能的時候,采用的ztree實現的,但是產品要求最後一層的權限節點要橫向顯示。開始在網上找的解決方案是用css樣式把最後一層的display設置為inline。在我本地電腦上看了下

純SQL實現小算法輔助決策_ 計算商品評分、及時補貨

mysql分別把 計算各自的 1、點擊量/點擊量均值 2、銷售量/銷售量均值 兩者相加,可以得到一個簡單評分 又有問題了,豬肉的評分不應該比五花肉多。 因此我們要加入簡單的權重,譬如點擊量評分占30%。銷售量評分占70%select p_type,p_name, (p_view/view_avg)

phpqrcode實現二維碼含圖片

level con 二維碼 輸出 code evel eba hello include ---恢復內容開始--- 1,http://phpqrcode.sourceforge.net/ 下載 2,解壓以後只需要一個文件  3,原生php測試:    <?ph