
Spark Notes (19) SparkStreaming: Getting offsets from the Kafka broker and from ZooKeeper, and managing offsets with ZooKeeper


1. Getting offsets from the Kafka broker

import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Set;

import com.google.common.collect.ImmutableMap;

import kafka.api.PartitionOffsetRequestInfo;
import kafka.cluster.Broker;
import kafka.common.TopicAndPartition;
import kafka.javaapi.OffsetRequest;
import kafka.javaapi.OffsetResponse;
import kafka.javaapi.PartitionMetadata;
import kafka.javaapi.TopicMetadata;
import kafka.javaapi.TopicMetadataRequest;
import kafka.javaapi.TopicMetadataResponse;
import kafka.javaapi.consumer.SimpleConsumer;

/**
 * Kafka must be started before running this test.
 * @author root
 */
public class GetTopicOffsetFromKafkaBroker {
    public static void main(String[] args) {
        
        Map<TopicAndPartition, Long> topicOffsets = getTopicOffsets("node1:9092,node2:9092,node3:9092", "mytopic");
        Set<Entry<TopicAndPartition, Long>> entrySet = topicOffsets.entrySet();
        for (Entry<TopicAndPartition, Long> entry : entrySet) {
            TopicAndPartition topicAndPartition = entry.getKey();
            Long offset = entry.getValue();
            String topic = topicAndPartition.topic();
            int partition = topicAndPartition.partition();
            System.out.println("topic = " + topic + ",partition = " + partition + ",offset = " + offset);
        }
    }

    /**
     * Ask the Kafka cluster for the latest offset at which producers have
     * written in every partition of the given topic.
     * @param KafkaBrokerServer comma-separated broker list, e.g. "node1:9092,node2:9092"
     * @param topic the topic name
     * @return latest offset per TopicAndPartition
     */
    public static Map<TopicAndPartition, Long> getTopicOffsets(String KafkaBrokerServer, String topic) {
        Map<TopicAndPartition, Long> retVals = new HashMap<TopicAndPartition, Long>();
        for (String broker : KafkaBrokerServer.split(",")) {
            // One SimpleConsumer per broker; only partitions led by that broker answer.
            SimpleConsumer simpleConsumer = new SimpleConsumer(broker.split(":")[0],
                    Integer.valueOf(broker.split(":")[1]), 64 * 10000, 1024, "consumer");
            TopicMetadataRequest topicMetadataRequest = new TopicMetadataRequest(Arrays.asList(topic));
            TopicMetadataResponse topicMetadataResponse = simpleConsumer.send(topicMetadataRequest);
            for (TopicMetadata metadata : topicMetadataResponse.topicsMetadata()) {
                for (PartitionMetadata part : metadata.partitionsMetadata()) {
                    Broker leader = part.leader();
                    if (leader != null) {
                        TopicAndPartition topicAndPartition = new TopicAndPartition(topic, part.partitionId());
                        PartitionOffsetRequestInfo partitionOffsetRequestInfo =
                                new PartitionOffsetRequestInfo(kafka.api.OffsetRequest.LatestTime(), 10000);
                        OffsetRequest offsetRequest = new OffsetRequest(
                                ImmutableMap.of(topicAndPartition, partitionOffsetRequestInfo),
                                kafka.api.OffsetRequest.CurrentVersion(), simpleConsumer.clientId());
                        OffsetResponse offsetResponse = simpleConsumer.getOffsetsBefore(offsetRequest);
                        if (!offsetResponse.hasError()) {
                            long[] offsets = offsetResponse.offsets(topic, part.partitionId());
                            retVals.put(topicAndPartition, offsets[0]);
                        }
                    }
                }
            }
            simpleConsumer.close();
        }
        return retVals;
    }
}
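The SimpleConsumer API used above was removed in later Kafka releases. For comparison, the same "latest offset per partition" lookup can be written against the newer consumer API; this is a minimal sketch, assuming a kafka-clients 0.10.1+ dependency (the class name GetTopicOffsetsNewApi and the group.id value are ours, not from the original post):

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.PartitionInfo;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class GetTopicOffsetsNewApi {
    // Hypothetical alternative to getTopicOffsets() built on the new consumer API.
    public static Map<TopicPartition, Long> getTopicOffsets(String brokers, String topic) {
        Properties props = new Properties();
        props.put("bootstrap.servers", brokers);
        props.put("group.id", "offset-probe");
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        try {
            // Collect every partition of the topic, then ask for its end offset.
            List<TopicPartition> partitions = new ArrayList<>();
            for (PartitionInfo info : consumer.partitionsFor(topic)) {
                partitions.add(new TopicPartition(topic, info.partition()));
            }
            return consumer.endOffsets(partitions);
        } finally {
            consumer.close();
        }
    }
}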

2. Getting offsets from ZooKeeper

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Set;

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.RetryUntilElapsed;

import com.fasterxml.jackson.databind.ObjectMapper;

import kafka.common.TopicAndPartition;

public class GetTopicOffsetFromZookeeper {
    
    public static Map<TopicAndPartition,Long> getConsumerOffsets(String zkServers,String groupID, String topic) { 
        Map<TopicAndPartition,Long> retVals = new HashMap<TopicAndPartition,Long>();
        
        ObjectMapper objectMapper = new ObjectMapper();
        CuratorFramework  curatorFramework = CuratorFrameworkFactory.builder()
                .connectString(zkServers).connectionTimeoutMs(1000)
                .sessionTimeoutMs(10000).retryPolicy(new RetryUntilElapsed(1000, 1000)).build();
        
        curatorFramework.start();
        
        try{
            String nodePath = "/consumers/"+groupID+"/offsets/" + topic;
            if(curatorFramework.checkExists().forPath(nodePath)!=null){
                List<String> partitions=curatorFramework.getChildren().forPath(nodePath);
                for(String partition : partitions){
                    int partitionId = Integer.valueOf(partition);
                    Long offset = objectMapper.readValue(curatorFramework.getData().forPath(nodePath+"/"+partition), Long.class);
                    TopicAndPartition topicAndPartition = new TopicAndPartition(topic, partitionId);
                    retVals.put(topicAndPartition, offset);
                }
            }
        }catch(Exception e){
            e.printStackTrace();
        }finally{
            // Close the Curator client even if the read failed.
            curatorFramework.close();
        }
        
        return retVals;
    } 
    
    
    public static void main(String[] args) {
        Map<TopicAndPartition, Long> consumerOffsets = getConsumerOffsets("node3:2181,node4:2181,node5:2181","zhy","mytopic");
        Set<Entry<TopicAndPartition, Long>> entrySet = consumerOffsets.entrySet();
        for(Entry<TopicAndPartition, Long> entry : entrySet) {
            TopicAndPartition topicAndPartition = entry.getKey();
            String topic = topicAndPartition.topic();
            int partition = topicAndPartition.partition();
            Long offset = entry.getValue();
            System.out.println("topic = "+topic+",partition = "+partition+",offset = "+offset);
        }
    }
}
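getConsumerOffsets() only reads offsets; for ZooKeeper to actually manage them, something must also write each batch's ending offsets back to the same /consumers/<group>/offsets/<topic>/<partition> nodes. Below is a minimal sketch of that write path with Curator, assuming the same node layout as above (the class and method names SaveTopicOffsetToZookeeper / saveConsumerOffset are ours):

import org.apache.curator.framework.CuratorFramework;

public class SaveTopicOffsetToZookeeper {
    /**
     * Hypothetical helper: persist one partition's offset under the classic
     * high-level-consumer path layout read by getConsumerOffsets() above.
     */
    public static void saveConsumerOffset(CuratorFramework curatorFramework, String groupID,
            String topic, int partition, long offset) throws Exception {
        String nodePath = "/consumers/" + groupID + "/offsets/" + topic + "/" + partition;
        byte[] data = String.valueOf(offset).getBytes();
        if (curatorFramework.checkExists().forPath(nodePath) == null) {
            // First write: create the znode and any missing parent nodes.
            curatorFramework.create().creatingParentsIfNeeded().forPath(nodePath, data);
        } else {
            curatorFramework.setData().forPath(nodePath, data);
        }
    }
}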

3. Managing offsets with ZooKeeper

import java.util.Map;

import org.apache.log4j.Logger;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

import kafka.common.TopicAndPartition;

public class UseZookeeperManageOffset {
    /**
     * Log through log4j; UseZookeeperManageOffset.class identifies the class producing the log records.
     */
    static final Logger logger = Logger.getLogger(UseZookeeperManageOffset.class);
    
    
    public static void main(String[] args) {
        /**
         * Load the log4j configuration file so the logger is ready.
         */
        ProjectUtil.LoadLogConfig();
        logger.info("project is starting...");
        
        /**
         * Ask the Kafka cluster for the latest produced offset of each partition of the topic.
         */
        Map<TopicAndPartition, Long> topicOffsets = GetTopicOffsetFromKafkaBroker.getTopicOffsets("node1:9092,node2:9092,node3:9092", "mytopic");
        
        /**
         * Read from ZooKeeper the offset this consumer group has reached in each partition of the topic.
         */
        Map<TopicAndPartition, Long> consumerOffsets = 
                GetTopicOffsetFromZookeeper.getConsumerOffsets("node3:2181,node4:2181,node5:2181","zhy","mytopic");
        
        /**
         * Merge the two offset maps. The rule:
         *     If ZooKeeper holds offsets for this consumer group, those take precedence.
         *     Otherwise, no offset exists in ZooKeeper for this group and topic, meaning the
         *     group is consuming the topic for the first time, so start from the latest
         *     offsets (the maximum positions read from the brokers).
         */
        if(null!=consumerOffsets && consumerOffsets.size()>0){
            topicOffsets.putAll(consumerOffsets);
        }
        /**
         * Uncomment the loop below to reset every partition's offset in topicOffsets to 0,
         * i.e. to consume the topic from the beginning.
         */
//        for(Map.Entry<TopicAndPartition, Long> item : topicOffsets.entrySet()){
//            item.setValue(0L);
//        }
        
        /**
         * Build the SparkStreaming job and consume starting from the merged offsets.
         */
        JavaStreamingContext jsc = SparkStreamingDirect.getStreamingContext(topicOffsets,"zhy");
        jsc.start();
        jsc.awaitTermination();
        jsc.close();
        
    }
}
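SparkStreamingDirect.getStreamingContext() is not shown in this post. With the spark-streaming-kafka 0.8 direct API it typically wraps KafkaUtils.createDirectStream with the merged fromOffsets map. A minimal sketch under that assumption; the broker list, batch interval, master setting, and the print() action are illustrative, and a real implementation would also write each batch's offsets back to ZooKeeper:

import java.util.HashMap;
import java.util.Map;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;

import kafka.common.TopicAndPartition;
import kafka.message.MessageAndMetadata;
import kafka.serializer.StringDecoder;

public class SparkStreamingDirectSketch {
    public static JavaStreamingContext getStreamingContext(
            Map<TopicAndPartition, Long> fromOffsets, String groupID) {
        SparkConf conf = new SparkConf().setAppName("UseZookeeperManageOffset").setMaster("local[2]");
        JavaStreamingContext jsc = new JavaStreamingContext(conf, Durations.seconds(5));

        Map<String, String> kafkaParams = new HashMap<>();
        kafkaParams.put("metadata.broker.list", "node1:9092,node2:9092,node3:9092");
        kafkaParams.put("group.id", groupID);

        // Start the direct stream exactly at the merged offsets; the message
        // handler extracts just the message payload from each record.
        JavaInputDStream<String> lines = KafkaUtils.createDirectStream(
                jsc, String.class, String.class, StringDecoder.class, StringDecoder.class,
                String.class, kafkaParams, fromOffsets,
                new Function<MessageAndMetadata<String, String>, String>() {
                    @Override
                    public String call(MessageAndMetadata<String, String> md) throws Exception {
                        return md.message();
                    }
                });

        lines.print();
        // A real job would save each batch's ending offsets back to ZooKeeper
        // here (see saveConsumerOffset above) before returning the context.
        return jsc;
    }
}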
