kafka原始碼解析之十二KafkaController(上篇)

kafka原始碼解析之十二KafkaController(上篇)

class KafkaController(val config : KafkaConfig, zkClient: ZkClient, val brokerState: BrokerState) extends Logging with KafkaMetricsGroup {
……
// Leader elector backed by the ZK path ZkUtils.ControllerPath (/controller):
// onControllerFailover runs when this broker wins the election,
// onControllerResignation runs when it gives up leadership.
private val controllerElector = new ZookeeperLeaderElector(controllerContext, ZkUtils.ControllerPath, onControllerFailover,
  onControllerResignation, config.brokerId)

/**
 * Invoked when the controller module of a Kafka server is started up. This does not assume that the current broker
 * is the controller. It merely registers the session expiration listener and starts the controller leader
 * elector
 */
def startup() = {
  inLock(controllerContext.controllerLock) {
    info("Controller starting up");
    registerSessionExpirationListener() // register a listener for ZK session expiration
    isRunning = true
    controllerElector.startup // start the elector, which triggers the first election
    info("Controller startup complete")
  }
}
}
其zk選舉的路徑為/controller/*,並且對zk叢集建立一個會話超時的listener
class SessionExpirationListener() extends IZkStateListener with Logging {
  this.logIdent = "[SessionExpirationListener on " + config.brokerId + "], "
  @throws(classOf[Exception])
  def handleStateChanged(state: KeeperState) {
    // do nothing, since zkclient will do reconnect for us.
  }
  /**
   * Called after the zookeeper session has expired and a new session has been created. You would have to re-create
   * any ephemeral nodes here.
   *
   * @throws Exception
   *             On any error.
   */
  @throws(classOf[Exception])
  def handleNewSession() {
    info("ZK expired; shut down all controller components and try to re-elect")
    inLock(controllerContext.controllerLock) {
      // The old session's ephemeral /controller node is gone with the session.
      // First run the resignation callback that was registered with
      // ZookeeperLeaderElector to tear down controller state, then compete
      // in a fresh election.
      onControllerResignation()
      controllerElector.elect
    }
  }
}
因此重點關注ZookeeperLeaderElector內部的邏輯:
/**
 * Leader elector based on an ephemeral ZooKeeper node at `electionPath`
 * (for the controller this is /controller). Among concurrent contenders,
 * exactly one broker succeeds in creating the node and becomes leader; when
 * its session ends the node disappears and the others re-elect via
 * LeaderChangeListener.
 *
 * Note: session expiration is NOT handled here; the caller is expected to
 * install its own session expiration listener (see KafkaController).
 */
class ZookeeperLeaderElector(controllerContext: ControllerContext,
                             electionPath: String,
                             onBecomingLeader: () => Unit,
                             onResigningAsLeader: () => Unit,
                             brokerId: Int)
  extends LeaderElector with Logging {
  // Cached id of the current leader; -1 means "no leader known".
  var leaderId = -1
  // create the election path in ZK, if one does not exist
  val index = electionPath.lastIndexOf("/")
  if (index > 0)
    makeSurePersistentPathExists(controllerContext.zkClient, electionPath.substring(0, index))
  val leaderChangeListener = new LeaderChangeListener

  /** Subscribes to data changes on the election path and runs the first election. */
  def startup {
    inLock(controllerContext.controllerLock) {
      controllerContext.zkClient.subscribeDataChanges(electionPath, leaderChangeListener)
      elect
    }
  }

  /** Reads the current leader id from ZK, or -1 if the election node does not exist. */
  private def getControllerID(): Int = {
    readDataMaybeNull(controllerContext.zkClient, electionPath)._1 match {
       case Some(controller) => KafkaController.parseControllerId(controller)
       case None => -1
    }
  }

  /**
   * Attempts to become leader by creating the ephemeral election node.
   * @return true if this broker is the leader after the attempt
   */
  def elect: Boolean = {
    val timestamp = SystemTime.milliseconds.toString
    val electString = Json.encode(Map("version" -> 1, "brokerid" -> brokerId, "timestamp" -> timestamp))

    leaderId = getControllerID
    /*
     * We can get here during the initial startup and the handleDeleted ZK callback. Because of the potential race condition,
     * it's possible that the controller has already been elected when we get here. This check will prevent the following
     * createEphemeralPath method from getting into an infinite loop if this broker is already the controller.
     */
    if(leaderId != -1) {
       debug("Broker %d has been elected as leader, so stopping the election process.".format(leaderId))
       return amILeader
    }

    try {
      // Creating the ephemeral node IS the election: of all concurrent creators
      // exactly one succeeds, and the node vanishes with its session so the
      // remaining brokers can compete again later.
      createEphemeralPathExpectConflictHandleZKBug(controllerContext.zkClient, electionPath, electString, brokerId,
        (controllerString : String, leaderId : Any) => KafkaController.parseControllerId(controllerString) == leaderId.asInstanceOf[Int],
        controllerContext.zkSessionTimeout)
      info(brokerId + " successfully elected as leader")
      leaderId = brokerId
      onBecomingLeader()
    } catch {
      case e: ZkNodeExistsException =>
        // If someone else has written the path, record who won.
        leaderId = getControllerID

        if (leaderId != -1)
          debug("Broker %d was elected as leader instead of broker %d".format(leaderId, brokerId))
        else
          warn("A leader has been elected but just resigned, this will result in another round of election")

      case e2: Throwable =>
        error("Error while electing or becoming leader on broker %d".format(brokerId), e2)
        // Clear cached state and delete the (possibly half-initialized) election
        // node so another broker can take over.
        resign()
    }
    amILeader
  }

  def close = {
    leaderId = -1
  }

  def amILeader : Boolean = leaderId == brokerId

  /** Gives up leadership: forget the cached leader and delete the election node. */
  def resign() = {
    leaderId = -1
    deletePath(controllerContext.zkClient, electionPath)
  }

  /**
   * We do not have session expiration listen in the ZkElection, but assuming the caller who uses this module will
   * have its own session expiration listener and handler
   */
  class LeaderChangeListener extends IZkDataListener with Logging {
    /**
     * Called when the leader information stored in zookeeper has changed. Record the new leader in memory
     * @throws Exception On any error.
     */
    @throws(classOf[Exception])
    def handleDataChange(dataPath: String, data: Object) {
      inLock(controllerContext.controllerLock) {
        val amILeaderBeforeDataChange = amILeader
        leaderId = KafkaController.parseControllerId(data.toString)
        info("New leader is %d".format(leaderId))
        // FIX (cf. KAFKA-1451): if this broker was the leader but the node now
        // names a different broker, it must resign so its stale controller-side
        // state is torn down; the original code only updated leaderId.
        if (amILeaderBeforeDataChange && !amILeader)
          onResigningAsLeader()
      }
    }

    /**
     * Called when the leader information stored in zookeeper has been delete. Try to elect as the leader
     * @throws Exception
     *             On any error.
     */
    @throws(classOf[Exception])
    def handleDataDeleted(dataPath: String) {
      // A broker that lost an earlier election lands here when the current
      // leader's ephemeral node disappears, and re-runs the election.
      inLock(controllerContext.controllerLock) {
        debug("%s leader change listener fired for path %s to handle data deleted: trying to elect as a leader"
          .format(brokerId, dataPath))
        if(amILeader)
          // This broker believed it was leader; clear all leader-side state
          // before competing again, since it may not win this round.
          onResigningAsLeader()
        elect
      }
    }
  }
}
因此KafkaController成為leader分2種情況:
1. 第一次啟動的時候會主動觸發elect,如果被選舉成為leader,則做leader該做的事情
2. 第一次啟動的時候選舉失敗,則通過LeaderChangeListener監控/controller/*路徑,發現下面資料被刪除的時候,觸發handleDataDeleted,從而再次觸發選舉

12.2 KafkaController的初始化(leader)

從上節可以看到,KafkaController選舉成功則呼叫onBecomingLeader,當之前的leader再次觸發選舉的時候呼叫onResigningAsLeader,以上2個函式分別對應:onControllerFailover和onControllerResignation。
onControllerResignation很簡單,就是把裡面所有的模組shutdown或者登出掉:
/**
 * Invoked when this broker ceases to be the controller (e.g. after a ZK
 * session expiration triggers re-election). Shuts down or de-registers every
 * controller-side module and resets the cached controller context, leaving the
 * broker in a clean state from which it can be elected controller again.
 * NOTE(review): the teardown order here mirrors the startup order in
 * onControllerFailover in reverse; do not reorder.
 */
def onControllerResignation() {
  // de-register listeners
  deregisterReassignedPartitionsListener()
  deregisterPreferredReplicaElectionListener()
  // shutdown delete topic manager
  if (deleteTopicManager != null)
    deleteTopicManager.shutdown()
  // shutdown leader rebalance scheduler
  if (config.autoLeaderRebalanceEnable)
    autoRebalanceScheduler.shutdown()
  inLock(controllerContext.controllerLock) {
    // de-register partition ISR listener for on-going partition reassignment task
    deregisterReassignedPartitionsIsrChangeListeners()
    // shutdown partition state machine
    partitionStateMachine.shutdown()
    // shutdown replica state machine
    replicaStateMachine.shutdown()
    // shutdown controller channel manager
    if(controllerContext.controllerChannelManager != null) {
      controllerContext.controllerChannelManager.shutdown()
      controllerContext.controllerChannelManager = null
    }
    // reset controller context
    controllerContext.epoch=0
    controllerContext.epochZkVersion=0
    // this broker keeps running as an ordinary (non-controller) broker
    brokerState.newState(RunningAsBroker)
  }
}
以上各種模組會在onControllerFailover介紹,onControllerFailover本質上就是開啟裡面所有的功能。
onControllerFailover的邏輯如下:
 /**
  * Invoked when this broker wins the controller election. Bumps the controller
  * epoch, registers the controller's ZK listeners, initializes the controller
  * context and the partition/replica state machines, and resumes any pending
  * partition reassignment or preferred-replica election work.
  * Does nothing but log if the controller module has already been shut down.
  */
 def onControllerFailover() {
    if(isRunning) {
      info("Broker %d starting become controller state transition".format(config.brokerId))
      readControllerEpochFromZookeeper()
      // Election epoch: incremented by 1 on every successful election.
      incrementControllerEpoch(zkClient)
      // Leader-side initialization; each step is described in the commentary below.
      registerReassignedPartitionsListener()
      registerPreferredReplicaElectionListener()
      partitionStateMachine.registerListeners()
      replicaStateMachine.registerListeners()
      initializeControllerContext()
      replicaStateMachine.startup()
      partitionStateMachine.startup()
      // watch every known topic for partition changes
      controllerContext.allTopics.foreach(topic => partitionStateMachine.registerPartitionChangeListener(topic))
      info("Broker %d is ready to serve as the new controller with epoch %d".format(config.brokerId, epoch))
      brokerState.newState(RunningAsController)
      maybeTriggerPartitionReassignment()
      maybeTriggerPreferredReplicaElection()
      // push current cluster metadata to all live (or shutting-down) brokers
      sendUpdateMetadataRequest(controllerContext.liveOrShuttingDownBrokerIds.toSeq)
      if (config.autoLeaderRebalanceEnable) {
        info("starting the partition rebalance scheduler")
        autoRebalanceScheduler.startup()
        autoRebalanceScheduler.schedule("partition-rebalance-thread", checkAndTriggerPartitionRebalance,
          5, config.leaderImbalanceCheckIntervalSeconds, TimeUnit.SECONDS)
      }
      deleteTopicManager.start()
    }
    else
      info("Controller has been shut down, aborting startup/failover")
  }
其中步驟如下:
1) 在/admin/reassign_partitions目錄註冊partitionReassignedListener監聽函式
2) 在/admin/preferred_replica_election目錄註冊preferredReplicaElectionListener監聽函式
3) 在/brokers/topics目錄註冊topicChangeListener監聽函式
4) 在/admin/delete_topics目錄註冊deleteTopicsListener監聽函式
5) 在/brokers/ids目錄註冊brokerChangeListener監聽函式
6) 初始化ControllerContext上下文,裡面包含了topic的各種元資料資訊,除此之外ControllerContext內部的ControllerChannelManager負責和kafka叢集內部的其它KafkaServer建立channel來進行通訊,TopicDeletionManager
負責刪除topic
7)通過replicaStateMachine初始化所有的replica狀態
8)通過partitionStateMachine初始化所有的partition狀態
9) 在/brokers/topics/***(具體的topic名字)/目錄下註冊AddPartitionsListener函式
10) 處理之前啟動留下的partition重分配的情況
11) 處理之前啟動留下的replica重新選舉的情況
12) 向其它KafkaServer傳送叢集topic的元資料資訊,以進行資料的同步更新
13)根據配置是否開啟自動均衡
14)開始刪除topic
KafkaController主要通過以上各種監聽函式來完成kafka叢集元資料的管理,接下來先詳細描述PartitionStateMachine和ReplicaStateMachine原理,因為kafka topic 的partition狀態和內容主要是通過以上2個管理類來實現的,然後按照上面的流程描述不同的listener的作用。