
OpenStack Source Code Analysis: The Nova-Compute Service Startup Process (icehouse)

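I have been learning OpenStack for a little over half a year, but until now I had only used it and done troubleshooting. Recently I finally had time to study the code properly. My background is C++ on Windows, so I am not very familiar with Python development on Linux. This article is for readers who have already used OpenStack and want a first look at how the code works. If anything here is wrong, corrections are welcome.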

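This article analyzes how the Nova-Compute service starts. Other services such as Nova-Scheduler and Nova-Conductor start in much the same way. The code is from the icehouse version of nova on GitHub.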

After installing the Nova-Compute component of OpenStack, we usually start the service with either of the following two commands:

service nova-compute start
start nova-compute

Both commands actually invoke the upstart-job script to start the nova-compute service.

The job is defined in /etc/init/nova-compute.conf, which contains the following startup script:

chdir /var/run

pre-start script
        mkdir -p /var/run/nova
        chown root:root /var/run/nova/

        mkdir -p /var/lock/nova
        chown root:root /var/lock/nova/

        modprobe nbd
end script

exec start-stop-daemon --start --chuid root --exec /usr/local/bin/nova-compute -- --config-file=/etc/nova/nova.conf --config-file=/etc/nova/nova-compute.conf


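The first few lines can be ignored: they create the run and lock directories, set their ownership, and load the nbd kernel module. The final exec start-stop-daemon line is what actually launches /usr/local/bin/nova-compute. The --config-file arguments after it should look familiar: they are the two configuration files of the nova-compute service, and the code uses them later to initialize its configuration.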

Next, look at the /usr/local/bin/nova-compute file:

import sys

from nova.cmd.compute import main


if __name__ == "__main__":
    sys.exit(main())

This is a bit like the way a C++ program starts from its main function: the script calls the main function of nova.cmd.compute:
def main():
    config.parse_args(sys.argv)
    logging.setup('nova')
    utils.monkey_patch()
    objects.register_all()

    gmr.TextGuruMeditation.setup_autorun(version)

    if not CONF.conductor.use_local:
        block_db_access()
        objects_base.NovaObject.indirection_api = \
            conductor_rpcapi.ConductorAPI()

    server = service.Service.create(binary='nova-compute',
                                    topic=CONF.compute_topic,
                                    db_allowed=CONF.conductor.use_local)
    service.serve(server)
    service.wait()

The key parts of this code are analyzed below:
config.parse_args(sys.argv)
def parse_args(argv, default_config_files=None):
    options.set_defaults(sql_connection=_DEFAULT_SQL_CONNECTION,
                         sqlite_db='nova.sqlite')
    rpc.set_defaults(control_exchange='nova')
    nova_default_log_levels = (log.DEFAULT_LOG_LEVELS +
            ["keystonemiddleware=WARN", "routes.middleware=WARN"])
    log.set_defaults(default_log_levels=nova_default_log_levels)
    debugger.register_cli_opts()
    cfg.CONF(argv[1:],
             project='nova',
             version=version.version_string(),
             default_config_files=default_config_files)
    rpc.init(cfg.CONF)
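parse_args uses sys.argv to read the configuration files specified in /etc/init/nova-compute.conf. It first sets the default database connection, then the default exchange for the message queue (MQ), and then the default log levels.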
cfg.CONF(argv[1:],
             project='nova',
             version=version.version_string(),
             default_config_files=default_config_files)
This uses the config package from Oslo to initialize the global CONF object and load the configuration.
rpc.init(cfg.CONF)
This initializes the RPC connection to the MQ according to the configuration files.
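
To make the role of argv[1:] concrete, here is a minimal sketch that uses only the Python standard library. It is not how oslo.config is implemented: argparse and configparser stand in for it, and the function names, the temporary files, and the sample option values are all made up for illustration. It assumes that values in later configuration files override those in earlier ones.

```python
import argparse
import configparser
import os
import tempfile


def parse_args(argv):
    """Collect every --config-file option and merge the files in order."""
    parser = argparse.ArgumentParser(prog="nova-compute-sketch")
    parser.add_argument("--config-file", action="append", default=[],
                        dest="config_files")
    # Skip the program name, just as cfg.CONF(argv[1:], ...) does.
    args = parser.parse_args(argv[1:])

    conf = configparser.ConfigParser()
    # configparser reads the files in order; later files override earlier ones.
    conf.read(args.config_files)
    return conf


def demo():
    """Write two throwaway config files and parse them like the upstart job."""
    with tempfile.TemporaryDirectory() as tmp:
        base = os.path.join(tmp, "nova.conf")
        extra = os.path.join(tmp, "nova-compute.conf")
        with open(base, "w") as f:
            f.write("[DEFAULT]\nhost = node-1\ncompute_driver = fake\n")
        with open(extra, "w") as f:
            f.write("[DEFAULT]\ncompute_driver = libvirt\n")
        argv = ["nova-compute",
                "--config-file=" + base,
                "--config-file=" + extra]
        conf = parse_args(argv)
        return conf["DEFAULT"]["host"], conf["DEFAULT"]["compute_driver"]
```

Calling demo() shows host coming from the first file and compute_driver from the second, which is why the upstart job can pass a general nova.conf followed by a compute-specific file.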

Back in the main function, the lines that matter are:

server = service.Service.create(binary='nova-compute',
                                    topic=CONF.compute_topic,
                                    db_allowed=CONF.conductor.use_local)
service.serve(server)
service.wait()
First, the create class method builds a Service object:
    @classmethod
    def create(cls, host=None, binary=None, topic=None, manager=None,
               report_interval=None, periodic_enable=None,
               periodic_fuzzy_delay=None, periodic_interval_max=None,
               db_allowed=True):
        """Instantiates class and passes back application object.

        :param host: defaults to CONF.host
        :param binary: defaults to basename of executable
        :param topic: defaults to bin_name - 'nova-' part
        :param manager: defaults to CONF.<topic>_manager
        :param report_interval: defaults to CONF.report_interval
        :param periodic_enable: defaults to CONF.periodic_enable
        :param periodic_fuzzy_delay: defaults to CONF.periodic_fuzzy_delay
        :param periodic_interval_max: if set, the max time to wait between runs

        """
        if not host:
            host = CONF.host
        if not binary:
            binary = os.path.basename(sys.argv[0])
        if not topic:
            topic = binary.rpartition('nova-')[2]
        if not manager:
            manager_cls = ('%s_manager' %
                           binary.rpartition('nova-')[2])
            manager = CONF.get(manager_cls, None)
        if report_interval is None:
            report_interval = CONF.report_interval
        if periodic_enable is None:
            periodic_enable = CONF.periodic_enable
        if periodic_fuzzy_delay is None:
            periodic_fuzzy_delay = CONF.periodic_fuzzy_delay

        debugger.init()

        service_obj = cls(host, binary, topic, manager,
                          report_interval=report_interval,
                          periodic_enable=periodic_enable,
                          periodic_fuzzy_delay=periodic_fuzzy_delay,
                          periodic_interval_max=periodic_interval_max,
                          db_allowed=db_allowed)

        return service_obj

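host is the hostname of the current node and is used for record keeping. topic is used for the MQ connection. manager is the Manager class named in the configuration file, which handles the messages consumed from the MQ. For the compute service it is: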
compute_manager=nova.compute.manager.ComputeManager
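
The fallbacks in create can be tried in isolation. The sketch below is only an approximation under stated assumptions: derive_defaults and import_class are hypothetical helper names, a plain dict stands in for CONF, and the dotted-path loader is a standard-library stand-in for the import helper Nova uses inside the Service constructor, which this excerpt does not show.

```python
import importlib


def derive_defaults(binary, conf):
    """Mirror the topic/manager fallbacks shown in Service.create."""
    topic = binary.rpartition("nova-")[2]   # 'nova-compute' -> 'compute'
    manager_opt = "%s_manager" % topic      # -> 'compute_manager'
    manager_path = conf.get(manager_opt)    # dotted class path, or None
    return topic, manager_opt, manager_path


def import_class(dotted_path):
    """Load a class from a 'package.module.ClassName' string."""
    module_name, _, class_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# A dict standing in for CONF (assumption: the real object is oslo.config's).
fake_conf = {"compute_manager": "nova.compute.manager.ComputeManager"}
topic, manager_opt, manager_path = derive_defaults("nova-compute", fake_conf)

# Nova itself is not importable here, so a stdlib class demonstrates loading.
loaded = import_class("collections.OrderedDict")
```

The point of the indirection is that the service binary name alone is enough to find both the MQ topic and the class that will handle its messages.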

Finally, the Service object is instantiated and returned. The serve and wait functions that main calls next are:
def serve(server, workers=None):
    global _launcher
    if _launcher:
        raise RuntimeError(_('serve() can only be called once'))


    _launcher = service.launch(server, workers=workers)

def wait():
    _launcher.wait()
serve takes the Service object created above and calls the launch function in service.py under nova.openstack.common. (Note that the Service class in nova/service.py is a subclass of the Service class in nova/openstack/common/service.py.)
def launch(service, workers=1):
    if workers is None or workers == 1:
        launcher = ServiceLauncher()
        launcher.launch_service(service)
    else:
        launcher = ProcessLauncher()
        launcher.launch_service(service, workers=workers)

    return launcher

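Because workers is left at its default value, a ServiceLauncher object is created and its launch_service method is called. The method is defined in Launcher, the parent class of ServiceLauncher: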
    def launch_service(self, service):
        """Load and start the given service.

        :param service: The service you would like to start.
        :returns: None

        """
        service.backdoor_port = self.backdoor_port
        self.services.add(service)

backdoor_port is an option in nova.conf. Its description can be found in the comments of the configuration file, so it is skipped here.

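self.services is an instance of the Services class, which is essentially a list of Service objects: add() appends the service to the list and schedules it to start. Services also uses the Event class from eventlet. This is an ordinary event that is created when the Services object is initialized and is later used to tell the services to shut down (through event.wait() and event.send(); the code appears below).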

class Services(object):

    def __init__(self):
        self.services = []
        self.tg = threadgroup.ThreadGroup()
        self.done = event.Event()

    def add(self, service):
        self.services.append(service)
        self.tg.add_thread(self.run_service, service, self.done)

The ThreadGroup class is defined in nova/openstack/common/threadgroup.py:
class ThreadGroup(object):
    """The point of the ThreadGroup class is to:

    * keep track of timers and greenthreads (making it easier to stop them
      when need be).
    * provide an easy API to add timers.
    """
    def __init__(self, thread_pool_size=10):
        self.pool = greenpool.GreenPool(thread_pool_size)
        self.threads = []
        self.timers = []

    def add_thread(self, callback, *args, **kwargs):
        gt = self.pool.spawn(callback, *args, **kwargs)
        th = Thread(gt, self)
        self.threads.append(th)
        return th
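A GreenPool is used here to create green threads. Green threads are invisible to the operating system: the green-thread library schedules them itself in user space. A pool of size 10 therefore does not mean 10 native threads. With eventlet, up to 10 green threads can be in flight at once, and they are multiplexed onto the native thread that spawned them. Switching between green threads needs no help from the operating system (to the OS it is still the same thread), which avoids the cost of a kernel-level thread switch (user mode to kernel mode). Because a switch can only happen at well-defined points, many of the synchronization problems of preemptive threads are also easier to avoid, although this is not a guarantee against deadlocks.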
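
The idea of user-space scheduling can be illustrated without eventlet. The following is a toy round-robin scheduler built on generators, not eventlet's actual mechanism; worker, run_all, and demo are hypothetical names. Each task gives up control voluntarily, and every step runs on the same native thread.

```python
import threading
from collections import deque


def worker(name, steps, log):
    """A task that yields control voluntarily after each step."""
    for i in range(steps):
        log.append((name, i, threading.get_ident()))
        yield  # cooperative switch point, similar to a blocking call in eventlet


def run_all(tasks):
    """Round-robin scheduler that lives entirely in user space."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)          # run the task until its next yield
            queue.append(task)  # not finished: back of the line
        except StopIteration:
            pass                # finished: drop it


def demo():
    log = []
    run_all([worker("a", 2, log), worker("b", 2, log)])
    order = [(name, i) for name, i, _ in log]
    native_threads = {tid for _, _, tid in log}
    return order, len(native_threads)
```

demo() returns the interleaved order of the two tasks together with the number of distinct native thread IDs observed, which is 1: the operating system never sees more than one thread.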

spawn schedules the callback run_service on a green thread and passes it two arguments:

    @staticmethod
    def run_service(service, done):
        """Service start wrapper.

        :param service: service to run
        :param done: event to wait on until a shutdown is triggered
        :returns: None

        """
        service.start()
        systemd.notify_once()
        done.wait()

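run_service calls start(), which actually brings the service up, and then calls done.wait() (that is, event.wait()), which blocks this green thread until a shutdown is signaled.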
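
The start-then-wait pattern can be reproduced with the standard library. This sketch replaces eventlet with threading, so the shutdown signal is Event.set() here instead of eventlet's event.send(). FakeService, the extra ready event, and demo are hypothetical and exist only to make the example observable.

```python
import threading


class FakeService:
    """Hypothetical stand-in for a Nova service; it only records its state."""

    def __init__(self):
        self.started = False

    def start(self):
        self.started = True


def run_service(service, done, ready):
    service.start()  # bring the service up
    ready.set()      # demo-only signal that start() has returned
    done.wait()      # park here until a shutdown is requested


def demo():
    service = FakeService()
    done = threading.Event()
    ready = threading.Event()
    t = threading.Thread(target=run_service, args=(service, done, ready))
    t.start()
    ready.wait(timeout=5)
    alive_while_running = t.is_alive()  # still blocked in done.wait()
    done.set()                          # request shutdown
    t.join(timeout=5)
    return service.started, alive_while_running, t.is_alive()
```

The wrapper itself does no work after start(): it only keeps the thread alive so that the launcher has a single place to signal when everything should stop.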

Because the service passed in is an instance of the Service class in nova/service.py, the start method called here is the overridden one:

    def start(self):
        verstr = version.version_string_with_package()
        LOG.audit(_('Starting %(topic)s node (version %(version)s)'),
                  {'topic': self.topic, 'version': verstr})
        self.basic_config_check()
        self.manager.init_host()
        self.model_disconnected = False
        ctxt = context.get_admin_context()
        try:
            self.service_ref = self.conductor_api.service_get_by_args(ctxt,
                    self.host, self.binary)
            self.service_id = self.service_ref['id']
        except exception.NotFound:
            try:
                self.service_ref = self._create_service_ref(ctxt)
            except (exception.ServiceTopicExists,
                    exception.ServiceBinaryExists):
                # NOTE(danms): If we race to create a record with a sibling
                # worker, don't fail here.
                self.service_ref = self.conductor_api.service_get_by_args(ctxt,
                    self.host, self.binary)

        self.manager.pre_start_hook()

        if self.backdoor_port is not None:
            self.manager.backdoor_port = self.backdoor_port
        .......
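The start method is long. Its main job is to initialize the manager object and create the corresponding MQ consumers. In the icehouse version the MQ operations are wrapped by Oslo's messaging library. I will analyze that code in detail when I have time.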
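
The lookup-then-create logic at the top of start, including the race noted in the NOTE(danms) comment, follows a common pattern that can be sketched on its own. Everything below is hypothetical: the exception classes, FakeServiceTable, and ensure_service_ref are illustrative stand-ins, not Nova's conductor API.

```python
class NotFound(Exception):
    pass


class AlreadyExists(Exception):
    pass


class FakeServiceTable:
    """Hypothetical in-memory stand-in for the services table."""

    def __init__(self):
        self.rows = {}
        self.next_id = 1

    def get(self, host, binary):
        try:
            return self.rows[(host, binary)]
        except KeyError:
            raise NotFound()

    def create(self, host, binary):
        if (host, binary) in self.rows:
            raise AlreadyExists()
        row = {"id": self.next_id, "host": host, "binary": binary}
        self.next_id += 1
        self.rows[(host, binary)] = row
        return row


def ensure_service_ref(table, host, binary):
    """Look the record up, create it if missing, and tolerate a lost race."""
    try:
        return table.get(host, binary)
    except NotFound:
        try:
            return table.create(host, binary)
        except AlreadyExists:
            # A sibling worker created the row first, so read that row instead.
            return table.get(host, binary)


table = FakeServiceTable()
first = ensure_service_ref(table, "node-1", "nova-compute")
second = ensure_service_ref(table, "node-1", "nova-compute")
```

The second call finds the row created by the first, so a restarted service reuses its existing record instead of registering itself again.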