Detectron Study and Practice (2): The getting_started Example faster_rcnn_R-50-FPN

阿新 • Published: 2019-01-14

For an introduction to Detectron, see my previous blog post. This post analyzes the code behind the faster_rcnn_R-50-FPN model used in Detectron's getting_started example.

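1. Background

The model involves two network modules: an FPN feature extraction network built on ResNet50, and a Faster R-CNN object detection network. In effect, the model is an implementation of the paper Feature Pyramid Networks for Object Detection.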

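1.1 FPN feature extraction network

[Figure: schematic of the FPN network]
The figure above is a schematic of the FPN network. FPN builds a feature pyramid from the multi-scale, pyramidal hierarchy that a deep convolutional network already has. Concretely, the feature map of the topmost (coarsest) layer is upsampled (its spatial size is enlarged 2x) and added to the feature map of the next layer down, after that map has passed through a 1x1 convolution (the lateral connection). The sum forms one level of the feature pyramid. Repeating this step from the top down builds the remaining levels one by one. Predictions are made separately on each level of the pyramid.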

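FPN combines upper-level features, which are low in resolution but semantically strong, with lower-level features, which are semantically weak but high in resolution, through the top-down pathway and the lateral connections. This improves the network's detection performance considerably.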
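The top-down merge described above can be sketched in a few lines of plain Python. This is only an illustration on tiny 2D lists, not Detectron code: the helper names `upsample_nearest_2x` and `merge_topdown_lateral` are made up for this example, and the lateral map is assumed to have already passed through its 1x1 convolution.

```python
def upsample_nearest_2x(feature_map):
    """Nearest-neighbour 2x upsampling of a 2D list (hypothetical helper)."""
    upsampled = []
    for row in feature_map:
        wide_row = [value for value in row for _ in range(2)]  # repeat columns
        upsampled.append(wide_row)
        upsampled.append(list(wide_row))  # repeat the row itself
    return upsampled


def merge_topdown_lateral(top, lateral):
    """Element-wise sum of the upsampled coarser map and the lateral map."""
    top_up = upsample_nearest_2x(top)
    assert len(top_up) == len(lateral) and len(top_up[0]) == len(lateral[0])
    return [
        [t + l for t, l in zip(top_row, lat_row)]
        for top_row, lat_row in zip(top_up, lateral)
    ]


coarse = [[1, 2],
          [3, 4]]                        # 2x2 map from the coarser level
lateral = [[10] * 4 for _ in range(4)]   # 4x4 map from the finer level
merged = merge_topdown_lateral(coarse, lateral)
for row in merged:
    print(row)
```

Each value of the coarse map is spread over a 2x2 block before the addition, which is how the semantically strong coarse features reach the finer, higher-resolution level.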

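1.2 Faster R-CNN detection network

[Figure: Faster R-CNN network structure at inference time]
The figure above shows the Faster R-CNN network structure (inference stage). A feature extraction network first produces feature maps. The RPN then processes them to generate proposals, that is, regions likely to contain objects. Finally, the Fast R-CNN classifier applies RoI pooling to the proposals and performs classification and bounding-box regression.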

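The RPN is the key part of Faster R-CNN: put simply, Faster R-CNN is Fast R-CNN with an RPN added. The RPN is a fully convolutional network that simultaneously predicts object bounds and objectness scores at each position. Trained end to end, it produces high-quality region proposals, which the subsequent Fast R-CNN uses for detection. Since the theory is not the focus of this post, only the RPN network structure is shown below; for details, read the original paper or one of the many blog posts that explain Faster R-CNN.
[Figure: RPN network structure]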
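To make "predicting boxes at each position" more concrete, here is a rough sketch of how reference boxes (anchors) with three aspect ratios could be laid out over a feature map. It is a simplification and not Detectron's `generate_anchors`: the function names, the stride of 16, and the anchor size of 32 are assumptions chosen for illustration.

```python
import math


def anchors_at(center_x, center_y, size, aspect_ratios=(0.5, 1.0, 2.0)):
    """Return (x1, y1, x2, y2) boxes of area size**2, one per height/width ratio."""
    boxes = []
    for ratio in aspect_ratios:
        width = size / math.sqrt(ratio)
        height = size * math.sqrt(ratio)
        boxes.append((
            center_x - width / 2.0,
            center_y - height / 2.0,
            center_x + width / 2.0,
            center_y + height / 2.0,
        ))
    return boxes


def anchors_for_feature_map(rows, cols, stride, size):
    """Place anchors at the center of every feature-map cell, in image pixels."""
    all_boxes = []
    for r in range(rows):
        for c in range(cols):
            center_x = (c + 0.5) * stride
            center_y = (r + 0.5) * stride
            all_boxes.extend(anchors_at(center_x, center_y, size))
    return all_boxes


boxes = anchors_for_feature_map(rows=2, cols=3, stride=16, size=32)
print(len(boxes))  # 2 * 3 positions * 3 ratios = 18 anchors
```

For every such anchor, the RPN outputs one objectness score and four box regression deltas, so the number of output channels per position scales with the number of anchors.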

2. Source code analysis

2.1 train_net.py

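train_net.py lives in the tools folder and is the script Detectron uses to train a network.
[Figure: flowchart of the main program]

[Figure: flowchart of model training]

Below is an excerpt of the main parts of the file, with some extra comments added on top of the authors' own.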

"""Train a network with Detectron.""" 


from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from __future__ import unicode_literals

import argparse
import cv2  # NOQA (Must import before importing caffe2 due to bug in cv2)
import logging
import numpy as np
import os
import pprint
import re
import sys
import test_net

from caffe2.python import memonger
from caffe2.python import workspace

from core.config import assert_and_infer_cfg
from core.config import cfg
from core.config import get_output_dir
from core.config import merge_cfg_from_file
from core.config import merge_cfg_from_list
from datasets.roidb import combined_roidb_for_training
from modeling import model_builder
from utils import lr_policy
from utils.logging import setup_logging
from utils.training_stats import TrainingStats
import utils.c2
import utils.env as envu
import utils.net as nu

utils.c2.import_contrib_ops()
utils.c2.import_detectron_ops()

# OpenCL may be enabled by default in OpenCV3; disable it because it's not
# thread safe and causes unwanted GPU memory allocations.
cv2.ocl.setUseOpenCL(False)


def parse_args():
    parser = argparse.ArgumentParser(
        description='Train a network with Detectron'
    )
    parser.add_argument(
        '--cfg',
        dest='cfg_file',
        help='Config file for training (and optionally testing)',
        default=None,
        type=str
    )
    parser.add_argument(
        '--multi-gpu-testing',
        dest='multi_gpu_testing',
        help='Use cfg.NUM_GPUS GPUs for inference',
        action='store_true'
    )
    parser.add_argument(
        '--skip-test',
        dest='skip_test',
        help='Do not test the final model',
        action='store_true'
    )
    parser.add_argument(
        'opts',
        help='See lib/core/config.py for all options',
        default=None,
        nargs=argparse.REMAINDER
    )
    if len(sys.argv) == 1:
        parser.print_help()
        sys.exit(1)
    return parser.parse_args()


def main():
    # Initialize C2
    workspace.GlobalInit(
        ['caffe2', '--caffe2_log_level=0', '--caffe2_gpu_memory_tracking=1']
    )
    # Set up logging and load config options
    logger = setup_logging(__name__)
    logging.getLogger('roi_data.loader').setLevel(logging.INFO)
    args = parse_args()
    logger.info('Called with args:')
    logger.info(args)
    if args.cfg_file is not None:
        merge_cfg_from_file(args.cfg_file)
    if args.opts is not None:
        merge_cfg_from_list(args.opts)
    assert_and_infer_cfg()
    logger.info('Training with config:')
    logger.info(pprint.pformat(cfg))
    # Note that while we set the numpy random seed network training will not be
    # deterministic in general. There are sources of non-determinism that cannot
    # be removed with a reasonable execution-speed tradeoff (such as certain
    # non-deterministic cudnn functions).
    np.random.seed(cfg.RNG_SEED)
    # Execute the training run
    checkpoints = train_model()
    # Test the trained model
    if not args.skip_test:
        test_model(checkpoints['final'], args.multi_gpu_testing, args.opts)


def train_model():
    """Model training loop."""
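    # Main training function: builds the model, runs the training iterations,
    # records training statistics, and writes weight files both periodically
    # and at the end of training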
    logger = logging.getLogger(__name__)
    model, start_iter, checkpoints, output_dir = create_model()
    if 'final' in checkpoints:
        # The final model was found in the output directory, so nothing to do
        return checkpoints

    setup_model_for_training(model, output_dir)
    training_stats = TrainingStats(model)  # Tracks key training statistics
    CHECKPOINT_PERIOD = int(cfg.TRAIN.SNAPSHOT_ITERS / cfg.NUM_GPUS)

    for cur_iter in range(start_iter, cfg.SOLVER.MAX_ITER):
        training_stats.IterTic()
        lr = model.UpdateWorkspaceLr(cur_iter, lr_policy.get_lr_at_iter(cur_iter))
        workspace.RunNet(model.net.Proto().name)
        if cur_iter == start_iter:
            nu.print_net(model)
        training_stats.IterToc()
        training_stats.UpdateIterStats()
        training_stats.LogIterStats(cur_iter, lr)

        if (cur_iter + 1) % CHECKPOINT_PERIOD == 0 and cur_iter > start_iter:
            checkpoints[cur_iter] = os.path.join(
                output_dir, 'model_iter{}.pkl'.format(cur_iter)
            )
            nu.save_model_to_weights_file(checkpoints[cur_iter], model)

        if cur_iter == start_iter + training_stats.LOG_PERIOD:
            # Reset the iteration timer to remove outliers from the first few
            # SGD iterations
            training_stats.ResetIterTimer()

        if np.isnan(training_stats.iter_total_loss):
            logger.critical('Loss is NaN, exiting...')
            model.roi_data_loader.shutdown()
            envu.exit_on_error()

    # Save the final model
    checkpoints['final'] = os.path.join(output_dir, 'model_final.pkl')
    nu.save_model_to_weights_file(checkpoints['final'], model)
    # Shutdown data loading threads
    model.roi_data_loader.shutdown()
    return checkpoints


def create_model():
    """Build the model and look for saved model checkpoints in case we can
    resume from one.
    """
    # In other words, this supports resuming an interrupted training run from
    # the most recent saved checkpoint
    logger = logging.getLogger(__name__)
    start_iter = 0
    checkpoints = {}
    output_dir = get_output_dir(training=True)
    if cfg.TRAIN.AUTO_RESUME:
        # Check for the final model (indicates training already finished)
        final_path = os.path.join(output_dir, 'model_final.pkl')
        if os.path.exists(final_path):
            logger.info('model_final.pkl exists; no need to train!')
            return None, None, {'final': final_path}, output_dir

        # Find the most recent checkpoint (highest iteration number)
        files = os.listdir(output_dir)
        for f in files:
            iter_string = re.findall(r'(?<=model_iter)\d+(?=\.pkl)', f)
            if len(iter_string) > 0:
                checkpoint_iter = int(iter_string[0])
                if checkpoint_iter > start_iter:
                    # Start one iteration immediately after the checkpoint iter
                    start_iter = checkpoint_iter + 1
                    resume_weights_file = f

        if start_iter > 0:
            # Override the initialization weights with the found checkpoint
            cfg.TRAIN.WEIGHTS = os.path.join(output_dir, resume_weights_file)
            logger.info(
                '========> Resuming from checkpoint {} at start iter {}'.
                format(cfg.TRAIN.WEIGHTS, start_iter)
            )

    logger.info('Building model: {}'.format(cfg.MODEL.TYPE))
    # Use model_builder to create the model specified in the yaml config file
    model = model_builder.create(cfg.MODEL.TYPE, train=True)
    if cfg.MEMONGER:
        optimize_memory(model)
    # Performs random weight initialization as defined by the model
    workspace.RunNetOnce(model.param_init_net)
    return model, start_iter, checkpoints, output_dir

# (the rest of the file is omitted here)
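The auto-resume search in create_model() can be tried in isolation. The sketch below applies the same regular expression to a plain list of file names. The function name `find_resume_point` and the sample file names are assumptions made for this illustration, and unlike the excerpt it keeps the highest checkpoint iteration in its own variable instead of comparing against `start_iter`.

```python
import re

# Same pattern as in the excerpt: the digits between "model_iter" and ".pkl".
CHECKPOINT_PATTERN = re.compile(r'(?<=model_iter)\d+(?=\.pkl)')


def find_resume_point(filenames):
    """Return (start_iter, checkpoint_file) for the newest checkpoint.

    start_iter is 0 and checkpoint_file is None when no checkpoint is found.
    """
    latest_iter = -1
    latest_file = None
    for name in filenames:
        match = CHECKPOINT_PATTERN.search(name)
        if match is None:
            continue  # not a periodic checkpoint file
        checkpoint_iter = int(match.group(0))
        if checkpoint_iter > latest_iter:
            latest_iter = checkpoint_iter
            latest_file = name
    if latest_file is None:
        return 0, None
    # Training restarts one iteration after the checkpointed one.
    return latest_iter + 1, latest_file


files = ['model_iter19999.pkl', 'model_iter39999.pkl', 'train.log']
print(find_resume_point(files))
```

A file such as model_final.pkl does not match the pattern, which is why the excerpt checks for it separately before scanning the directory.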

2.2 model_builder.py

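The statement model = model_builder.create(cfg.MODEL.TYPE, train=True) in section 2.1 uses model_builder.py from the lib/modeling folder. That file contains many Detectron model construction functions. As the authors write in the comment at the top of the file: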
Detectron supports a large number of model types. The configuration space is
large. To get a sense, a given model is in element in the cartesian product of:

  - backbone (e.g., VGG16, ResNet, ResNeXt)
  - FPN (on or off)
  - RPN only (just proposals)
  - Fixed proposals for Fast R-CNN, RFCN, Mask R-CNN (with or without keypoints)
  - End-to-end model with RPN + Fast R-CNN (i.e., Faster R-CNN), Mask R-CNN, …
  - Different “head” choices for the model
  - … many configuration options …

A given model is made by combining many basic components. The result is flexible
though somewhat complex to understand at first.

[Figure: flowchart of building the faster_rcnn_R-50-FPN model with model_builder.create()]

A few important functions are listed here and analyzed through comments.

# ---------------------------------------------------------------------------- #
# Generic recomposable model builders
#
# For example, you can create a Fast R-CNN model with the ResNet-50-C4 backbone
# with the configuration:
#
# MODEL:
#   TYPE: generalized_rcnn
#   CONV_BODY: ResNet.add_ResNet50_conv4_body
#   ROI_HEAD: ResNet.add_ResNet_roi_conv5_head
# ---------------------------------------------------------------------------- #

def generalized_rcnn(model):
    """This model type handles:
      - Fast R-CNN
      - RPN only (not integrated with Fast R-CNN)
      - Faster R-CNN (stagewise training from NIPS paper)
      - Faster R-CNN (end-to-end joint training)
      - Mask R-CNN (stagewise training from NIPS paper)
      - Mask R-CNN (end-to-end joint training)
    """
    return build_generic_detection_model(
        model,
        get_func(cfg.MODEL.CONV_BODY),
        add_roi_box_head_func=get_func(cfg.FAST_RCNN.ROI_BOX_HEAD),
        add_roi_mask_head_func=get_func(cfg.MRCNN.ROI_MASK_HEAD),
        add_roi_keypoint_head_func=get_func(cfg.KRCNN.ROI_KEYPOINTS_HEAD),
        freeze_conv_body=cfg.TRAIN.FREEZE_CONV_BODY
    )


def rfcn(model):
    # TODO(rbg): fold into build_generic_detection_model
    return build_generic_rfcn_model(model, get_func(cfg.MODEL.CONV_BODY))


def retinanet(model):
    # TODO(rbg): fold into build_generic_detection_model
    return build_generic_retinanet_model(model, get_func(cfg.MODEL.CONV_BODY))


# ---------------------------------------------------------------------------- #
# Helper functions for building various re-usable network bits
# ---------------------------------------------------------------------------- #

def create(model_type_func, train=False, gpu_id=0):
    """Generic model creation function that dispatches to specific model
    building functions.

    By default, this function will generate a data parallel model configured to
    run on cfg.NUM_GPUS devices. However, you can restrict it to build a model
    targeted to a specific GPU by specifying gpu_id. This is used by
    optimizer.build_data_parallel_model() during test time.
    """
    model = DetectionModelHelper(
        name=model_type_func,
        train=train,
        num_classes=cfg.MODEL.NUM_CLASSES,
        init_params=train
    )
    model.only_build_forward_pass = False
    model.target_gpu_id = gpu_id
    # First call get_func to obtain the model function object named by
    # model_type_func (here generalized_rcnn), then call it on model
    return get_func(model_type_func)(model)


def get_func(func_name):
    """Helper to return a function object by name. func_name must identify a
    function in this module or the path to a function relative to the base
    'modeling' module.
    """
    if func_name == '':
        return None
    new_func_name = modeling.name_compat.get_new_name(func_name)
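    # If the model TYPE given in the config file differs from the processed
    # new_func_name, e.g. TYPE is one of the deprecated function names listed
    # around line 420 of this file (rpn, fast-rcnn, faster-rcnn, ...), replace
    # it with the unified new name generalized_rcnn
    if new_func_name != func_name: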
        logger.warn(
            'Remapping old function name: {} -> {}'.
            format(func_name, new_func_name)
        )
        func_name = new_func_name
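    # Look up func_name in the current module when it contains no '.';
    # otherwise look it up under the modeling package. Either way, return the
    # corresponding function object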
    try:
        parts = func_name.split('.')
        # Refers to a function in this module
        if len(parts) == 1:
            return globals()[parts[0]]
        # Otherwise, assume we're referencing a module under modeling
        module_name = 'modeling.' + '.'.join(parts[:-1])
        module = importlib.import_module(module_name)
        # Equivalent to module.<parts[-1]>, e.g. FPN.add_fpn_ResNet50_conv5_body
        return getattr(module, parts[-1])
    except Exception:
        logger.error('Failed to find function: {}'.format(func_name))
        raise

# Using the config parameters and helper functions such as _add_xxx_head,
# combine the backbone, RPN, FPN, Fast R-CNN, mask head, keypoint head and other
# modules into a generic detection model
def build_generic_detection_model(
    model,
    add_conv_body_func,
    add_roi_box_head_func=None,
    add_roi_mask_head_func=None,
    add_roi_keypoint_head_func=None,
    freeze_conv_body=False
):
    def _single_gpu_build_func(model):
        """Build the model on a single GPU. Can be called in a loop over GPUs
        with name and device scoping to create a data parallel model.
        """
        # Add the conv body (called "backbone architecture" in papers)
        # E.g., ResNet-50, ResNet-50-FPN, ResNeXt-101-FPN, etc.
        # add_conv_body_func=get_func(cfg.MODEL.CONV_BODY)
        blob_conv, dim_conv, spatial_scale_conv = add_conv_body_func(model) 
        if freeze_conv_body:
            for b in c2_utils.BlobReferenceList(blob_conv):
                model.StopGradient(b, b)

        if not model.train:  # == inference
            # Create a net that can be used to execute the conv body on an image
            # (without also executing RPN or any other network heads)
            model.conv_body_net = model.net.Clone('conv_body_net')

        head_loss_gradients = {
            'rpn': None,
            'box': None,
            'mask': None,
            'keypoints': None,
        }

        if cfg.RPN.RPN_ON:
            # Add the RPN head
            head_loss_gradients['rpn'] = rpn_heads.add_generic_rpn_outputs(
                model, blob_conv, dim_conv, spatial_scale_conv
            )

        if cfg.FPN.FPN_ON:
            # After adding the RPN head, restrict FPN blobs and scales to
            # those used in the RoI heads
            blob_conv, spatial_scale_conv = _narrow_to_fpn_roi_levels(
                blob_conv, spatial_scale_conv
            )

        if not cfg.MODEL.RPN_ONLY:
            # Add the Fast R-CNN head
            head_loss_gradients['box'] = _add_fast_rcnn_head(
                model, add_roi_box_head_func, blob_conv, dim_conv,
                spatial_scale_conv
            )

        if cfg.MODEL.MASK_ON:
            # Add the mask head
            head_loss_gradients['mask'] = _add_roi_mask_head(
                model, add_roi_mask_head_func, blob_conv, dim_conv,
                spatial_scale_conv
            )

        if cfg.MODEL.KEYPOINTS_ON:
            # Add the keypoint head
            head_loss_gradients['keypoint'] = _add_roi_keypoint_head(
                model, add_roi_keypoint_head_func, blob_conv, dim_conv,
                spatial_scale_conv
            )

        if model.train:
            loss_gradients = {}
            for lg in head_loss_gradients.values():
                if lg is not None:
                    loss_gradients.update(lg)
            return loss_gradients
        else:
            return None

    optim.build_data_parallel_model(model, _single_gpu_build_func)
    return model

# (the rest of the file is omitted here)
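The name-based dispatch used by get_func() can be reproduced with the standard library alone. In the sketch below, standard-library modules stand in for Detectron's modeling package; `resolve_function`, `local_builder`, and the `base_package` parameter are names invented for this example and are not part of Detectron.

```python
import importlib


def local_builder(model):
    """Stand-in for a builder function defined in the current module."""
    return 'built {}'.format(model)


def resolve_function(func_name, base_package=''):
    """Look up a function by a plain or dotted name.

    A plain name is looked up in this module's globals. A dotted name is split
    into a module path (optionally prefixed with base_package) and an
    attribute name, and the module is imported on demand.
    """
    if func_name == '':
        return None
    parts = func_name.split('.')
    if len(parts) == 1:
        return globals()[parts[0]]
    module_name = base_package + '.'.join(parts[:-1])
    module = importlib.import_module(module_name)
    return getattr(module, parts[-1])


# Plain name: resolved in the current module.
print(resolve_function('local_builder')('demo'))
# Dotted name: 'os.' + 'path' is imported, then 'basename' is fetched from it.
print(resolve_function('path.basename', base_package='os.')('/tmp/a.pkl'))
```

Storing function names as strings is what lets a yaml config file select the backbone and the heads without any change to the Python code.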

2.3 FPN.py

Continuing from this line in the _single_gpu_build_func function above:

blob_conv, dim_conv, spatial_scale_conv = add_conv_body_func(model) # add_conv_body_func=get_func(cfg.MODEL.CONV_BODY)

Because cfg.MODEL.CONV_BODY is set to FPN.add_fpn_ResNet50_conv5_body in the config file, we now turn to the modeling/FPN.py file.
[Figure: flow of the relevant code, i.e. the FPN.add_fpn_ResNet50_conv5_body(model) function]
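Before reading the code, it may help to see the level bookkeeping that the add_fpn function in the excerpt performs: which pyramid levels exist and what spatial scale each one has. The sketch below only mirrors that arithmetic with plain lists. The function name `pyramid_levels` is made up for this illustration, and no network operators are created.

```python
LOWEST_BACKBONE_LVL = 2   # "conv2"-like level
HIGHEST_BACKBONE_LVL = 5  # "conv5"-like level


def pyramid_levels(min_level, max_level):
    """Return [(level, spatial_scale), ...] ordered from coarsest to finest."""
    # Backbone stages conv5..conv2, coarsest first, with scale 1 / 2**level.
    backbone_levels = list(
        range(HIGHEST_BACKBONE_LVL, LOWEST_BACKBONE_LVL - 1, -1)
    )
    backbone_scales = [1.0 / 2 ** lvl for lvl in backbone_levels]

    # Drop the finest stages when min_level is above the lowest backbone level.
    num_backbone_stages = (
        len(backbone_levels) - (min_level - LOWEST_BACKBONE_LVL)
    )
    levels = backbone_levels[:num_backbone_stages]
    scales = backbone_scales[:num_backbone_stages]

    # Each extra coarser level halves the scale and is put at the front.
    for lvl in range(HIGHEST_BACKBONE_LVL + 1, max_level + 1):
        levels.insert(0, lvl)
        scales.insert(0, scales[0] * 0.5)
    return list(zip(levels, scales))


for level, scale in pyramid_levels(min_level=2, max_level=6):
    print('P{}: 1/{}'.format(level, int(round(1 / scale))))
```

With min_level=2 and max_level=6 this lists P6 down to P2 with scales 1/64 down to 1/4, which matches the reversed (coarsest-first) ordering that the comments in the excerpt describe.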

Analysis of the main code:

"""Functions for using a Feature Pyramid Network (FPN)."""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from __future__ import unicode_literals

import collections
import numpy as np

from core.config import cfg
from modeling.generate_anchors import generate_anchors
from utils.c2 import const_fill
from utils.c2 import gauss_fill
import modeling.ResNet as ResNet
import utils.blob as blob_utils
import utils.boxes as box_utils

# Lowest and highest pyramid levels in the backbone network. For FPN, we assume
# that all networks have 5 spatial reductions, each by a factor of 2. Level 1
# would correspond to the input image, hence it does not make sense to use it.
LOWEST_BACKBONE_LVL = 2   # E.g., "conv2"-like level
HIGHEST_BACKBONE_LVL = 5  # E.g., "conv5"-like level


# ---------------------------------------------------------------------------- #
# FPN with ResNet
# ---------------------------------------------------------------------------- #

def add_fpn_ResNet50_conv5_body(model):  # Build the FPN on top of ResNet50_conv5_body
    return add_fpn_onto_conv_body(
        model, ResNet.add_ResNet50_conv5_body, fpn_level_info_ResNet50_conv5
    )


def add_fpn_ResNet50_conv5_P2only_body(model):
    return add_fpn_onto_conv_body(
        model,
        ResNet.add_ResNet50_conv5_body,
        fpn_level_info_ResNet50_conv5,
        P2only=True
    )
# The corresponding add_fpn_ResNet101 and add_fpn_ResNet152 conv body functions
# are omitted here

# ---------------------------------------------------------------------------- #
# Functions for bolting FPN onto a backbone architectures
# ---------------------------------------------------------------------------- #

def add_fpn_onto_conv_body(
    model, conv_body_func, fpn_level_info_func, P2only=False
):
    """Add the specified conv body to the model and then add FPN levels to it.
    """
    # Note: blobs_conv is in reversed order: [fpn5, fpn4, fpn3, fpn2]
    # similarly for dims_conv: [2048, 1024, 512, 256]
    # similarly for spatial_scales_fpn: [1/32, 1/16, 1/8, 1/4]

    conv_body_func(model)
    blobs_fpn, dim_fpn, spatial_scales_fpn = add_fpn(
        model, fpn_level_info_func()
    )

    if P2only:  # P2 is the FPN output that corresponds to the conv2 stage in the FPN paper
        # use only the finest level
        return blobs_fpn[-1], dim_fpn, spatial_scales_fpn[-1]
    else:
        # use all levels
        return blobs_fpn, dim_fpn, spatial_scales_fpn


def add_fpn(model, fpn_level_info):
    """Add FPN connections based on the model described in the FPN paper."""
    # FPN levels are built starting from the highest/coarsest level of the
    # backbone (usually "conv5"). First we build down, recursively constructing
    # lower/finer resolution FPN levels. Then we build up, constructing levels
    # that are even higher/coarser than the starting level.
    # Starting from the highest backbone level (usually conv5), first build the
    # FPN levels downward recursively (P5, P4, P3, ...), then return to the
    # starting level (conv5) and build coarser levels upward (e.g. P6). The
    # function returns the blobs of all levels.
    fpn_dim = cfg.FPN.DIM
    min_level, max_level = get_min_max_levels()
    # Count the number of backbone stages that we will generate FPN levels for
    # starting from the coarsest backbone stage (usually the "conv5"-like level)
    # E.g., if the backbone level info defines 4 stages: "conv5",
    # "conv4", ... "conv2" and min_level=2, then we end up with 4 - (2 - 2) = 4
    # backbone stages to add FPN to.
    # Think of it as a stack of len(fpn_level_info.blobs) levels:
    # LOWEST_BACKBONE_LVL is the index of the lowest level in the stack, and
    # min_level is the lowest level we actually choose to use

    num_backbone_stages = (
        len(fpn_level_info.blobs) - (min_level - LOWEST_BACKBONE_LVL) 
    )
    lateral_input_blobs = fpn_level_info.blobs[:num_backbone_stages]
    output_blobs = [
        'fpn_inner_{}'.format(s)
        for s in fpn_level_info.blobs[:num_backbone_stages]
    ]
    fpn_dim_lateral = fpn_level_info.dims
    xavier_fill = ('XavierFill', {})

    # For the coarsest backbone level: 1x1 conv only seeds recursion
    model.Conv(
        lateral_input_blobs[0],
        output_blobs[0],
        dim_in=fpn_dim_lateral[0],
        dim_out=fpn_dim,
        kernel=1,
        pad=0,
        stride=1,
        weight_init=xavier_fill,
        bias_init=const_fill(0.0)
    )

    #
    # Step 1: recursively build down starting from the coarsest backbone level
    #

    # For other levels add top-down and lateral connections
    for i in range(num_backbone_stages - 1):
        add_topdown_lateral_module(
            model,
            output_blobs[i],             # top-down blob
            lateral_input_blobs[i + 1],  # lateral blob
            output_blobs[i + 1],         # next output blob
            fpn_dim,                     # output dimension
            fpn_dim_lateral[i + 1]       # lateral input dimension
        )

    # Post-hoc scale-specific 3x3 convs
    # Next, apply a 3x3 conv to each merged (top-down + lateral) blob, in the
    # same coarsest-to-finest order, and append the results to blobs_fpn
    blobs_fpn = []
    spatial_scales = []
    for i in range(num_backbone_stages):
        fpn_blob = model.Conv(
            output_blobs[i],
            'fpn_{}'.format(fpn_level_info.blobs[i]),
            dim_in=fpn_dim,
            dim_out=fpn_dim,
            kernel=3,
            pad=1,
            stride=1,
            weight_init=xavier_fill,
            bias_init=const_fill(0.0)
        )
        blobs_fpn += [fpn_blob]
        spatial_scales += [fpn_level_info.spatial_scales[i]]

    #
    # Step 2: build up starting from the coarsest backbone level
    #

    # Check if we need the P6 feature map
    if not cfg.FPN.EXTRA_CONV_LEVELS and max_level == HIGHEST_BACKBONE_LVL + 1:
        # Original FPN P6 level implementation from our CVPR'17 FPN paper
        P6_blob_in = blobs_fpn[0]
        P6_name = P6_blob_in + '_subsampled_2x'
        # Use max pooling to simulate stride 2 subsampling
        P6_blob = model.MaxPool(P6_blob_in, P6_name, kernel=1, pad=0, stride=2)
        blobs_fpn.insert(0, P6_blob)
        spatial_scales.insert(0, spatial_scales[0] * 0.5)

    # Coarser FPN levels introduced for RetinaNet
    if cfg.FPN.EXTRA_CONV_LEVELS and max_level > HIGHEST_BACKBONE_LVL:
        fpn_blob = fpn_level_info.blobs[0]
        dim_in = fpn_level_info.dims[0]
        for i in range(HIGHEST_BACKBONE_LVL + 1, max_level + 1):
            fpn_blob_in = fpn_blob
            if i > HIGHEST_BACKBONE_LVL + 1:
                fpn_blob_in = model.Relu(fpn_blob, fpn_blob + '_relu')
            fpn_blob = model.Conv(
                fpn_blob_in,
                'fpn_' + str(i),
                dim_in=dim_in,
                dim_out=fpn_dim,
                kernel=3,
                pad=1,
                stride=2,
                weight_init=xavier_fill,
                bias_init=const_fill(0.0)
            )
            dim_in = fpn_dim
            blobs_fpn.insert(0, fpn_blob)
            spatial_scales.insert(0, spatial_scales[0] * 0.5)

    return blobs_fpn, fpn_dim, spatial_scales


def add_topdown_lateral_module(
    model, fpn_top, fpn_lateral, fpn_bottom, dim_top, dim_lateral
):
    """Add a top-down lateral module."""
    # Lateral 1x1 conv
    lat = model.Conv(
        fpn_lateral,
        fpn_bottom + '_lateral',
        dim_in=dim_lateral,
        dim_out=dim_top,
        kernel=1,
        pad=0,
        stride=1,
        weight_init=(
            const_fill(0.0) if cfg.FPN.ZERO_INIT_LATERAL
            else ('XavierFill', {})
        ),
        bias_init=const_fill(0.0)
    )
    # Top-down 2x upsampling
    td = model.net.UpsampleNearest(fpn_top, fpn_bottom + '_topdown', scale=2)
    # Sum lateral and top-down
    model.net.Sum([lat, td], fpn_bottom)


def get_min_max_levels():
    """The min and max FPN levels required for supporting RPN and/or RoI
    transform operations on multiple FPN levels.
    """
    min_level = LOWEST_BACKBONE_LVL
    max_level = HIGHEST_BACKBONE_LVL
    if cfg.FPN.MULTILEVEL_RPN and not cfg.FPN.MULTILEVEL_ROIS:
        max_level = cfg.FPN.RPN_MAX_LEVEL
        min_level = cfg.FPN.RPN_MIN_LEVEL
    if not cfg.FPN.MULTILEVEL_RPN and cfg.FPN.MULTILEVEL_ROIS:
        max_level = cfg.FPN.ROI_MAX_LEVEL
        min_level = cfg.FPN.ROI_MIN_LEVEL
    if cfg.FPN.MULTILEVEL_RPN and cfg.FPN.MULTILEVEL_ROIS:
        max_level = max(cfg.FPN.RPN_MAX_LEVEL, cfg.FPN.ROI_MAX_LEVEL)
        min_level = min(cfg.FPN.RPN_MIN_LEVEL, cfg.FPN.ROI_MIN_LEVEL)
    return min_level, max_level


# ---------------------------------------------------------------------------- #
# RPN with an FPN backbone
# ---------------------------------------------------------------------------- #
# Called by the add_generic_rpn_outputs function in rpn_heads.py
def add_fpn_rpn_outputs(model, blobs_in, dim_in, spatial_scales):
    """Add RPN on FPN specific outputs."""
    num_anchors = len(cfg.FPN.RPN_ASPECT_RATIOS)  # three aspect ratios: (0.5, 1, 2)
    dim_out = dim_in

    k_max = cfg.FPN.RPN_MAX_LEVEL  # coarsest level of pyramid, default is 6
    k_min = cfg.FPN.RPN_MIN_LEVEL  # finest level of pyramid, default is 2
    assert len(blobs_in) == k_max - k_min + 1
    for lvl in range(k_min, k_max + 1):
        # blobs_in is in reversed order, so bl_in starts from blobs_in[4],
        # i.e. the finest level (with the defaults k_max=6 and k_min=2)
        bl_in = blobs_in[k_max - lvl]
        sc = spatial_scales[k_max - lvl]  # in reversed order
        slvl = str(lvl)

        if lvl == k_min:
            # Create conv ops with randomly initialized weights and
            # zeroed biases for the first FPN level; these will be shared by
            # all other FPN levels
            # RPN hidden representation
            conv_rpn_fpn = model.Conv(
                bl_in,
                'conv_rpn_fpn' + slvl,
                dim_in,
                dim_out,
                kernel=3,
                pad=1,
                stride=1,
                weight_init=gauss_fill(0.01),
                bias_init=const_fill(0.0)
            )
            model.Relu(conv_rpn_fpn, conv_rpn_fpn)
            # Proposal classification scores
            rpn_cls_logits_fpn = model.Conv(
                conv_rpn_fpn,
                'rpn_cls_logits_fpn' + slvl,
                dim_in,
                num_anchors,
                kernel=1,
                pad=0,
                stride=1,
                weight_init=gauss_fill(0.01),
                bias_init=const_fill(0.0)
            )
            # Proposal bbox regression deltas
            rpn_bbox_pred_fpn = model.Conv(
                conv_rpn_fpn,
                'rpn_bbox_

 
 
              
           
              
              
            