YOLO V2 Code Analysis

阿新 • Published: 2018-07-02


3.3 The passthrough operation

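Reorg layer analysis: the ReorgLayer takes a $26 \times 26 \times 512$ tensor, splits each $26 \times 26$ plane into four $13 \times 13$ planes, and concatenates them along the channel axis, so the original 512 channels become 2048.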

# darknet.py
        self.reorg = ReorgLayer(stride=2)   # stride*stride times the channels of conv1s

# reorg_layer.py
    def forward(self, x):
        stride = self.stride

        bsize, c, h, w = x.size()
        # The output is stride times smaller in each spatial dimension
        # and has stride*stride times as many channels.
        out_w, out_h, out_c = int(w / stride), int(h / stride), c * (stride * stride)
        out = torch.FloatTensor(bsize, out_c, out_h, out_w)

        # The seventh argument is the `forward` flag of the C routine; it is 0 here,
        # so the routine takes its else branch (expand channels).
        if x.is_cuda:
            out = out.cuda()
            reorg_layer.reorg_cuda(x, out_w, out_h, out_c, bsize, stride, 0, out)
        else:
            reorg_layer.reorg_cpu(x, out_w, out_h, out_c, bsize, stride, 0, out)

        return out
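The shape arithmetic in `forward` can be checked on its own. The following is only a minimal sketch: `reorg_output_shape` is a hypothetical helper that is not part of the repository, and it assumes that `h` and `w` are divisible by `stride`.

```python
def reorg_output_shape(bsize, c, h, w, stride):
    """Return the (batch, channels, height, width) shape produced by the reorg layer.

    Hypothetical helper for illustration; assumes h and w are divisible by stride.
    """
    out_w = w // stride
    out_h = h // stride
    out_c = c * stride * stride
    return (bsize, out_c, out_h, out_w)


# A 26 x 26 feature map with 512 channels and stride 2
print(reorg_output_shape(1, 512, 26, 26, 2))
```

With `stride=2` the spatial size is halved in each dimension and the channel count is multiplied by 4, which is where the 2048 channels come from.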

// reorg_cpu.c
int reorg_cpu(THFloatTensor *x_tensor, int w, int h, int c, int batch, int stride, int forward, THFloatTensor *out_tensor)
{
    // Grab the tensor
    float * x = THFloatTensor_data(x_tensor);
    float * out = THFloatTensor_data(out_tensor);

    // https://github.com/pjreddie/darknet/blob/master/src/blas.c
    int b,i,j,k;
    int out_c = c/(stride*stride);

    for(b = 0; b < batch; ++b){
        // batch index
        for(k = 0; k < c; ++k){
            // channel
            for(j = 0; j < h; ++j){
                // height
                for(i = 0; i < w; ++i){
                    // width
                    int in_index  = i + w*(j + h*(k + c*b));
                    int c2 = k % out_c;
                    int offset = k / out_c;
                    int w2 = i*stride + offset % stride;
                    int h2 = j*stride + offset / stride;
                    int out_index = w2 + w*stride*(h2 + h*stride*(c2 + out_c*b));
                    if(forward) out[out_index] = x[in_index]; // compress channels
                    else out[in_index] = x[out_index];        // expand channels
                }
            }
        }
    }

    return 1;
}
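To see where individual elements end up, the index arithmetic of `reorg_cpu` can be traced in plain Python on a tiny input. This is a sketch, not the repository's code: `reorg_flat` is a hypothetical name, tensors are represented as flat row-major lists, and `w`, `h`, `c` are the output dimensions, as in the Python `forward` shown earlier.

```python
def reorg_flat(x, w, h, c, batch, stride, forward):
    """Pure-Python trace of the reorg_cpu index arithmetic on flat lists.

    Hypothetical helper for illustration. w, h, c describe the small,
    many-channel tensor; x and the result both hold batch*c*h*w values.
    """
    out = [0] * (batch * c * h * w)
    out_c = c // (stride * stride)
    for b in range(batch):
        for k in range(c):
            for j in range(h):
                for i in range(w):
                    in_index = i + w * (j + h * (k + c * b))
                    c2 = k % out_c
                    offset = k // out_c
                    w2 = i * stride + offset % stride
                    h2 = j * stride + offset // stride
                    out_index = w2 + w * stride * (h2 + h * stride * (c2 + out_c * b))
                    if forward:
                        out[out_index] = x[in_index]   # compress channels
                    else:
                        out[in_index] = x[out_index]   # expand channels
    return out


# One 4x4 plane holding 0..15, reorganised into four 2x2 planes (stride 2).
plane = list(range(16))
expanded = reorg_flat(plane, 2, 2, 4, 1, 2, 0)
print(expanded)
# Calling it again with forward=1 undoes the rearrangement.
print(reorg_flat(expanded, 2, 2, 4, 1, 2, 1))
```

In this small case each output plane collects the pixels that share the same position inside every 2x2 block of the input, and the `forward=1` call restores the original layout.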

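Note: the figure contains an error that still needs to be corrected. Input points 1 and 3 land on the first output feature map, input points 2 and 4 land on the second output feature map, and idx2 should be followed by +w2.

In the figure below, right to left is the direction of the forward computation, and left to right is the direction of the backward (gradient) computation.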

[Figure: reorg layer index mapping]

3.4 Computing the objective function

# darknet.py
    def loss(self):
        # The loss is the sum of three separate error terms:
        # predicted bbox, predicted IoU, and classification.
        return self.bbox_loss + self.iou_loss + self.cls_loss

    def forward(self, im_data, gt_boxes=None, gt_classes=None, dontcare=None):
        conv1s = self.conv1s(im_data)
        conv2 = self.conv2(conv1s)
        conv3 = self.conv3(conv2)
        conv1s_reorg = self.reorg(conv1s)
        cat_1_3 = torch.cat([conv1s_reorg, conv3], 1)
        conv4 = self.conv4(cat_1_3)
        conv5 = self.conv5(conv4)   # batch_size, out_channels, h, w
        # ……
        # ……
        # tx, ty, tw, th, to -> sig(tx), sig(ty), exp(tw), exp(th), sig(to)
        # predict tx, ty
        xy_pred = F.sigmoid(conv5_reshaped[:, :, :, 0:2])
        # predict tw, th
        wh_pred = torch.exp(conv5_reshaped[:, :, :, 2:4])
        bbox_pred = torch.cat([xy_pred, wh_pred], 3)
        # predict the confidence to
        iou_pred = F.sigmoid(conv5_reshaped[:, :, :, 4:5])
        # predict the class scores
        score_pred = conv5_reshaped[:, :, :, 5:].contiguous()
        prob_pred = F.softmax(score_pred.view(-1, score_pred.size()[-1])).view_as(score_pred)

        # for training
        if self.training:
            bbox_pred_np = bbox_pred.data.cpu().numpy()
            iou_pred_np = iou_pred.data.cpu().numpy()
            _boxes, _ious, _classes, _box_mask, _iou_mask, _class_mask = self._build_target(
                                         bbox_pred_np, gt_boxes, gt_classes, dontcare, iou_pred_np)
            _boxes = net_utils.np_to_variable(_boxes)
            _ious = net_utils.np_to_variable(_ious)
            _classes = net_utils.np_to_variable(_classes)
            box_mask = net_utils.np_to_variable(_box_mask, dtype=torch.FloatTensor)
            iou_mask = net_utils.np_to_variable(_iou_mask, dtype=torch.FloatTensor)
            class_mask = net_utils.np_to_variable(_class_mask, dtype=torch.FloatTensor)

            num_boxes = sum((len(boxes) for boxes in gt_boxes))

            # _boxes[:, :, :, 2:4] = torch.log(_boxes[:, :, :, 2:4])
            box_mask = box_mask.expand_as(_boxes)
            # bbox loss, averaged over the number of ground-truth boxes
            self.bbox_loss = nn.MSELoss(size_average=False)(bbox_pred * box_mask, _boxes * box_mask) / num_boxes
            # IoU loss, averaged over the number of ground-truth boxes
            self.iou_loss = nn.MSELoss(size_average=False)(iou_pred * iou_mask, _ious * iou_mask) / num_boxes
            # classification loss, averaged over the number of ground-truth boxes
            class_mask = class_mask.expand_as(prob_pred)
            self.cls_loss = nn.MSELoss(size_average=False)(prob_pred * class_mask, _classes * class_mask) / num_boxes

        return bbox_pred, iou_pred, prob_pred
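The two ideas in this listing, the output activations and the masked squared-error loss, can be illustrated without PyTorch. What follows is a minimal sketch in plain Python. `decode_prediction` and `masked_sse_loss` are hypothetical names that do not appear in the repository, and the sketch works on one anchor's raw output vector and on flat lists rather than on tensors.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def decode_prediction(raw):
    """Apply the activations to one raw vector [tx, ty, tw, th, to, class scores...].

    Hypothetical helper: returns (bbox, iou, probs) for a single anchor.
    """
    tx, ty, tw, th, to = raw[:5]
    bbox = [sigmoid(tx), sigmoid(ty), math.exp(tw), math.exp(th)]
    iou = sigmoid(to)
    scores = raw[5:]
    top = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]       # softmax over the class scores
    return bbox, iou, probs


def masked_sse_loss(pred, target, mask, num_boxes):
    """Sum of squared errors over masked entries, divided by the number of boxes.

    Mirrors the pattern MSELoss(size_average=False)(pred * mask, target * mask) / num_boxes.
    """
    sse = sum((p * m - t * m) ** 2 for p, t, m in zip(pred, target, mask))
    return sse / num_boxes


print(decode_prediction([0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]))
print(masked_sse_loss([1.0, 2.0, 3.0], [0.0, 2.0, 1.0], [1.0, 1.0, 0.0], 2))
```

Entries whose mask value is 0 contribute nothing to the sum, so only the positions selected by `_build_target` influence each loss term.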

Adapted from: 仙守
