YOLO3D End-to-end 3D Object Detection (Paper Notes)
YOLO3D: End-to-end real-time 3D Oriented Object Bounding Box Detection from LiDAR Point Cloud
Paper link
This paper applies YOLO to 3D object detection, reaching 40 fps on the KITTI dataset with a Titan X GPU.
The main contributions of this paper are:
1- Extending YOLO V2[3] to include orientation of the OBB as a direct regression task.
2- Extending YOLO V2[3] to include the height and 3D OBB center coordinates (x,y,z) as a direct regression task.
3- Real-time performance evaluation and experimentation with Titan X GPU, on the challenging KITTI benchmark, with recommendations of the best grid-map resolution, and operating IoU threshold that balances speed and accuracy.
Point Cloud Representation
First, the point cloud is projected into 2D bird's-eye-view grid maps. Two maps are created: in one, the value of each cell (pixel) is the maximum height of the points falling in it; in the other, the value of each cell (pixel) is the point density, which grows with the number of points in the cell. The density is computed in the same way as in the MV3D paper.
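A minimal NumPy sketch of this two-channel encoding (the grid extent and the 0.1 m/pixel resolution are illustrative assumptions; the density formula min(1, log(n + 1) / log(64)) is the one used in MV3D):

```python
import numpy as np

def pointcloud_to_bev(points, x_range=(0.0, 40.0), y_range=(-20.0, 20.0), res=0.1):
    """Project an (N, 3) point cloud onto a 2-channel bird's-eye-view grid.

    Channel 0: maximum point height per cell (0 for empty cells).
    Channel 1: point density per cell, min(1, log(n + 1) / log(64)) as in MV3D.
    """
    h = int((x_range[1] - x_range[0]) / res)
    w = int((y_range[1] - y_range[0]) / res)
    height_map = np.full((h, w), -np.inf, dtype=np.float64)
    counts = np.zeros((h, w), dtype=np.int64)

    # Keep only points that fall inside the grid extent.
    keep = ((points[:, 0] >= x_range[0]) & (points[:, 0] < x_range[1]) &
            (points[:, 1] >= y_range[0]) & (points[:, 1] < y_range[1]))
    pts = points[keep]

    rows = ((pts[:, 0] - x_range[0]) / res).astype(np.int64)
    cols = ((pts[:, 1] - y_range[0]) / res).astype(np.int64)
    np.maximum.at(height_map, (rows, cols), pts[:, 2])  # per-cell max height
    np.add.at(counts, (rows, cols), 1)                  # per-cell point count

    height_map[counts == 0] = 0.0
    density_map = np.minimum(1.0, np.log(counts + 1) / np.log(64))
    return np.stack([height_map, density_map], axis=0)
```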
Yaw Angle Regression
The yaw angle of the predicted box ranges from -π to π and is normalized to [-1, 1]; the loss is computed with mean squared error.
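A minimal sketch of this normalization and loss (function names are my own):

```python
import math

def normalize_yaw(yaw):
    """Map a yaw angle in [-pi, pi] to [-1, 1] for regression."""
    return yaw / math.pi

def yaw_loss(pred, target_yaw):
    """Squared error between the predicted (already normalized) yaw
    and the normalized ground-truth yaw angle."""
    return (pred - normalize_yaw(target_yaw)) ** 2
```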
3D Bounding Box Regression
This part is the same as in YOLO v2, just extended to three dimensions. The only thing to note is that the height z is mapped to just one grid cell rather than to every cell the way x and y are, because object heights differ little and their variability is very small.
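The YOLO v2 decoding extended to 3D can be sketched like this (variable names and the exact parameterization of z are my assumptions, not taken verbatim from the paper):

```python
import math

def decode_3d_box(t, grid_x, grid_y, anchor_w, anchor_l, anchor_h, cell_size):
    """Decode raw network outputs t = (tx, ty, tz, tw, tl, th) into a 3D box.

    x, y follow YOLO v2: a sigmoid offset inside the cell plus the cell index.
    z is regressed directly (a single mapping), since object heights vary little.
    w, l, h scale the mean-size anchor exponentially, as in YOLO v2.
    """
    tx, ty, tz, tw, tl, th = t
    sigmoid = lambda v: 1.0 / (1.0 + math.exp(-v))
    x = (grid_x + sigmoid(tx)) * cell_size
    y = (grid_y + sigmoid(ty)) * cell_size
    z = tz  # direct regression, no per-cell offset
    w = anchor_w * math.exp(tw)
    l = anchor_l * math.exp(tl)
    h = anchor_h * math.exp(th)
    return x, y, z, w, l, h
```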
Anchors Calculation
In YOLO v2, k-means clustering produces anchors of many different sizes; such priors cover the full range of boxes that may appear in the data, so boxes of different sizes can detect objects of different sizes. However, car sizes are relatively fixed, so this paper does not use k-means clustering to generate priors of different sizes; instead, the mean of the 3D box dimensions is used as the anchor size.
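Computing this single mean-size prior is straightforward (the (w, l, h) array layout is an assumption):

```python
import numpy as np

def mean_anchor(boxes):
    """Given an (N, 3) array of ground-truth (w, l, h) box dimensions,
    return their mean as the single 3D anchor size."""
    return np.asarray(boxes, dtype=np.float64).mean(axis=0)
```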
Combined Loss for 3D OBB
The overall loss adds a few extra terms for the new dimensions; everything else is handled the same way.
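A simplified sketch of how the extra regression terms extend the localization part of the loss for one matched box (the key names, the shared lambda_coord weight, and summing plain squared errors are my own assumptions; confidence and class terms are omitted):

```python
def obb_regression_loss(pred, target, lambda_coord=5.0):
    """Localization terms of the extended loss for one matched box.

    pred/target: dicts with keys x, y, z, w, l, h, yaw, where yaw is
    already normalized to [-1, 1]. lambda_coord follows the YOLO
    convention of up-weighting localization terms.
    """
    center = sum((pred[k] - target[k]) ** 2 for k in ("x", "y", "z"))
    size = sum((pred[k] - target[k]) ** 2 for k in ("w", "l", "h"))
    yaw = (pred["yaw"] - target["yaw"]) ** 2
    return lambda_coord * (center + size + yaw)
```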
Network Architecture and Hyper Parameters
Some changes to the network architecture compared with YOLO v2:
- We modified one max-pooling layer to change the down-sampling from 32 to 16 so we can have a larger grid at the end; this contributes to detecting small objects like pedestrians and cyclists.
- We removed the skip connection from the model, as we found it resulted in less accurate results.
- We added terms in the loss function for yaw, z center coordinate, and height regressions to facilitate the 3D oriented bounding box detection.
- Our input consists of 2 channels, one representing the maximum height and the other representing the density of points in the point cloud, computed as shown in Eq. (1).
KITTI Results and Error Analysis
For cars, performance is very good at an IoU threshold of 0.5; above 0.5, performance drops sharply as the threshold increases, showing that it is hard to align boxes perfectly with objects, a problem common to YOLO models.
Inference time grows significantly as image resolution increases: for example, going from 0.15 m/pixel to 0.1 m/pixel roughly doubles the inference time.