Paper Review: FINN: A Framework for Fast, Scalable Binarized Neural Network Inference

阿新 • Published: 2018-12-11

FINN: A Framework for Fast, Scalable Binarized Neural Network Inference
FINN: a framework for building high-performance, scalable binarized neural network inference engines

Basic Information

Publication date: December 2016
Main authors: Yaman Umuroglu; Nicholas J. Fraser; Giulio Gambardella
Affiliations: Xilinx Research Labs; Norwegian University of Science and Technology; University of Sydney

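Main Content

Motivation

Running deep neural networks on CPUs or GPUs is computationally expensive: models routinely carry hundreds of MB (megabytes) of parameters and need several or even tens of GFLOPs of computation. Some studies have shown that trained neural network models contain redundancy, and one form of it is precision redundancy, which has led to low-precision networks and even binarized networks.
FPGAs are particularly well suited to computing on and storing binary data (low-precision arithmetic and small memory footprint), and can reach TOPS-level performance.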

Technical Background

CNN
BNN
Four architectures for hardware implementation of NNs:
a single processing engine, usually in the form of a systolic array
a streaming architecture, consisting of one processing engine per network layer
a vector processor with instructions specific to accelerating the primitive operations of convolutions
a neurosynaptic processor

Target Applications

Embedded systems that require real-time processing

Main Contributions

Quantification of peak performance for BNNs on FPGAs using a roofline model.
A set of novel optimizations for mapping BNNs onto FPGAs more efficiently.
A BNN architecture and accelerator construction tool, permitting customization of throughput.
A range of prototypes that demonstrate the potential of BNNs on an off-the-shelf FPGA platform.

Core Design

A framework for mapping BNNs onto a flexible heterogeneous streaming architecture
Main elements of the design:

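architecture design: the structure by which a BNN is mapped onto the FPGA
BNN-specific operator optimization
popcount for accumulation: accumulation is implemented by counting set bits
Batchnorm-activation as threshold: batch normalization and activation are implemented as a threshold comparison
Boolean OR for max-pooling: max-pooling is implemented with a Boolean OR
design flow
hardware library implementation
Matrix-Vector-Threshold Unit (MVTU): implements the matrix-vector product
the sliding window unit for convolution: a unit that arranges input feature map data for the convolution operation
pooling unit: OR logic + streaming buffer
Folding: network folding; the main target of folding is the MVTU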
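
The three BNN-specific operator optimizations listed above can be sketched in software to make the arithmetic concrete. This is a minimal sketch and not the paper's hardware implementation. It assumes the usual encoding of bit 1 for +1 and bit 0 for -1, a positive batchnorm scale (gamma > 0), and sigma already including any epsilon term; all function and parameter names are made up for illustration.

```python
import numpy as np

def xnor_popcount_dot(w_bits, x_bits):
    """Dot product of two {-1,+1} vectors stored as {0,1} bits.

    XNOR marks the positions where the two signs agree and popcount
    counts them. With n elements and p agreements the signed sum is
    p - (n - p) = 2p - n.
    """
    agree = np.logical_not(np.logical_xor(w_bits, x_bits))  # XNOR
    p = int(np.count_nonzero(agree))                        # popcount
    return 2 * p - w_bits.size

def bn_threshold(mu, sigma, gamma, beta):
    """Threshold t such that gamma * (a - mu) / sigma + beta >= 0 iff a >= t.

    Assumption: gamma > 0 (a negative gamma would flip the comparison).
    """
    return mu - beta * sigma / gamma

def threshold_activation(acc, threshold):
    """Batchnorm followed by a sign activation, folded into one comparison."""
    return 1 if acc >= threshold else 0

def or_max_pool(bits):
    """Max over {0,1} values is the same as a Boolean OR over them."""
    return int(np.any(bits))

# Small usage example with hypothetical values.
w = np.array([1, 0, 1, 1], dtype=np.uint8)   # encodes +1, -1, +1, +1
x = np.array([1, 1, 0, 1], dtype=np.uint8)   # encodes +1, +1, -1, +1
acc = xnor_popcount_dot(w, x)
t = bn_threshold(mu=1.0, sigma=2.0, gamma=0.5, beta=-0.25)
out = threshold_activation(acc, t)
pooled = or_max_pool(np.array([0, 0, 1, 0], dtype=np.uint8))
```

The point of the threshold form is that the accelerator never has to evaluate the batchnorm expression at inference time: the threshold can be computed once offline per neuron, leaving a single comparison in hardware.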

FPGA-Based BNN Performance and Accuracy Evaluation

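On performance, the authors present a comparison case that uses an FPGA roofline model to evaluate a binarized network and an 8-bit network, both based on the AlexNet structure. In this case the two networks need the same number of operations to classify one image, 1.4 GOPs, but their parameter sizes differ: the binarized network has only 7.4 MB of parameters, while the 8-bit network needs 50 MB. Under the FPGA roofline model the binarized network has a clear advantage: first, the peak performance of binary operations, 66 TOPS, is 16 times that of 8-bit arithmetic; second, the binarized network ...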
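
The roofline comparison above can be reproduced with a few lines of arithmetic. The sketch below uses the figures quoted in this review (1.4 GOPs per image, 7.4 MB vs 50 MB of parameters, 66 TOPS binary peak, 8-bit peak 16 times lower). The off-chip bandwidth value is a hypothetical placeholder, not a number from the review, and the sketch assumes every parameter is fetched from off-chip memory exactly once per image and that 1 MB is 10^6 bytes.

```python
def arithmetic_intensity(ops_per_image, param_bytes):
    """Operations performed per byte of parameters fetched."""
    return ops_per_image / param_bytes

def roofline(peak_ops_per_s, bandwidth_bytes_per_s, intensity):
    """Attainable throughput: the lower of the compute roof and the memory roof."""
    return min(peak_ops_per_s, bandwidth_bytes_per_s * intensity)

OPS_PER_IMAGE = 1.4e9          # same for both networks
BINARY_PARAM_BYTES = 7.4e6     # binarized AlexNet-style network
INT8_PARAM_BYTES = 50e6        # 8-bit AlexNet-style network
BINARY_PEAK = 66e12            # peak binary ops per second
INT8_PEAK = BINARY_PEAK / 16   # 8-bit peak is 16x lower
ASSUMED_BANDWIDTH = 10e9       # hypothetical off-chip bandwidth in bytes/s

binary_ai = arithmetic_intensity(OPS_PER_IMAGE, BINARY_PARAM_BYTES)  # about 189 ops/byte
int8_ai = arithmetic_intensity(OPS_PER_IMAGE, INT8_PARAM_BYTES)      # 28 ops/byte

binary_attainable = roofline(BINARY_PEAK, ASSUMED_BANDWIDTH, binary_ai)
int8_attainable = roofline(INT8_PEAK, ASSUMED_BANDWIDTH, int8_ai)

print(f"binary: {binary_ai:.0f} ops/byte, attainable {binary_attainable / 1e12:.2f} TOPS")
print(f"8-bit:  {int8_ai:.0f} ops/byte, attainable {int8_attainable / 1e12:.2f} TOPS")
```

Because the operation count is identical while the binarized parameters are much smaller, the binarized network has a higher arithmetic intensity, so for any given bandwidth its memory roof sits higher, in addition to its higher compute roof.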
