GPU Acceleration in Matlab

阿新 • Published: 2018-12-31

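GPUs have developed rapidly in recent years and have gradually overtaken CPUs as the workhorse for highly parallel, multithreaded computation. Matlab is a widely used mathematical software package, and this post explains how to use GPU acceleration in Matlab.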

0. Prerequisites

Two conditions must be met before GPU acceleration can be used in Matlab:

The computer has an NVIDIA graphics card installed; AMD and Intel graphics cards are not currently supported;
The NVIDIA graphics driver is installed.
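Both conditions can be checked from inside Matlab. The following is a minimal sketch, assuming Parallel Computing Toolbox is installed (it provides `gpuDeviceCount` and `gpuDevice`); the messages it prints are illustrative only.

```matlab
% Sketch: check whether Matlab can see a usable GPU.
% Assumption: Parallel Computing Toolbox is installed.
n = gpuDeviceCount;              % number of GPU devices Matlab detects
if n > 0
    d = gpuDevice;               % query the currently selected GPU
    fprintf('Using GPU: %s (compute capability %s)\n', ...
            d.Name, d.ComputeCapability);
else
    disp('No supported GPU found; computations will stay on the CPU.');
end
```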

1. Transferring data to the GPU

1.1 Copying CPU data to the GPU

To compute on the GPU, you only need to copy the data from CPU memory to the GPU.

G = gpuArray(M);

The line above stores the GPU copy under a new name; you can also assign it back to the original variable.

M = gpuArray(M);
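The transfer can be sketched end to end as follows. The matrix `magic(4)` is just example data, and the CPU fallback branch is an assumption added so that the snippet also runs on a machine without a supported GPU.

```matlab
% Sketch: copy a CPU matrix to the GPU and inspect the result.
M = magic(4);                    % ordinary array in CPU memory (example data)
if gpuDeviceCount > 0
    G = gpuArray(M);             % G lives in GPU memory; M is unchanged
else
    G = M;                       % assumption: no GPU, keep working on the CPU
end
disp(class(G));                  % 'gpuArray' if the copy happened, else 'double'
disp(size(G));                   % the transfer preserves the size
```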

1.2 Creating data directly on the GPU:

A = zeros(10, 'gpuArray');

Matrices of zeros and ones can be created directly on the GPU; you only need to append 'gpuArray' as the last argument of the call.

r = gpuArray.rand(1, 100) % one row, one hundred columns

This generates a random matrix directly on the GPU.
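The same trailing-argument form works for several creation functions. The sketch below assumes Parallel Computing Toolbox; the cell array `where` is a hypothetical helper that lets the same lines run on the CPU when no GPU is present.

```matlab
% Sketch: create arrays directly on the GPU with a trailing 'gpuArray' argument.
if gpuDeviceCount > 0
    where = {'gpuArray'};        % extra argument that places the array on the GPU
else
    where = {};                  % assumption: no GPU, create ordinary CPU arrays
end
Z = zeros(3, 4, where{:});             % 3-by-4 matrix of zeros
O = ones(3, 4, 'single', where{:});    % single-precision matrix of ones
R = rand(1, 100, where{:});            % one row, one hundred columns
disp(class(R));                        % 'gpuArray' when a GPU was used
```

Creating data on the GPU avoids the host-to-device copy that `gpuArray(M)` performs.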

2. Computing on the GPU

Basic operations run on the GPU as usual, with the same syntax as ordinary matrix computations.

A=abs(A);

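To see which operations are available, run the command

methods(gpuArray)

The full list of operations that Matlab can run on the GPU is given in the appendix, which reproduces Matlab's output for this command.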
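As an illustration, the sketch below chains a few functions that appear in that list (`sin`, `cos`, `sum`, `fft`, `abs`). The input vector is example data, and the CPU fallback is an assumption so the snippet runs without a GPU.

```matlab
% Sketch: once the input is a gpuArray, ordinary functions run on the GPU.
x = linspace(0, 2*pi, 1024);     % example data, created on the CPU
if gpuDeviceCount > 0
    x = gpuArray(x);             % from here on, results stay on the GPU
end
y = sin(x).^2 + cos(x).^2;       % element-wise operations
s = sum(y);                      % reduction
F = abs(fft(y));                 % FFT followed by magnitude
disp(class(F));                  % 'gpuArray' when a GPU was used
```

Results of operations on a gpuArray are themselves gpuArrays, so intermediate values do not travel back to the CPU between steps.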

3. Returning GPU data to the CPU

B = gather(A);

The command above copies the data from the GPU back to CPU memory.
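Putting the three steps together (transfer, compute, gather) might look like the sketch below. The input matrix is example data, and the `isa` check is an assumption so that the same code runs on a machine without a GPU.

```matlab
% Sketch: full round trip CPU -> GPU -> CPU.
A = [-1 2; -3 4];                % example data on the CPU
if gpuDeviceCount > 0
    A = gpuArray(A);             % step 1: copy to the GPU
end
A = abs(A);                      % step 2: compute (on the GPU if A is a gpuArray)
if isa(A, 'gpuArray')
    B = gather(A);               % step 3: copy the result back to CPU memory
else
    B = A;                       % already on the CPU
end
disp(class(B));                  % B is an ordinary double array
```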

4. Tips

4.1 If there is no NVIDIA graphics card or driver

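Without an NVIDIA graphics card or its driver, Matlab shows an error message when you try to use the GPU (the original screenshot of the message is not available).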
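One way to keep a script usable on such a machine is to catch the error and continue on the CPU. This is a sketch of that pattern, not something the original post prescribes; the flag `onGPU` and the warning text are hypothetical.

```matlab
% Sketch: fall back to the CPU when gpuArray fails (no GPU, no driver,
% or no Parallel Computing Toolbox).
M = rand(4);                     % example data
try
    M = gpuArray(M);             % throws an error if the GPU cannot be used
    onGPU = true;
catch err
    warning('GPU unavailable, staying on the CPU: %s', err.message);
    onGPU = false;
end
R = abs(M - 0.5);                % same code path on either device
if onGPU
    R = gather(R);               % bring the result back to CPU memory
end
```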

4.2 Prefer single precision over double precision where possible

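Where the problem allows it, convert double-precision data to single precision for the computation. GPUs compute markedly faster in single precision than in double precision, so this can cut the run time substantially.
Note: differences between single and double precision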

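Data type	Size (bytes)	Range of magnitudes (approx.)	Significant decimal digits
Single precision	4 bytes (32 bits)	1.18E-38 to 3.40E+38	about 7
Double precision	8 bytes (64 bits)	2.23E-308 to 1.80E+308	about 16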
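The figures in the table can be inspected from Matlab itself, and the conversion is a single call to `single`. The sketch below uses a random 1000-by-1000 matrix as example data; the GPU part is guarded so it only runs when a GPU is present, and no speedup figure is claimed because it depends on the hardware.

```matlab
% Sketch: compare double and single precision, then compute in single on the GPU.
A = rand(1000);                  % double precision by default
S = single(A);                   % 4 bytes per element instead of 8
infoA = whos('A');
infoS = whos('S');
fprintf('double: %d bytes, single: %d bytes\n', infoA.bytes, infoS.bytes);
fprintf('eps:     double %g, single %g\n', eps('double'), eps('single'));
fprintf('realmax: double %g, single %g\n', realmax('double'), realmax('single'));
if gpuDeviceCount > 0
    Sg = gpuArray(S);            % single-precision data on the GPU
    P = Sg * Sg;                 % matrix multiply carried out in single precision
    disp(class(gather(P)));      % the gathered result is still single
end
```

The trade-off is accuracy: single precision keeps only about 7 significant digits, so it suits computations that tolerate that rounding error.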

Appendix

>> methods(gpuArray)

Methods for class gpuArray:

abs                   eq                    ipermute              quiver3               
accumarray            erf                   iradon                rad2deg               
acos                  erfc                  isaUnderlying         radon                 
acosd                 erfcinv               isbanded              rdivide               
acosh                 erfcx                 isdiag                real                  
acot                  erfinv                isempty               reallog               
acotd                 errorbar              isequal               realpow               
acoth                 existsOnGPU           isequaln              realsqrt              
acsc                  exp                   isequalwithequalnans  reducepatch           
acscd                 expint                isfinite              reducevolume          
acsch                 expm                  isfloat               regionprops           
all                   expm1                 ishermitian           rem                   
and                   eye                   isinf                 repelem               
angle                 ezcontour             isinteger             repmat                
any                   ezcontourf            islogical             reshape               
applylut              ezgraph3              ismember              rgb2gray              
area                  ezmesh                ismembertol           rgb2hsv               
arrayfun              ezmeshc               isnan                 rgb2ycbcr             
asec                  ezplot                isnumeric             ribbon                
asecd                 ezplot3               isocaps               roots                 
asech                 ezpolar               isocolors             rose                  
asin                  ezsurf                isonormals            rot90                 
asind                 ezsurfc               isosurface            round                 
asinh                 factorial             isreal                scatter               
assert                false                 issorted              scatter3              
atan                  feather               issparse              sec                   
atan2                 fft                   issymmetric           secd                  
atan2d                fft2                  istril                sech                  
atand                 fftfilt               istriu                semilogx              
atanh                 fftn                  kmeans                semilogy              
bandwidth             fill                  knnsearch             setdiff               
bar                   fill3                 ldivide               setxor                
bar3                  filter                le                    shiftdim              
bar3h                 filter2               legendre              shrinkfaces           
barh                  find                  length                sign                  
besselj               fix                   line                  sin                   
bessely               flip                  linspace              sind                  
beta                  flipdim               log                   single                
betainc               fliplr                log10                 sinh                  
betaincinv            flipud                log1p                 size                  
betaln                floor                 log2                  slice                 
bicg                  fplot                 logical               smooth3               
bicgstab              fprintf               loglog                sort                  
bicgstabl             full                  logspace              sortrows              
bitand                gamma                 lsqr                  sparse                
bitcmp                gammainc              lt                    spfun                 
bitget                gammaincinv           lu                    spones                
bitor                 gammaln               mat2gray              sprand                
bitset                gather                mat2str               sprandn               
bitshift              ge                    max                   sprandsym             
bitxor                gmres                 mean                  sprintf               
bsxfun                gop                   medfilt2              spy                   
bwdist                gpuArray              mesh                  sqrt                  
bwlabel               gradient              meshc                 stairs                
bwlookup              gt                    meshgrid              std2                  
bwmorph               head                  meshz                 stdfilt               
cast                  hist                  min                   stem                  
cat                   histc                 minres                stem3                 
cconv                 histcounts            minus                 stream2               
cdf2rdf               histeq                mldivide              stream3               
ceil                  histogram             mod                   streamline            
cgs                   horzcat               mode                  streamparticles       
chol                  hsv2rgb               movmean               streamribbon          
circshift             hypot                 movstd                streamslice           
clabel                idivide               movsum                streamtube            
classUnderlying       ifft                  movvar                stretchlim            
comet                 ifft2                 mpower                sub2ind               
comet3                ifftn                 mrdivide              subsasgn              
compass               im2double             mtimes                subsindex             
complex               im2int16              nan                   subspace              
cond                  im2single             ndgrid                subsref               
coneplot              im2uint16             ndims                 subvolume             
conj                  im2uint8              ne                    sum                   
contour               imabsdiff             nextpow2              superiorfloat         
contour3              imadjust              nnz                   surf                  
contourc              imag                  nonzeros              surfc                 
contourf              image                 norm                  surfl                 
contourslice          imagesc               normest               svd                   
conv                  imbothat              normxcorr2            svds                  
conv2                 imclose               not                   swapbytes             
convn                 imcomplement          nthroot               symmlq                
corr2                 imdilate              null                  tail                  
corrcoef              imerode               num2str               tan                   
cos                   imfill                numel                 tand                  
cosd                  imfilter              nzmax                 tanh                  
cosh                  imgaussfilt           ones                  tfqmr                 
cot                   imgaussfilt3          or                    times                 
cotd                  imgradient            padarray              transpose             
coth                  imgradientxy          pagefun               trapz                 
cov                   imhist                pareto                tril                  
csc                   imlincomb             patch                 trimesh               
cscd                  imnoise               pcg                   trisurf               
csch                  imopen                pcolor                triu                  
ctranspose            imreconstruct         pdist                 true                  
cummax                imregdemons           pdist2                typecast              
cummin                imregionalmax         permute               uint16                
cumprod               imregionalmin         pie                   uint32                
cumsum                imresize              pie3                  uint64                
curl                  imrotate              planerot              uint8                 
deg2rad               imrotate_old          plot                  uminus                
del2                  imshow                plot3                 union                 
det                   imtophat              plotmatrix            unique                
detectFASTFeatures    ind2sub               plotyy                uniquetol             
detectHarrisFeatures  inf                   plus                  unwrap                
detrend               inpolygon             polar                 uplus                 
diag                  int16                 poly                  var                   
diff                  int2str               polyder               vertcat               
discretize            int32                 polyfit               vissuite              
disp                  int64                 polyval               volumebounds          
display               int8                  polyvalm              voronoi               
divergence            interp1               pow2                  waterfall             
dot                   interp2               power                 xcorr                 
double                interp3               prod                  xor                   
edge                  interpn               psi                   ycbcr2rgb             
eig                   interpstreamspeed     qmr                   zeros                 
end                   intersect             qr                    
eps                   inv                   quiver                

Static methods:

colon                 rand                  randperm              
freqspace             randi                 speye                 
loadobj               randn
