Data Preprocessing Exercise (Deep Learning)

Posted by Axin • Published: 2019-02-19

Reposted from:

Author: tornadomeet. Source: http://www.cnblogs.com/tornadomeet. Reposting and sharing are welcome, but please credit the source. (Sina Weibo: tornadomeet; feel free to get in touch!)

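Introduction:

This post is an exercise in the data preprocessing steps that come before building a machine learning model (not only a deep learning one). The basics were already covered in the earlier post "Deep learning: 30 (techniques for data preprocessing)": preprocessing mostly comes down to normalization and whitening. Normalization is further divided into scale normalization, mean-variance normalization and so on, and the common whitening variants are PCA whitening and ZCA whitening.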
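The two normalization variants named above can be sketched on a toy matrix. This is only an illustration under stated assumptions: the data values and the variable names (`X_scaled`, `X_standard`) are made up for the example and do not come from the original post. Rows are features and columns are samples, which matches the column-per-sample layout used later.

```matlab
% Toy data: 3 features (rows) x 5 samples (columns); the values are made up.
X = [1 2 3 4 5; 10 20 30 40 50; -2 0 2 4 6];
num_samples = size(X, 2);

% Scale normalization: map each feature (row) linearly onto [0, 1].
row_min = min(X, [], 2);
row_max = max(X, [], 2);
X_scaled = (X - repmat(row_min, 1, num_samples)) ./ repmat(row_max - row_min, 1, num_samples);

% Mean-variance normalization: zero mean and unit standard deviation per feature.
mu = mean(X, 2);
sd = std(X, 0, 2);
X_standard = (X - repmat(mu, 1, num_samples)) ./ repmat(sd, 1, num_samples);
```

Scale normalization keeps the shape of each feature's distribution and only changes its range, while mean-variance normalization makes features with very different units comparable.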

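Background:

The ASL database used here can be downloaded online (the link is missing from this copy). Some basic facts about the database:

It is a library of static hand-gesture images for 24 letters (the gestures for j and z are dynamic, so they are left out). Every signer and every letter has both colour images and depth images. Training and test data together take about 2.2 GB. The data is stored as 8-bit integers, though, and is usually converted to floating point when processed in MATLAB, so in memory the full data set takes roughly 10 GB or more.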
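The growth from 8-bit integers to floating point can be checked directly: a double takes 8 bytes, eight times as much as a uint8. The following is a minimal sketch that uses an all-zero stand-in image rather than real data; the last lines estimate the memory needed for one signer's colour training matrix with the sizes allocated in img_preprocessing.m further down. The variable names are illustrative.

```matlab
img_uint8 = zeros(96, 96, 3, 'uint8');   % one 96x96 RGB image, 8 bits per value
img_double = double(img_uint8);          % the same image as 64-bit floating point

info_uint8 = whos('img_uint8');
info_double = whos('img_double');
fprintf('uint8: %d bytes, double: %d bytes, ratio: %d\n', ...
        info_uint8.bytes, info_double.bytes, info_double.bytes / info_uint8.bytes);

% One signer's colour training matrix: (96*96*3) rows x (250*24) columns of doubles.
bytes_matrix = 96*96*3 * 250*24 * 8;
fprintf('color_img_train: about %.2f GB\n', bytes_matrix / 2^30);
```

A single such matrix already needs more than 1 GB, which is why the listing below processes one signer at a time.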

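The gesture images were captured with a Kinect from 5 different people, with about 500 images per letter for each of the 24 letters, so there are roughly 24*5*500 = 60k colour images. This is only an approximate figure, since not every signer has exactly 500 images of every letter. There are as many depth images as colour images, so again about 60k. The authors of the database used half of the images for training and the other half for testing, with both colour and depth images. Each run therefore involves at least 30,000 images, each with thousands of dimensions, which is a fairly large amount of data.

I also found that the first colour image is missing in every folder of the database, so the colour images start from the second frame. Pairing them with the Kinect depth images therefore needs great care. Some of the data is also inconsistent: in some folders the number of depth images differs from the number of colour images. In addition, the original RGB images are 8-bit while the depth images are 16-bit. (As a reminder, file sizes are normally quoted in bytes, whereas transfer rates are normally quoted in bits.)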
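The different bit depths matter once the images are converted to floating point. `im2double` rescales by the maximum of the integer class: 255 for uint8 and 65535 for uint16. The sketch below reproduces that scaling by hand on two made-up pixel values (the values are hypothetical, not taken from the database), so it runs without any toolbox.

```matlab
rgb_pixel = uint8(200);       % a hypothetical 8-bit colour value
depth_pixel = uint16(200);    % the same integer stored as a 16-bit depth value

% Divide by the largest value the integer class can hold.
rgb_scaled = double(rgb_pixel) / double(intmax('uint8'));       % 200/255
depth_scaled = double(depth_pixel) / double(intmax('uint16'));  % 200/65535

fprintf('rgb: %.4f, depth: %.6f\n', rgb_scaled, depth_scaled);
```

The same raw integer therefore ends up on very different scales in the two image types, which is worth keeping in mind if colour and depth features are later combined.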

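Some sample images from the ASL database are shown below:

Some MATLAB notes:

In MATLAB, matrices of the same size and the same floating-point type can still take up different amounts of disk space when saved, because their contents (the element values) differ.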
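One likely reason is compression: version 7 MAT-files are compressed, so repetitive content shrinks far more than random content. A minimal sketch, assuming temporary files created with `tempname` (the file and variable names are only for illustration):

```matlab
a_const = ones(300, 300);     % highly repetitive content
a_rand = rand(300, 300);      % same size and type, but little redundancy

f_const = [tempname '.mat'];
f_rand = [tempname '.mat'];
save(f_const, 'a_const', '-v7');   % version 7 MAT-files are compressed
save(f_rand, 'a_rand', '-v7');

d_const = dir(f_const);
d_rand = dir(f_rand);
fprintf('ones: %d bytes on disk, rand: %d bytes on disk\n', d_const.bytes, d_rand.bytes);

% Remove the temporary files again.
delete(f_const);
delete(f_rand);
```

Both matrices occupy the same amount of memory, but the file holding the constant matrix is much smaller on disk.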

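For ordinary RGB images, imagesc and imshow behave almost the same; the main visible difference is that imagesc also displays the axis tick labels.

dir:

Lists the contents of a folder. If the listed folder contains a single subfolder, the result actually has at least 3 folder entries, because '.' and '..' (the current directory and its parent) are included as well.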
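Because of these two extra entries, loops over a `dir` result often start at index 3. That relies on '.' and '..' being listed first, which is not guaranteed for every set of names. A more defensive sketch filters them out by name; the temporary folder and the subfolder name 'A' are made up for the demonstration.

```matlab
root = tempname;                 % throwaway folder for the demonstration
mkdir(root);
mkdir(fullfile(root, 'A'));      % a single real subfolder

listing = dir(root);
fprintf('dir returned %d entries\n', numel(listing));

% Keep only real subfolders, independent of where '.' and '..' are listed.
is_real = [listing.isdir] & ~ismember({listing.name}, {'.', '..'});
subfolders = {listing(is_real).name};

rmdir(root, 's');                % clean up the temporary folder
```

With this filter the loop can run over `subfolders` directly instead of relying on a fixed start index.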

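load:

When load is used without parentheses (command syntax), it cannot take a variable as its argument; the file name has to be written out literally.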
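The difference between the two call styles can be sketched as follows, assuming a temporary file whose path is held in a variable named `mat_file` (a name chosen only for this example):

```matlab
x = magic(3);
mat_file = [tempname '.mat'];
save(mat_file, 'x');
clear x;

% Command syntax treats its argument as literal text, so
%   load mat_file
% would look for a file named 'mat_file', not for the path stored in the variable.

% Function syntax evaluates the variable first.
load(mat_file);            % restores x into the workspace
data = load(mat_file);     % or capture the contents in a struct instead

delete(mat_file);
```

Capturing the result in a struct also avoids silently overwriting workspace variables that share a name with something in the file.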

sparse:

The index arguments of this function must be positive integers, because negative numbers and 0 cannot be used as subscripts.
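A typical use where this matters is turning class labels into an indicator matrix: the labels serve as row subscripts, so they have to start at 1. A small sketch with made-up labels (the name `ground_truth` is illustrative):

```matlab
labels = [1 3 2 3];                    % class labels must start at 1, not 0
num_samples = numel(labels);

% Row index = class label, column index = sample number, value = 1.
ground_truth = full(sparse(labels, 1:num_samples, 1));
disp(ground_truth);

% sparse([0 1], [1 2], 1) would raise an error, because 0 is not a valid subscript.
```

If a data set uses 0-based labels, adding 1 before the call is the usual workaround.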

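Results:

This exercise implements the following 3 small preprocessing steps.

First: resize every image to 96*96. The provided images vary in size, so 96*96 was chosen as a rough middle value. Each image is then flattened into a column vector, and the columns of many images form a matrix. Because the images are used for both training and testing, they are split into two parts, following the database authors' method. Each part contains three kinds of data: RGB colour images, grayscale images and Kinect depth images. Since the data is large, each of the 5 signers is stored as a separate group. This gives 2*3*5 = 30 files of size-normalized images. Some of the files are shown below:

Second: since the training images will be used to train some deep learning model, local patches (10*10) have to be extracted as samples. There are 30,000 training images and 10 patches are taken from each, for a total of 300,000 patches.

Third: whiten these patch samples, using plain ZCA whitening.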
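Written out, the ZCA whitening used in the third step takes the following form. This is a restatement of what the patches_preprocessing.m listing below computes, with notation chosen here for illustration: $x^{(i)}$ is the $i$-th patch as a column vector, $m$ the number of patches, $s_j$ the $j$-th diagonal entry of $S$, and $\epsilon$ the regularization constant.

```latex
\begin{aligned}
\mu &= \frac{1}{m}\sum_{i=1}^{m} x^{(i)}, \\
\Sigma &= \frac{1}{m}\sum_{i=1}^{m}\bigl(x^{(i)}-\mu\bigr)\bigl(x^{(i)}-\mu\bigr)^{\top} = U S U^{\top}, \\
W_{\mathrm{ZCA}} &= U\,\operatorname{diag}\!\left(\frac{1}{\sqrt{s_j+\epsilon}}\right)U^{\top}, \\
x^{(i)}_{\mathrm{ZCA}} &= W_{\mathrm{ZCA}}\bigl(x^{(i)}-\mu\bigr).
\end{aligned}
```

The covariance of the whitened data is $U\,\operatorname{diag}\bigl(s_j/(s_j+\epsilon)\bigr)\,U^{\top}$, so it approaches the identity only when $\epsilon$ is small compared with the eigenvalues $s_j$.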

Main code with comments:

The 3 .m files below correspond to the 3 steps above.

img_preprocessing.m:

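%% data processing:
% translate the picture sets to the mat form
% Resize every image in the hand-gesture database to a common size (96*96 here), flatten it
% into one column, and collect the columns into a matrix. Each signer's data is stored
% separately (5 signers: A, B, C, D, E). For later experiments, three versions are saved:
% RGB colour images, grayscale images and depth images.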

%add the picture path
addpath c:/Data
addpath c:/Data/fingerspelling5
addpath c:/Data/fingerspellingmat5/
matdatapath = 'c:/Data/fingerspellingmat5/';

% Locations of the source images and of the output .mat files
img_root_path = 'c:/Data/fingerspelling5/';
mat_root_path = 'c:/Data/fingerspellingmat5/';

% Target size for the resized images
img_scale_width = 96;
img_scale_height = 96;

%% Convert the images to .mat data
img_who_path = dir(img_root_path); % dir lists the contents of a folder
if(img_who_path(1).isdir) % one subfolder per signer: A, B, C, ...
    length_img_who_path = length(img_who_path);
    for ii = 3:length_img_who_path % 3~7 (entries 1 and 2 are '.' and '..')
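        % Preallocate the buffers here. My machine has 8 GB of RAM, so everything for one signer is
        % read in one go; with less memory it is better to store the data in smaller pieces.
        % RGB images: training part and test part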
        color_img_train = zeros(img_scale_width*img_scale_height*3,250*24);
        color_label_train = zeros(250*24,1);
        color_img_test = zeros(img_scale_width*img_scale_height*3,250*24);
        color_label_test = zeros(250*24,1);
        % Grayscale images: training part and test part
        gray_img_train = zeros(img_scale_width*img_scale_height,250*24);
        gray_label_train = zeros(250*24,1);
        gray_img_test = zeros(img_scale_width*img_scale_height,250*24);
        gray_label_test = zeros(250*24,1);
        % Depth images: training part and test part
        depth_img_train = zeros(img_scale_width*img_scale_height,250*24);
        depth_label_train = zeros(250*24,1);
        depth_img_test = zeros(img_scale_width*img_scale_height,250*24);
        depth_label_test = zeros(250*24,1);
        
        img_which_path = dir([img_root_path img_who_path(ii).name '/']);
        if(img_which_path(1).isdir) % one subfolder per letter: a, b, c, ...
            length_img_which_path = length(img_which_path);
            for jj = 3:length_img_which_path % 3~26
                
               % List the RGB images (also the source of the grayscale images)
               color_img_set = dir([img_root_path img_who_path(ii).name '/' ...
                                img_which_path(jj).name '/color_*.png']); % RGB images under A/a/, ...
               % List the depth images
               depth_img_set = dir([img_root_path img_who_path(ii).name '/' ...
                                img_which_path(jj).name '/depth_*.png']); % depth images under A/a/, ...
                            
               assert(length(color_img_set) == length(depth_img_set),'the number of color images must agree with the number of depth images');
               img_num = length(color_img_set); % the RGB and depth image counts are equal
               assert(img_num >= 500, 'the number of rgb color images must be at least 500');
               img_father_path = [img_root_path img_who_path(ii).name '/'  img_which_path(jj).name '/'];
               for kk = 1:500
                   color_img_name = [img_father_path color_img_set(kk).name];          
                   depth_img_name = [img_father_path depth_img_set(kk).name];        
                   fprintf('Processing the image: %s and %s\n',color_img_name,depth_img_name);
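                   % Read the RGB image and derive the grayscale image; resize first, then convert to double
                   color_img = imresize(imread(color_img_name),[img_scale_height img_scale_width]);
                   gray_img = rgb2gray(color_img);
                   color_img = im2double(color_img);
                   gray_img = im2double(gray_img);
                   % Read the depth image
                   depth_img = imresize(imread(depth_img_name),[img_scale_height img_scale_width]);
                   depth_img = im2double(depth_img);
                   % Write the image data into the matrices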
                   if kk <= 250
                       color_img_train(:,(jj-3)*250+kk) =  color_img(:);
                       color_label_train((jj-3)*250+kk) = jj-2;
                       gray_img_train(:,(jj-3)*250+kk) =  gray_img(:);
                       gray_label_train((jj-3)*250+kk) = jj-2;
                       depth_img_train(:,(jj-3)*250+kk) = depth_img(:);
                       depth_label_train((jj-3)*250+kk) = jj-2;
                   else
                       color_img_test(:,(jj-3)*250+kk-250) = color_img(:);
                       color_label_test((jj-3)*250+kk-250) = jj-2;
                       gray_img_test(:,(jj-3)*250+kk-250) = gray_img(:);
                       gray_label_test((jj-3)*250+kk-250) = jj-2;
                       depth_img_test(:,(jj-3)*250+kk-250) = depth_img(:);
                       depth_label_test((jj-3)*250+kk-250) = jj-2;
                   end
               end              
            end                      
        end
        % Save the image data
        fprintf('Saving %s\n',[mat_root_path 'color_img_train_' img_who_path(ii).name '.mat']);
        save([mat_root_path 'color_img_train_' img_who_path(ii).name '.mat'], 'color_img_train','color_label_train');
        fprintf('Saving %s\n',[mat_root_path 'color_img_test_' img_who_path(ii).name '.mat']);
        save([mat_root_path 'color_img_test_' img_who_path(ii).name '.mat'] ,'color_img_test', 'color_label_test');
        fprintf('Saving %s\n',[mat_root_path 'gray_img_train_' img_who_path(ii).name '.mat']);
        save([mat_root_path 'gray_img_train_' img_who_path(ii).name '.mat'], 'gray_img_train','gray_label_train');
        fprintf('Saving %s\n',[mat_root_path 'gray_img_test_' img_who_path(ii).name '.mat']);
        save([mat_root_path 'gray_img_test_' img_who_path(ii).name '.mat'] ,'gray_img_test', 'gray_label_test'); 
        fprintf('Saving %s\n',[mat_root_path 'depth_img_train_' img_who_path(ii).name '.mat']);
        save([mat_root_path 'depth_img_train_' img_who_path(ii).name '.mat'], 'depth_img_train','depth_label_train');
        fprintf('Saving %s\n',[mat_root_path 'depth_img_test_' img_who_path(ii).name '.mat']);
        save([mat_root_path 'depth_img_test_' img_who_path(ii).name '.mat'] ,'depth_img_test', 'depth_label_test');        
        
        % Clear the variables to free memory
        clear color_img_train color_label_train color_img_test color_label_test...
        gray_img_train gray_label_train gray_img_test gray_label_test...
        depth_img_train depth_label_train depth_img_test depth_label_test;
    end
end
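The column layout produced by the listing above can be sketched on synthetic data: each image is stored as one column via `img(:)`, and the column index follows from the letter index and the position within that letter's 250 training images. The names `letter`, `sample` and `restored` are hypothetical and only used for this illustration.

```matlab
img = rand(96, 96, 3);                   % stand-in for one resized RGB image
column = img(:);                         % how one image is stored in the matrix
restored = reshape(column, [96 96 3]);   % undo the flattening, e.g. for display

% Column index used for the training matrices: 250 images per letter,
% letters stored one after another (the label equals jj-2 in the listing above).
letter = 3;                              % hypothetical: third letter
sample = 7;                              % hypothetical: seventh training image of that letter
col = (letter - 1) * 250 + sample;
fprintf('letter %d, sample %d is stored in column %d\n', letter, sample, col);
```

Because `reshape` and `(:)` both use column-major order, the round trip returns exactly the original image.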

sample_patches.m:

function patches = sample_patches(imgset, img_width, img_height, num_perimage, patch_size, channels)
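% sample_patches
% imgset: matrix in which each column holds the data of one image
% img_width: width of the image stored in each column
% img_height: height of the image stored in each column
% num_perimage: number of small patches sampled from each image
% patch_size: side length of a patch (patches are square, so one number is enough)
% channels: number of image channels (3 for RGB, 1 for grayscale or depth)

[n, m] = size(imgset); % n: dimension of one image, m: number of images
num_patches = num_perimage*m; % total number of patches to produce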

% Initialize patches with zeros: one column per patch, num_patches columns in total.
patches = zeros(patch_size*patch_size*channels, num_patches);

assert(n == img_width*img_height*channels, 'The images in imgset must agree with the given width, height and channels');


% Randomly sample num_perimage patches from each image
for imageNum = 1:m % loop over all images
     img = reshape(imgset(:,imageNum),[img_height img_width channels]);
     for patchNum = 1:num_perimage % sample num_perimage patches from this image
        xPos = randi([1,img_height-patch_size+1]);
        yPos = randi([1, img_width-patch_size+1]);
        patch = img(xPos:xPos+patch_size-1,yPos:yPos+patch_size-1,:);
        patches(:,(imageNum-1)*num_perimage+patchNum) = patch(:);
    end
end


 end

patches_preprocessing.m:

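% Extract training patches from the RGB colour images.
% Patches are 10*10 (only a default; the value can be changed),
% and 10 patches (also adjustable) are sampled from each image.
% Output: patches_without_whitening.mat holds the unwhitened patch matrix, one patch per column;
% patches_with_whitening.mat holds the ZCA-whitened patch matrix together with the mean vector
% mean_patches and the whitening matrix ZCAWhitening.

patch_size = 10;
num_per_img = 10; % number of patches sampled from each image
num_patches = 100000; % 300,000 patches are available, but that is too many, so only 100,000 are kept
epsilon = 0.1; % regularization term added in the denominator during whitening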

% Add the data folders to the path
addpath c:/Data
addpath c:/Data/fingerspelling5
addpath c:/Data/fingerspellingmat5/
matdatapath = 'c:/Data/fingerspellingmat5/';

% Load the colour training data of all 5 signers and sample patches from each set
signers = 'ABCDE';
patches = [];
for s = signers
    mat_name = ['color_img_train_' s '.mat'];
    fprintf('Loading the %s...\n', mat_name);
    data = load([matdatapath mat_name], 'color_img_train');
    fprintf('Sampling the patches from the color_img_train_%s set...\n', s);
    % Sample the patches and append them to the combined matrix
    patches = [patches, sample_patches(data.color_img_train,96,96,num_per_img,patch_size,3)]; %#ok<AGROW>
    clear data;
end
size_patches = size(patches); % 2-element vector [dimension, count]; the channels are already folded into the dimension
rand_patches = randi(size_patches(2), [1 num_patches]); % draw 100000 column indices at random (with replacement)
patches = patches(:, rand_patches);

% Save the raw patches as they are
fprintf('Saving the patches_without_whitening.mat...\n');
save([matdatapath 'patches_without_whitening.mat'], 'patches');

% ZCA-whiten the data
mean_patches = mean(patches,2); % mean of each dimension
patches = patches - repmat(mean_patches,[1 num_patches]); % subtract the mean from each dimension
sigma = (1./num_patches).*patches*patches';

[u, s, ~] = svd(sigma);
ZCAWhitening = u*diag(1./sqrt(diag(s)+epsilon))*u'; % ZCA whitening matrix: decorrelates the dimensions and roughly equalizes their variances
patches = ZCAWhitening*patches;

% Save the ZCA-whitened data, the mean column vector and the ZCAWhitening matrix
fprintf('Saving the patches_with_whitening.mat...\n');
save([matdatapath 'patches_with_whitening.mat'], 'patches', 'mean_patches', 'ZCAWhitening');


% %% The code below only tests why patches_with_whitening.mat and patches_without_whitening.mat differ so much in size.
% % Matrices of the same size and floating-point type can still take up different amounts of
% % disk space, because their contents differ.
% % Save ZCAWhitening on its own
% fprintf('Saving the zca_whitening.mat...\n');
% save([matdatapath 'zca_whitening.mat'], 'ZCAWhitening');
% 
% % Save mean_patches on its own
% fprintf('Saving the mean_patches.mat...\n');
% save([matdatapath 'mean_patches.mat'], 'mean_patches');
% 
% aa = ones(300,300000);
% save([matdatapath 'aaones.mat'],'aa');
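The whitening step can be sanity-checked on small synthetic data. The sketch below is not part of the original scripts: the mixing matrix, the sample count and the very small `epsilon` are assumptions chosen so that the result is easy to inspect. It repeats the same computation as patches_preprocessing.m and then looks at the covariance of the whitened data.

```matlab
% Synthetic, correlated data: 4 dimensions x 5000 samples (values are made up).
num_samples = 5000;
mixing = [2 0 0 0; 1 1 0 0; 0 1 3 0; 0 0 1 0.5];
x = mixing * randn(4, num_samples) + 5;

epsilon = 1e-8;                               % far smaller than the 0.1 used in the script

mean_x = mean(x, 2);
x_centered = x - repmat(mean_x, [1 num_samples]);
sigma = (1 / num_samples) * (x_centered * x_centered');

[u, s, ~] = svd(sigma);
ZCAWhitening = u * diag(1 ./ sqrt(diag(s) + epsilon)) * u';
x_white = ZCAWhitening * x_centered;

% Covariance of the whitened data; close to the identity when epsilon is small.
cov_white = (1 / num_samples) * (x_white * x_white');
max_deviation = max(max(abs(cov_white - eye(4))));
fprintf('largest deviation from the identity: %g\n', max_deviation);

% A new sample is whitened with the stored mean and matrix, not with its own statistics.
x_new = mixing * randn(4, 1) + 5;
x_new_white = ZCAWhitening * (x_new - mean_x);
```

With the larger `epsilon = 0.1` used in the script, directions with small variance are deliberately not amplified all the way to unit variance, so the covariance of the whitened patches will not be exactly the identity. This is also why `mean_patches` and `ZCAWhitening` are saved: the same transform has to be applied to any data processed later.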
