Saving and Loading Preprocessed Data

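In machine learning, data usually has to be preprocessed before training. Because a model typically goes through many rounds of hyperparameter tuning, the same preprocessed data may be needed many times. Repeating the preprocessing on every run is redundant, so we can save its output to disk once and reload it whenever it is needed.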

# Use the pickle module to save the processed data in pickle format
# so it can be reused later, i.e. create a checkpoint.
# X_train, X_test, y_train and y_test are the outputs of the preprocessing step.
import os
import pickle

pickle_file = 'notMNIST.pickle'
if not os.path.isfile(pickle_file):    # save only if the file does not exist yet
    print('Saving data to pickle file...')
    try:
        with open(pickle_file, 'wb') as pfile:
            pickle.dump(
                {
                    'X_train': X_train,
                    'X_test': X_test,
                    'y_train': y_train,
                    'y_test': y_test,
                },
                pfile, pickle.HIGHEST_PROTOCOL)
    except Exception as e:
        print('Unable to save data to', pickle_file, ':', e)
        raise

print('Data cached in pickle file.')

# Read the data back from the pickle file
pickle_file = 'notMNIST.pickle'        # must be the same file the data was saved to
with open(pickle_file, 'rb') as f:
    pickle_data = pickle.load(f)       # deserialize; the inverse of pickle.dump
    X_train = pickle_data['X_train']
    X_test = pickle_data['X_test']
    y_train = pickle_data['y_train']
    y_test = pickle_data['y_test']
    del pickle_data                    # free memory

print('Data and modules loaded.')
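
The save-if-missing and load steps can also be folded into one small helper. The following is only a minimal sketch of that idea, not part of the original code: `load_or_compute`, `cache_path`, `compute_fn` and the toy `preprocess` function are hypothetical names chosen for illustration, and the sketch assumes the preprocessing result is an ordinary picklable Python object such as a dict.

```python
import os
import pickle


def load_or_compute(cache_path, compute_fn):
    """Return the cached object if cache_path exists; otherwise build and cache it.

    Hypothetical helper for illustration: compute_fn is any zero-argument
    function that returns the preprocessed data.
    """
    if os.path.isfile(cache_path):
        # Cache hit: deserialize the previously saved object.
        with open(cache_path, 'rb') as f:
            return pickle.load(f)

    # Cache miss: run the (possibly expensive) preprocessing once, then save it.
    data = compute_fn()
    with open(cache_path, 'wb') as f:
        pickle.dump(data, f, pickle.HIGHEST_PROTOCOL)
    return data


def preprocess():
    # Stand-in for real preprocessing; returns toy data with the same keys.
    return {
        'X_train': [[0.0, 1.0], [1.0, 0.0]],
        'X_test': [[0.5, 0.5]],
        'y_train': [0, 1],
        'y_test': [1],
    }
```

With this helper, `data = load_or_compute('notMNIST.pickle', preprocess)` runs `preprocess` only when the file is missing and reads the cached dictionary on later runs. If the preprocessing logic changes, the old file should be deleted so that stale data is not reloaded.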