Time-series forecasting with an LSTM: the approach, a TensorFlow implementation, and the input data format
First, a recommended blog post that explains some of the LSTM-related classes and functions: function reference.
My goal was to predict the price of a certain fruit with an LSTM. My first attempt was to pass in the previous n days of prices as separate variables, i.e. a DataFrame with n + 1 columns. The resulting model performed poorly, nowhere near the price curve I had previously fitted with an ARIMA time-series model.
After reading many more blog posts and other material, I finally understood the meaning of one parameter: time_step. In an LSTM (Long Short-Term Memory network), time_step is where the "memory" actually shows up. For example, to predict today's price from the previous 100 days, time_step must be 100, and this only makes sense if the data is continuous. Along the way I also hit a pitfall: the format of the data being fed in. I will get to that with the code.
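To make the role of time_step concrete, here is a minimal sketch of my own (not part of the original code) that cuts a continuous price series into sliding windows of time_step consecutive days; it uses next-day labels for simplicity, whereas the full script below pairs each feature window with the same window of the price column:

```python
import numpy as np

prices = np.arange(200, dtype=np.float32)  # stand-in for a continuous daily price series
time_step = 100                            # look back 100 days to predict the current day

# Each sample x holds time_step consecutive prices; the label y is the next day's price.
samples_x = np.array([prices[i:i + time_step] for i in range(len(prices) - time_step)])
samples_y = np.array([prices[i + time_step] for i in range(len(prices) - time_step)])

print(samples_x.shape)  # (100, 100) -> [number of samples, time_step]
print(samples_y.shape)  # (100,)

# With a single input variable, the LSTM still expects a third axis:
samples_x = samples_x[:, :, np.newaxis]
print(samples_x.shape)  # (100, 100, 1) -> [batch_size, time_step, input_size]
```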
My code is adapted from this blog post; the main changes concern the data format. The problem does not exist when the X being fed in has many columns, but once I was predicting price purely from price itself, it appeared. Processing the data the same way as the original code raises an error:
ValueError: Cannot feed value of shape (1,) for Tensor 'train/Placeholder:0', which has shape '(?, 100, 1)'
Errors like this are all caused by shape mismatches. In general, the inputs argument of tf.nn.dynamic_rnn() has the format [batch_size, time_steps_size, input_size], where time_steps_size is the number of days to look back, input_size is the number of input variables, and batch_size is determined by the amount of data. In this code, building the training data works fine; the problem lies in the test data. Say there are 600 training rows and 200 test rows. Mapped into [batch_size, time_steps_size, input_size] form with time_step = 10 and 10 input variables, the sliding window yields train_x of shape [590, 10, 10] and train_y of shape [590, 10, 1], while the test rows are cut into non-overlapping chunks, giving test_x of shape [20, 10, 10]; test_y is kept as a flat array, mainly for computing accuracy. Note that when test_x's input_size is greater than 1 the processing is straightforward, because the DataFrame's .iloc[].values returns a numpy ndarray, and slicing it behaves differently for several columns versus one. For example:
```python
data = np.array([[1, 2, 3], [2, 3, 4], [3, 4, 5]])  # a numpy.ndarray
x = data[i * time_step:(i + 1) * time_step, 1:101]  # slicing several columns returns a 2-D array
x = data[i * time_step:(i + 1) * time_step, 1]      # slicing a single column returns a 1-D array
```
If in doubt, print these out and inspect them; this is exactly where the main shape problem arises.
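As a quick check (a toy example of my own, not from the original script), the difference in the returned shapes is easy to see directly:

```python
import numpy as np

data = np.array([[1, 2, 3], [2, 3, 4], [3, 4, 5]], dtype=np.float32)

print(data[0:2, 1:3].shape)            # (2, 2) -- several columns keep two dimensions
print(data[0:2, 1].shape)              # (2,)   -- a single column collapses to 1-D
print(data[0:2, 1, np.newaxis].shape)  # (2, 1) -- np.newaxis restores the column axis
```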
Here is the full script (TensorFlow 1.x):

```python
# -*- coding: utf-8 -*-
# @Time : 18-10-19
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
import time
import os

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
pd.set_option('max_colwidth', 5000)
pd.set_option('display.max_columns', 10)
pd.set_option('display.max_rows', 1000)

# Constants
rnn_unit = 10    # hidden layer units
input_size = 1   # one input variable
output_size = 1  # one output variable
lr = 0.0006      # learning rate

# —————————————————— Load the data ——————————————————
f = open("/home/user_name/下載/vegetable_data/veg/xxx.csv")
df = pd.read_csv(f)
df = pd.concat([df['price'], df['pre_price_100']], axis=1)
print(df)
data = df.iloc[:800, :].values  # two columns in total


# Build the training set
def get_train_data(batch_size=60, time_step=10, train_begin=0, train_end=600):
    """
    train_x has shape [n_samples, time_step, number of input variables],
    train_y has shape [n_samples, time_step, number of output variables],
    where n_samples * time_step roughly equals the amount of data.
    """
    batch_index = []
    data_train = data[train_begin:train_end]
    normalized_train_data = data_train  # no normalization applied here
    train_x, train_y = [], []
    for i in range(len(normalized_train_data) - time_step):
        if i % batch_size == 0:
            batch_index.append(i)
        x = normalized_train_data[i:i + time_step, 1, np.newaxis]
        y = normalized_train_data[i:i + time_step, 0, np.newaxis]
        train_x.append(x.tolist())
        train_y.append(y.tolist())
    batch_index.append(len(normalized_train_data) - time_step)
    return batch_index, train_x, train_y


# Build the test set
def get_test_data(time_step=10, test_begin=600):
    # test_y comes back as a flat list whose length equals the amount of test data
    data_test = data[test_begin:]
    mean = np.mean(data_test, axis=0)
    std = np.std(data_test, axis=0)
    # normalized_test_data = (data_test - mean) / std  # standardization; not used here
    normalized_test_data = data_test
    size = (len(normalized_test_data) + time_step - 1) // time_step  # number of samples
    test_x, test_y = [], []
    for i in range(size - 1):
        # np.newaxis keeps each chunk 2-D, [time_step, 1], so test_x ends up 3-D
        x = normalized_test_data[i * time_step:(i + 1) * time_step, 1, np.newaxis]
        y = normalized_test_data[i * time_step:(i + 1) * time_step, 0]
        test_x.append(x)
        test_y.extend(y)
    # the remaining rows form the last chunk; it only matches the placeholder
    # if the test length divides evenly by time_step
    x = normalized_test_data[(i + 1) * time_step:, 1, np.newaxis]
    test_x.append(x)
    test_y.extend(normalized_test_data[(i + 1) * time_step:, 0].tolist())
    return mean, std, test_x, test_y


# —————————————————— Network variables ——————————————————
# input/output layer weights and biases
weights = {
    'in': tf.Variable(tf.random_normal([input_size, rnn_unit])),
    'out': tf.Variable(tf.random_normal([rnn_unit, 1]))
}
biases = {
    'in': tf.Variable(tf.constant(0.1, shape=[rnn_unit, ])),
    'out': tf.Variable(tf.constant(0.1, shape=[1, ]))
}


# —————————————————— Network definition ——————————————————
def lstm(X):
    batch_size = tf.shape(X)[0]
    time_step = tf.shape(X)[1]
    w_in = weights['in']
    b_in = biases['in']
    # -1 lets the first dimension be inferred from the rest
    inputs = tf.reshape(X, [-1, input_size])  # flatten to 2-D for the matmul feeding the hidden layer
    input_rnn = tf.matmul(inputs, w_in) + b_in
    input_rnn = tf.reshape(input_rnn, [-1, time_step, rnn_unit])  # back to 3-D as input to the LSTM cell
    cell = tf.nn.rnn_cell.BasicLSTMCell(rnn_unit)
    init_state = cell.zero_state(batch_size, dtype=tf.float32)
    # output_rnn records the output of every step; final_states is the state of the last cell
    output_rnn, final_states = tf.nn.dynamic_rnn(cell, input_rnn, initial_state=init_state, dtype=tf.float32)
    output = tf.reshape(output_rnn, [-1, rnn_unit])  # input to the output layer
    w_out = weights['out']
    b_out = biases['out']
    pred = tf.matmul(output, w_out) + b_out
    print(pred, final_states)
    return pred, final_states


# —————————————————— Training ——————————————————
def train_lstm(batch_size=80, time_step=100, train_begin=0, train_end=500):
    X = tf.placeholder(tf.float32, shape=[None, time_step, input_size])
    Y = tf.placeholder(tf.float32, shape=[None, time_step, output_size])
    batch_index, train_x, train_y = get_train_data(batch_size, time_step, train_begin, train_end)
    mean, std, test_x, test_y = get_test_data(time_step)  # test_y is a flat list
    pred, _ = lstm(X)
    # mean squared error between prediction and target
    loss = tf.reduce_mean(tf.square(tf.reshape(pred, [-1]) - tf.reshape(Y, [-1])))
    train_op = tf.train.AdamOptimizer(lr).minimize(loss)
    saver = tf.train.Saver(tf.global_variables(), max_to_keep=1)
    module_file = tf.train.latest_checkpoint('/home/lin/PycharmProjects/lins/vegetable/model/')
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        # restore(sess, save_path) restores a saved model from save_path;
        # tf.train.latest_checkpoint() fetches the most recent checkpoint automatically
        try:
            saver.restore(sess, module_file)
        except Exception:
            pass
        # repeat training (1001 passes over the batches here)
        for i in range(1001):
            for step in range(len(batch_index) - 1):
                _, loss_ = sess.run([train_op, loss],
                                    feed_dict={X: train_x[batch_index[step]:batch_index[step + 1]],
                                               Y: train_y[batch_index[step]:batch_index[step + 1]]})
            if i % 50 == 0:
                print(i, loss_)
        test_predict = []
        for step in range(len(test_x)):
            prob = sess.run(pred, feed_dict={X: [test_x[step]]})
            test_predict.extend(prob.reshape(-1))
        # compare predictions against the ground truth and report the share of
        # predictions within 10%, 5% and 1% relative error
        print(test_predict)
        print(test_y)
        for threshold in (0.1, 0.05, 0.01):
            boolean_list = [abs(test_y[i] - test_predict[i]) / test_predict[i] < threshold
                            for i in range(len(test_y))]
            accuracy = tf.reduce_mean(tf.cast(boolean_list, tf.float32))
            print(sess.run(accuracy))


with tf.variable_scope('train'):
    train_lstm()
```
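One caveat worth calling out: the script restores from tf.train.latest_checkpoint() but never writes a checkpoint, so on a fresh run there is nothing to restore. If you want checkpointing, a minimal addition inside the training loop could look like this (the file name is my own placeholder; the directory is the one the script already restores from):

```python
# Hypothetical snippet -- place inside the `for i in range(1001):` loop:
if i % 200 == 0:
    saver.save(sess, '/home/lin/PycharmProjects/lins/vegetable/model/price.model',
               global_step=i)  # tf.train.latest_checkpoint() will then find the newest file
```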
Hence, in the data-processing step I added an np.newaxis when slicing test_x, which makes its shape [20, 10, 1]; without it, the resulting test_x shape is [20, 10], which the 3-D placeholder rejects.
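To make the fix explicit, here is a minimal reproduction on toy data (my own illustration, mirroring the slicing pattern in get_test_data) of the shapes with and without the extra axis:

```python
import numpy as np

data = np.random.rand(200, 2).astype(np.float32)  # toy stand-in: [price, feature] columns
time_step = 10

without_axis = [data[i * time_step:(i + 1) * time_step, 1] for i in range(20)]
with_axis = [data[i * time_step:(i + 1) * time_step, 1, np.newaxis] for i in range(20)]

print(np.array(without_axis).shape)  # (20, 10)    -- rejected by the 3-D placeholder
print(np.array(with_axis).shape)     # (20, 10, 1) -- matches [batch, time_step, input_size]
```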