
[Deep Learning] Fitting a Sine Function with an RNN (LSTM) in Python

Deep learning framework: TensorFlow 0.8.0
Python: 2.7.6

Two ways of feeding the data into the model:

① data and label are the same variable, so the whole model is effectively autoregressive (this post demonstrates this first mode)

② data and label are different variables, so the model learns the functional relationship between data and label (a minimal sketch contrasting the two modes follows below)
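To make the distinction concrete, here is a minimal sketch of what gets paired as data and label in each mode. The helper make_window is illustrative only and is not part of the code in this post; mode ① is what BatchGenerator below implements.

import numpy as np

def make_window(t0,n):
  #illustrative helper: n consecutive samples of sin(t) starting at t0
  t=t0+np.arange(n)/float(n)
  return t,np.sin(t)

t,s=make_window(0.,5)

#mode 1 (autoregressive): data and label come from the same series,
#the label being the series shifted one step ahead
data_1,label_1=s[:-1],s[1:]

#mode 2 (function fitting): data are the time points, labels are sin(t),
#so the network learns the mapping t -> sin(t)
data_2,label_2=t[:-1],s[:-1]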

The data-generating class:

class BatchGenerator:
  def __init__(self,window_size,window_range):
    self.cursor=0.
    self.window=window_size
    self.range=window_range

  def next(self):
    x=np.zeros([self.window,1])
    y=np.zeros([self.window,1])
    d=np.arange(self.cursor,self.cursor+1.,1./self.window)
    for i in range(self.window):
      l=np.sin(d[i])
      x[i,0]=d[i]
      y[i,0]=l
    self.cursor+=1./self.window
    return y[:self.window-1],y[1:],x[:self.window-1]
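A quick usage sketch (assuming numpy is imported as np and the class above is defined; the shapes follow directly from the code, with window_size=5 being the value used later):

gen=BatchGenerator(5,10)
data,label,x=gen.next()
#data  = y[:window-1] -> shape (4,1): sin values at the first four points of the window
#label = y[1:]        -> shape (4,1): the same sine series shifted one step ahead
#x     = x[:window-1] -> shape (4,1): the corresponding time points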

Building the single-layer LSTM network:

with graph.as_default():
    #parameter initialization
    #input gate
    ix=tf.Variable(tf.truncated_normal([window_size-1,num_nodes],-0.1,0.1),name="ix")
    ih=tf.Variable(tf.truncated_normal([num_nodes,num_nodes],-0.1,0.1),name="ih")
    ib=tf.Variable(tf.zeros([1,num_nodes]),name="ib")

    #output_gate
    ox=tf.Variable(tf.truncated_normal([window_size-1,num_nodes],-0.1,0.1),name="ox")
    oh=tf.Variable(tf.truncated_normal([num_nodes,num_nodes],-0.1,0.1),name="oh")
    ob=tf.Variable(tf.zeros([1,num_nodes]),name="ob")

    #forget_gate
    fx=tf.Variable(tf.truncated_normal([window_size-1,num_nodes],-0.1,0.1),name="fx")
    fh=tf.Variable(tf.truncated_normal([num_nodes,num_nodes],-0.1,0.1),name="fh")
    fb=tf.Variable(tf.zeros([1,num_nodes]),name="fb")

    #cell
    gx=tf.Variable(tf.truncated_normal([window_size-1,num_nodes],-0.1,0.1),name="gx")
    gh=tf.Variable(tf.truncated_normal([num_nodes,num_nodes],-0.1,0.1),name="gh")
    gb=tf.Variable(tf.zeros([1,num_nodes]),name="gb")

    #variables saving state across unrollings
    saved_output=tf.Variable(tf.zeros([1,num_nodes]),trainable=False,name="saved_output")
    saved_state=tf.Variable(tf.zeros([1,num_nodes]),trainable=False,name="saved_state")

    #classifier's weights and biases
    w=tf.Variable(tf.truncated_normal([num_nodes,window_size-1],-0.1,0.1),name="w")
    b=tf.Variable(tf.zeros([window_size-1]),name="b")

    ##define the LSTM cell
    def lstm_cell(x,h,c):
        input_gate=tf.sigmoid(tf.matmul(x,ix)+tf.matmul(h,ih)+ib)
        output_gate=tf.sigmoid(tf.matmul(x,ox)+tf.matmul(h,oh)+ob)
        forget_gate=tf.sigmoid(tf.matmul(x,fx)+tf.matmul(h,fh)+fb)
        gt=tf.tanh(tf.matmul(x,gx)+tf.matmul(h,gh)+gb)
        ct=input_gate*gt+forget_gate*c
        return output_gate*tf.tanh(ct),ct

    #Input Data
    train_data=list()
    train_label=list()
    for _ in range(num_unrollings+1):
        train_data.append(\
                tf.placeholder(tf.float32,shape=[1,window_size-1]))
        train_label.append(\
                tf.placeholder(tf.float32,shape=[1,window_size-1]))
    train_inputs=train_data
    train_labels=train_label

    #unroll the LSTM
    outputs=list()
    output=saved_output
    state=saved_state
    for i in train_inputs:
        output,state=lstm_cell(i,output,state)
        outputs.append(output)

    #save the output and cell state from the last unrolled step
    with tf.control_dependencies([saved_output.assign(output),\
                                  saved_state.assign(state)]):
        #build the RNN from the single LSTM layer
        logits=tf.nn.xw_plus_b(tf.concat(0,outputs),w,b)
        loss=tf.reduce_mean(\
                tf.reduce_sum(tf.square(tf.concat(0,train_labels)-logits))
            )
    #optimizer
    global_step=tf.Variable(0)
    learning_rate=tf.train.exponential_decay(0.8,global_step,1000,0.5)
    optimizer=tf.train.GradientDescentOptimizer(learning_rate)
    gradients,v=zip(*optimizer.compute_gradients(loss))
    gradients,_=tf.clip_by_global_norm(gradients,1.25)
    optimizer=optimizer.apply_gradients(\
               zip(gradients,v),global_step=global_step)
    #Predictions
    train_prediction=logits

    #test eval:batch 1,no unrolling
    sample_input=tf.placeholder(tf.float32,shape=[1,window_size-1])
    saved_sample_output=tf.Variable(tf.zeros([1,num_nodes]))
    saved_sample_state=tf.Variable(tf.zeros([1,num_nodes]))
    reset_sample_state=tf.group(\
        saved_sample_output.assign(tf.zeros([1,num_nodes])),\
        saved_sample_state.assign(tf.zeros([1,num_nodes])))
    sample_output,sample_state=lstm_cell(\
                        sample_input,saved_sample_output,saved_sample_state)
    with tf.control_dependencies([saved_sample_output.assign(sample_output),\
                            saved_sample_state.assign(sample_state)]):
        sample_prediction=tf.nn.xw_plus_b(sample_output,w,b)

Complete code:

# -*- coding: utf-8 -*-
"""
Created on Fri Apr 14 08:52:47 2017

@author: ZMJ
"""
from __future__ import print_function
import numpy as np
import tensorflow as tf
from six.moves import range
import collections
import random
import csv
from matplotlib import pyplot as plt
class BatchGenerator:
  def __init__(self,window_size,window_range):
    self.cursor=0.
    self.window=window_size
    self.range=window_range

  def next(self):
    x=np.zeros([self.window,1])
    y=np.zeros([self.window,1])
    d=np.arange(self.cursor,self.cursor+1.,1./self.window)
    for i in range(self.window):
      l=np.sin(d[i])
      x[i,0]=d[i]
      y[i,0]=l
    self.cursor+=1./self.window
    return y[:self.window-1],y[1:],x[:self.window-1]

num_nodes=64        #number of LSTM units
window_size=5       #points per generated window; the network sees window_size-1 values at a time
window_range=10     #passed to BatchGenerator but not used by it
num_unrollings=100  #the LSTM is unrolled over num_unrollings+1 windows per training step
graph=tf.Graph()
with graph.as_default():
    #parameter initialization
    #input gate
    ix=tf.Variable(tf.truncated_normal([window_size-1,num_nodes],-0.1,0.1),name="ix")
    ih=tf.Variable(tf.truncated_normal([num_nodes,num_nodes],-0.1,0.1),name="ih")
    ib=tf.Variable(tf.zeros([1,num_nodes]),name="ib")

    #output_gate
    ox=tf.Variable(tf.truncated_normal([window_size-1,num_nodes],-0.1,0.1),name="ox")
    oh=tf.Variable(tf.truncated_normal([num_nodes,num_nodes],-0.1,0.1),name="oh")
    ob=tf.Variable(tf.zeros([1,num_nodes]),name="ob")

    #forget_gate
    fx=tf.Variable(tf.truncated_normal([window_size-1,num_nodes],-0.1,0.1),name="fx")
    fh=tf.Variable(tf.truncated_normal([num_nodes,num_nodes],-0.1,0.1),name="fh")
    fb=tf.Variable(tf.zeros([1,num_nodes]),name="fb")

    #cell
    gx=tf.Variable(tf.truncated_normal([window_size-1,num_nodes],-0.1,0.1),name="gx")
    gh=tf.Variable(tf.truncated_normal([num_nodes,num_nodes],-0.1,0.1),name="gh")
    gb=tf.Variable(tf.zeros([1,num_nodes]),name="gb")

    #variables saving state across unrollings
    saved_output=tf.Variable(tf.zeros([1,num_nodes]),trainable=False,name="saved_output")
    saved_state=tf.Variable(tf.zeros([1,num_nodes]),trainable=False,name="saved_state")

    #classifier's weights and biases
    w=tf.Variable(tf.truncated_normal([num_nodes,window_size-1],-0.1,0.1),name="w")
    b=tf.Variable(tf.zeros([window_size-1]),name="b")

    ##define the LSTM cell
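    #the cell below implements the standard LSTM update, where x is the current
    #input, h the previous output and c the previous cell state (matmul denotes
    #matrix multiplication):
    #  i_t = sigmoid(matmul(x,ix)+matmul(h,ih)+ib)   #input gate
    #  o_t = sigmoid(matmul(x,ox)+matmul(h,oh)+ob)   #output gate
    #  f_t = sigmoid(matmul(x,fx)+matmul(h,fh)+fb)   #forget gate
    #  g_t = tanh(matmul(x,gx)+matmul(h,gh)+gb)      #candidate cell state
    #  c_t = i_t*g_t + f_t*c                         #new cell state
    #  h_t = o_t*tanh(c_t)                           #new output, returned with c_t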
    def lstm_cell(x,h,c):
        input_gate=tf.sigmoid(tf.matmul(x,ix)+tf.matmul(h,ih)+ib)
        output_gate=tf.sigmoid(tf.matmul(x,ox)+tf.matmul(h,oh)+ob)
        forget_gate=tf.sigmoid(tf.matmul(x,fx)+tf.matmul(h,fh)+fb)
        gt=tf.tanh(tf.matmul(x,gx)+tf.matmul(h,gh)+gb)
        ct=input_gate*gt+forget_gate*c
        return output_gate*tf.tanh(ct),ct

    #Input Data

    train_data=list()
    train_label=list()
    for _ in range(num_unrollings+1):
        train_data.append(\
                tf.placeholder(tf.float32,shape=[1,window_size-1]))
        train_label.append(\
                tf.placeholder(tf.float32,shape=[1,window_size-1]))
    train_inputs=train_data
    train_labels=train_label

    #unroll the LSTM
    outputs=list()
    output=saved_output
    state=saved_state
    for i in train_inputs:
        output,state=lstm_cell(i,output,state)
        outputs.append(output)

    #save the output and cell state from the last unrolled step
    with tf.control_dependencies([saved_output.assign(output),\
                                  saved_state.assign(state)]):
        #build the RNN from the single LSTM layer
        logits=tf.nn.xw_plus_b(tf.concat(0,outputs),w,b)
        loss=tf.reduce_mean(\
                tf.reduce_sum(tf.square(tf.concat(0,train_labels)-logits))
            )
    #optimizer
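    #the learning rate starts at 0.8 and decays by a factor of 0.5 every 1000
    #steps (smoothly, since staircase is left at its default); gradients are
    #rescaled so their global norm never exceeds 1.25 before plain gradient
    #descent applies them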
    global_step=tf.Variable(0)
    learning_rate=tf.train.exponential_decay(0.8,global_step,1000,0.5)
    optimizer=tf.train.GradientDescentOptimizer(learning_rate)
    gradients,v=zip(*optimizer.compute_gradients(loss))
    gradients,_=tf.clip_by_global_norm(gradients,1.25)
    optimizer=optimizer.apply_gradients(\
               zip(gradients,v),global_step=global_step)
    #Predictions
    train_prediction=logits

    #test eval:batch 1,no unrolling
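    #for evaluation the cell is run one window at a time; saved_sample_output
    #and saved_sample_state carry the LSTM state from one eval call to the
    #next, and reset_sample_state can be run to zero them before a new rollout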
    sample_input=tf.placeholder(tf.float32,shape=[1,window_size-1])
    saved_sample_output=tf.Variable(tf.zeros([1,num_nodes]))
    saved_sample_state=tf.Variable(tf.zeros([1,num_nodes])) 
    reset_sample_state=tf.group(\
        saved_sample_output.assign(tf.zeros([1,num_nodes])),\
        saved_sample_state.assign(tf.zeros([1,num_nodes])))
    sample_output,sample_state=lstm_cell(\
                        sample_input,saved_sample_output,saved_sample_state)
    with tf.control_dependencies([saved_sample_output.assign(sample_output),\
                            saved_sample_state.assign(sample_state)]):
        sample_prediction=tf.nn.xw_plus_b(sample_output,w,b)   

num_steps=10001
summary_frequency=100
f=open("out.csv","a+")
writer=csv.writer(f)
with tf.Session(graph=graph) as session:
  tf.initialize_all_variables().run()
  print("Initialized!!")
  mean_loss=0
  feed_dict=dict()
  count=0
  test=BatchGenerator(window_size,window_range)

  for i in range(num_unrollings+1):
    temp1,temp2,_=test.next()
    feed_dict[train_data[i]]=temp1.reshape([1,window_size-1])
    feed_dict[train_label[i]]=temp2.reshape([1,window_size-1])
  for step in range(num_steps):
    _,l,predictions,lr=session.run([optimizer,loss,train_prediction,learning_rate],feed_dict=feed_dict)
    if step%100==0:
      print(l)
  X=list()
  Label=list()
  Prediction=list()
  for i in range(100):
    data,label,x=test.next()
    feed=data.reshape([1,window_size-1])
    prediction=sample_prediction.eval({sample_input:feed})
    X.append(x)
    Label.append(label)
    Prediction.append(prediction)
  Prediction=np.array(Prediction).reshape([1,100*(window_size-1)])
  Label=np.array(Label).reshape([1,100*(window_size-1)])
  X=np.array(X).reshape([1,100*(window_size-1)])
  print(Label,Prediction)
  plt.plot(X[0],Label[0],color="blue")
  plt.plot(X[0],Prediction[0],color="red")
  plt.savefig("out.png")  #save before show(), otherwise an empty figure is written
  plt.show()
  writer.writerows([data.T[0],label.T[0],prediction[0]])  #only the last test window is written to out.csv
f.close()

Fitting results:
(Plot: the ground-truth sine curve in blue versus the LSTM prediction in red.)