Udacity - Predicting Bike-Sharing Usage with a Neural Network: Code Analysis

Tags: machine learning


The code comes from the first project of Udacity's Deep Learning Nanodegree course:
https://cn.udacity.com/course/deep-learning-nanodegree-foundation–nd101-cn

In this project, we use nearly two years of bike usage data to train a neural network that predicts future bike-sharing usage.

To set up the environment beforehand, follow the tutorial provided by Udacity:
https://classroom.udacity.com/nanodegrees/nd101-cn/parts/e7f2a11a-4da3-4deb-8d5c-635907a09460/modules/b710d7cd-83a7-48c5-8b43-63beebd97369/lessons/4c03fd28-20ca-40e6-89cc-e72ae96141c2/project

Now let's start the project.

I. Import the required libraries

%matplotlib inline
%config InlineBackend.figure_format = 'retina'
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

II. Load the bike data provided by Udacity

data_path = 'Bike-Sharing-Dataset/hour.csv'

rides = pd.read_csv(data_path)

III. Inspect the data

rides.head()

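Data overview

This data set contains the number of riders for each hour of each day from January 1, 2011 to December 31, 2012. Riders are split into casual and registered users, and the cnt column is the total of the two. You can see the first few rows above.

The plot below shows the number of riders over roughly the first 10 days of the data set (some days do not have exactly 24 entries, so it is not exactly 10 days). You can see the hourly rentals here. This data is complicated! Ridership is lower on weekends, and it peaks on workdays during commuting hours. The data above also includes temperature, humidity and wind speed, all of which affect the number of riders. Your model needs to capture all of this.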

IV. Plot the data

rides[:24*10].plot(x='dteday', y='cnt')

This plots the total number of riders for each hour over roughly the first 10 days.

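V. Dummy variables

Some of the variables are categorical, such as season, weather and month. To include them in our model, we need to create binary dummy variables, which is easy to do with get_dummies() from the Pandas library.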

dummy_fields = ['season', 'weathersit', 'mnth', 'hr', 'weekday']
for each in dummy_fields:
    dummies = pd.get_dummies(rides[each], prefix=each, drop_first=False)
    rides = pd.concat([rides, dummies], axis=1)

fields_to_drop = ['instant', 'dteday', 'season', 'weathersit', 
                  'weekday', 'atemp', 'mnth', 'workingday', 'hr']
data = rides.drop(fields_to_drop, axis=1)
data.head()
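
As a small illustration of what get_dummies() produces, the sketch below one-hot encodes a made-up season column. The toy DataFrame and its values are assumptions for illustration only, not rows from the bike data set.

```python
import pandas as pd

# Toy frame with hypothetical values: one categorical column and one count column.
toy = pd.DataFrame({'season': [1, 2, 2, 4], 'cnt': [16, 40, 32, 13]})

# One binary column per distinct category value, named <prefix>_<value>.
dummies = pd.get_dummies(toy['season'], prefix='season', drop_first=False)

# Append the dummy columns, then drop the original categorical column.
encoded = pd.concat([toy, dummies], axis=1).drop('season', axis=1)

print(list(encoded.columns))
```

Each row has exactly one of the season_* columns set, which is what lets the network treat the categories as separate inputs rather than as an ordered number.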

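VI. Scaling target variables

To make training the network easier, we standardize each continuous variable, that is, we shift and scale it so that it has a mean of 0 and a standard deviation of 1.

We save the scaling factors so that we can convert the data back when we use the network for predictions.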

quant_features = ['casual', 'registered', 'cnt', 'temp', 'hum', 'windspeed']
# Store scalings in a dictionary so we can convert back later
scaled_features = {}
for each in quant_features:
    mean, std = data[each].mean(), data[each].std()
    scaled_features[each] = [mean, std]
    data.loc[:, each] = (data[each] - mean)/std
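
The round trip between scaled and original units can be sketched on a toy column. The values below are made up; only the standardize-then-restore pattern mirrors the code above.

```python
import numpy as np
import pandas as pd

# Hypothetical counts standing in for one continuous column.
toy = pd.DataFrame({'cnt': [10.0, 20.0, 30.0, 40.0]})

# Save the scaling factors (pandas .std() uses the sample standard deviation).
mean, std = toy['cnt'].mean(), toy['cnt'].std()

# Standardize: zero mean, unit standard deviation.
scaled = (toy['cnt'] - mean) / std

# Invert the scaling with the saved factors to get back the original units.
restored = scaled * std + mean

print(np.allclose(restored, toy['cnt']))
```

The same inversion (multiply by std, add mean) is what the final plotting step applies to the network's predictions.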

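VII. Split the data into training, test and validation sets

We save roughly the last 21 days of data as the test set, to be used after the network has been trained. We will use it to make predictions and compare them with the actual number of riders.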

# Save data for approximately the last 21 days 
test_data = data[-21*24:]

# Now remove the test data from the data set 
data = data[:-21*24]

# Separate the data into features and targets
target_fields = ['cnt', 'casual', 'registered']
features, targets = data.drop(target_fields, axis=1), data[target_fields]
test_features, test_targets = test_data.drop(target_fields, axis=1), test_data[target_fields]
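
The negative-slice pattern used for the hold-out can be checked on a toy frame. This is a sketch with hypothetical sizes (10 days, 2 held out), assuming exactly 24 rows per day, which the real data does not guarantee.

```python
import numpy as np
import pandas as pd

hours_per_day = 24
# Ten hypothetical days of hourly rows, in time order.
toy = pd.DataFrame({'cnt': np.arange(10 * hours_per_day)})

holdout_days = 2
# The last rows become the hold-out set ...
test_part = toy[-holdout_days * hours_per_day:]
# ... and everything before them stays available for training.
train_part = toy[:-holdout_days * hours_per_day]

print(len(train_part), len(test_part))
```

Because the rows are in time order, the held-out rows are always later than the training rows, so the model is evaluated on data it could not have seen.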

VIII. Build the neural network



class NeuralNetwork(object):
    def __init__(self, input_nodes, hidden_nodes, output_nodes, learning_rate):
        # Set number of nodes in input, hidden and output layers.
        self.input_nodes = input_nodes
        self.hidden_nodes = hidden_nodes
        self.output_nodes = output_nodes

        # Initialize weights
        self.weights_input_to_hidden = np.random.normal(0.0, self.input_nodes**-0.5, 
                                       (self.input_nodes, self.hidden_nodes))

        self.weights_hidden_to_output = np.random.normal(0.0, self.hidden_nodes**-0.5, 
                                       (self.hidden_nodes, self.output_nodes))
        self.lr = learning_rate

        #### TODO: Set self.activation_function to your implemented sigmoid function ####
        #
        # Note: in Python, you can define a function with a lambda expression,
        # as shown below.
        self.activation_function = lambda x: 1 / (1 + np.exp(-x))  # sigmoid

        ### If the lambda code above is not something you're familiar with,
        # You can uncomment out the following three lines and put your 
        # implementation there instead.
        #
        #def sigmoid(x):
        #    return 0  # Replace 0 with your sigmoid calculation here
        #self.activation_function = sigmoid


    def train(self, features, targets):
        ''' Train the network on batch of features and targets. 

            Arguments
            ---------

            features: 2D array, each row is one data record, each column is a feature
            targets: 1D array of target values

        '''
        n_records = features.shape[0]
        delta_weights_i_h = np.zeros(self.weights_input_to_hidden.shape)
        delta_weights_h_o = np.zeros(self.weights_hidden_to_output.shape)
        for X, y in zip(features, targets):
            #### Implement the forward pass here ####
            ### Forward pass ###
            # TODO: Hidden layer - Replace these values with your calculations.
            hidden_inputs = np.dot(X, self.weights_input_to_hidden) # signals into hidden layer
            hidden_outputs = self.activation_function( hidden_inputs ) # signals from hidden layer

            # TODO: Output layer - Replace these values with your calculations.
            final_inputs = np.dot(hidden_outputs, self.weights_hidden_to_output) # signals into final output layer
            final_outputs = final_inputs # signals from final output layer

            #### Implement the backward pass here ####
            ### Backward pass ###

            # TODO: Output error - Replace this value with your calculations.
            error = y - final_outputs 
            # Output layer error is the difference between desired target and actual output.

            # TODO: Backpropagated error terms - Replace these values with your calculations.
            # The output activation is f(x) = x, so the output error term is the error itself.
            # It must be defined before it is used to compute hidden_error.
            output_error_term = error

            # TODO: Calculate the hidden layer's contribution to the error
            hidden_error = np.dot( self.weights_hidden_to_output, output_error_term )

            hidden_error_term = hidden_error * hidden_outputs * (1 - hidden_outputs)

            # Weight step (input to hidden)
            delta_weights_i_h += hidden_error_term * X[:,None]
            # Weight step (hidden to output)
            delta_weights_h_o += output_error_term * hidden_outputs[:,None]

        # TODO: Update the weights - Replace these values with your calculations.
        self.weights_hidden_to_output += self.lr * delta_weights_h_o / n_records # update hidden-to-output weights with gradient descent step
        self.weights_input_to_hidden += self.lr * delta_weights_i_h / n_records # update input-to-hidden weights with gradient descent step

    def run(self, features):
        ''' Run a forward pass through the network with input features 

            Arguments
            ---------
            features: 1D array of feature values
        '''

        #### Implement the forward pass here ####
        # TODO: Hidden layer - replace these values with the appropriate calculations.
        hidden_inputs = np.dot( features, self.weights_input_to_hidden ) # signals into hidden layer
        hidden_outputs = self.activation_function( hidden_inputs ) # signals from hidden layer

        # TODO: Output layer - Replace these values with the appropriate calculations.
        final_inputs = np.dot( hidden_outputs, self.weights_hidden_to_output ) # signals into final output layer
        final_outputs = final_inputs # signals from final output layer 

        return final_outputs

The code that needs to be filled in for this section:
1.
self.activation_function = lambda x : 1 / ( 1 + np.exp(-x) )

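This code sets the activation function to the sigmoid function:

$$ g(z) = \frac{1}{1 + e^{-z}} $$

[Figure: graph of the sigmoid function]

When the sigmoid is used for binary classification, the value g(z) obtained for an input x is read as the probability that the result is 1. In this network it serves only as the hidden-layer activation.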

2. Multiply the input layer by the weight matrix to get the hidden layer's inputs,
then pass them through the activation function (sigmoid).

hidden_inputs = np.dot(X, self.weights_input_to_hidden) # signals into hidden layer
hidden_outputs = self.activation_function( hidden_inputs ) # signals from hidden layer

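3. Multiply the hidden layer's outputs by the weight matrix to get the output layer's values.
The output layer's values are not passed through an activation function.

final_inputs = np.dot(hidden_outputs, self.weights_hidden_to_output) # signals into final output layer
final_outputs = final_inputs # signals from final output layer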

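4. The error is the difference between the true value and the predicted value.
Because the output layer has no activation function, the backpropagated output error term is the error itself.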

error = y - final_outputs 
output_error_term = error

5. Use backpropagation to compute the hidden layer's error

hidden_error = np.dot( self.weights_hidden_to_output, output_error_term )
hidden_error_term = hidden_error * hidden_outputs * (1 - hidden_outputs)

6. Accumulate the output-layer and hidden-layer weight steps over every record

delta_weights_i_h += hidden_error_term * X[:,None]
delta_weights_h_o += output_error_term * hidden_outputs[:,None]

7. Update the weights

self.weights_hidden_to_output += self.lr * delta_weights_h_o / n_records
self.weights_input_to_hidden += self.lr * delta_weights_i_h / n_records

8. Run a forward pass to compute predictions

hidden_inputs = np.dot( features, self.weights_input_to_hidden ) 
hidden_outputs = self.activation_function( hidden_inputs ) 
final_inputs = np.dot( hidden_outputs, self.weights_hidden_to_output )
final_outputs = final_inputs # signals from final output layer 
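
The steps above can be sketched as a single standalone function that performs one gradient step on one record. The function name train_step and its signature are hypothetical (the project keeps this logic inside NeuralNetwork.train); the inputs and weights are the ones used in the unit tests of this article, so the printed result can be compared with the values expected there.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def train_step(X, y, w_i_h, w_h_o, lr):
    """One gradient step on a single record; returns the updated weights."""
    # Forward pass: sigmoid hidden layer, linear output layer.
    hidden_outputs = sigmoid(np.dot(X, w_i_h))
    final_outputs = np.dot(hidden_outputs, w_h_o)

    # Backward pass: the output activation is f(x) = x, so its derivative is 1.
    error = y - final_outputs
    output_error_term = error
    hidden_error = np.dot(w_h_o, output_error_term)
    hidden_error_term = hidden_error * hidden_outputs * (1 - hidden_outputs)

    # X[:, None] turns X into a column, so each product is an outer product
    # with the same shape as the weight matrix it updates.
    delta_w_i_h = hidden_error_term * X[:, None]
    delta_w_h_o = output_error_term * hidden_outputs[:, None]

    return w_i_h + lr * delta_w_i_h, w_h_o + lr * delta_w_h_o

# Inputs, target and initial weights taken from the unit tests.
X = np.array([0.5, -0.2, 0.1])
y = np.array([0.4])
w_i_h = np.array([[0.1, -0.2],
                  [0.4, 0.5],
                  [-0.3, 0.2]])
w_h_o = np.array([[0.3],
                  [-0.1]])

new_w_i_h, new_w_h_o = train_step(X, y, w_i_h, w_h_o, lr=0.5)
print(new_w_h_o.ravel())
```

With a single record there is nothing to average, so the division by n_records is omitted here. Returning new arrays instead of updating in place keeps the sketch free of side effects.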

IX. Compute the mean squared error

def MSE(y, Y):
    return np.mean((y-Y)**2)
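
A quick check of MSE on made-up numbers, together with a shape pitfall that seems to explain the .T in the training loop: subtracting a 1D array from a column vector broadcasts to a full matrix. The arrays below are hypothetical.

```python
import numpy as np

def MSE(y, Y):
    return np.mean((y - Y) ** 2)

pred = np.array([1.0, 2.0, 3.0])
true = np.array([1.0, 2.0, 5.0])

# Squared errors are 0, 0 and 4, so the mean is 4/3.
print(MSE(pred, true))

# A column of predictions with shape (3, 1), like the output of run() on several rows.
col = pred[:, None]

# (3, 1) - (3,) broadcasts to (3, 3): every prediction against every target.
wrong_shape = (col - true).shape
# Transposing first gives (1, 3) - (3,), an element-wise difference.
right_shape = (col.T - true).shape
print(wrong_shape, right_shape)
```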

X. Hold out the last 60 days as the validation set

# Hold out the last 60 days or so of the remaining data as a validation set
train_features, train_targets = features[:-60*24], targets[:-60*24]
val_features, val_targets = features[-60*24:], targets[-60*24:]

XI. Unit tests

import unittest

inputs = np.array([[0.5, -0.2, 0.1]])
targets = np.array([[0.4]])
test_w_i_h = np.array([[0.1, -0.2],
                       [0.4, 0.5],
                       [-0.3, 0.2]])
test_w_h_o = np.array([[0.3],
                       [-0.1]])

class TestMethods(unittest.TestCase):

    ##########
    # Unit tests for data loading
    ##########

    def test_data_path(self):
        # Test that file path to dataset has been unaltered
        self.assertTrue(data_path.lower() == 'bike-sharing-dataset/hour.csv')

    def test_data_loaded(self):
        # Test that data frame loaded
        self.assertTrue(isinstance(rides, pd.DataFrame))

    ##########
    # Unit tests for network functionality
    ##########

    def test_activation(self):
        network = NeuralNetwork(3, 2, 1, 0.5)
        # Test that the activation function is a sigmoid
        self.assertTrue(np.all(network.activation_function(0.5) == 1/(1+np.exp(-0.5))))

    def test_train(self):
        # Test that weights are updated correctly on training
        network = NeuralNetwork(3, 2, 1, 0.5)
        network.weights_input_to_hidden = test_w_i_h.copy()
        network.weights_hidden_to_output = test_w_h_o.copy()

        network.train(inputs, targets)

        self.assertTrue(np.allclose(network.weights_hidden_to_output, 
                                    np.array([[ 0.37275328], 
                                              [-0.03172939]])))
       # print(network.weights_input_to_hidden)
        self.assertTrue(np.allclose(network.weights_input_to_hidden,
                                    np.array([[ 0.10562014, -0.20185996], 
                                              [0.39775194, 0.50074398], 
                                              [-0.29887597, 0.19962801]])))

    def test_run(self):
        # Test correctness of run method
        network = NeuralNetwork(3, 2, 1, 0.5)
        network.weights_input_to_hidden = test_w_i_h.copy()
        network.weights_hidden_to_output = test_w_h_o.copy()

        self.assertTrue(np.allclose(network.run(inputs), 0.09998924))

suite = unittest.TestLoader().loadTestsFromTestCase(TestMethods)
unittest.TextTestRunner().run(suite)

XII. Train the network by tuning the hyperparameters

import sys

### Set the hyperparameters here ###
iterations = 1000
learning_rate = 0.5
hidden_nodes = 10
output_nodes = 1

N_i = train_features.shape[1]
network = NeuralNetwork(N_i, hidden_nodes, output_nodes, learning_rate)

losses = {'train':[], 'validation':[]}
for ii in range(iterations):
    # Go through a random batch of 128 records from the training data set  
    batch = np.random.choice(train_features.index, size=128)
    X, y = train_features.iloc[batch].values, train_targets.iloc[batch]['cnt']

    network.train(X, y)

    # Printing out the training progress
    train_loss = MSE(network.run(train_features).T, train_targets['cnt'].values)
    val_loss = MSE(network.run(val_features).T, val_targets['cnt'].values)

    sys.stdout.write("\rProgress: {:2.1f}".format(100 * ii/float(iterations)) \
                     + "% ... Training loss: " + str(train_loss)[:5] \
                     + " ... Validation loss: " + str(val_loss)[:5])
    sys.stdout.flush()

    losses['train'].append(train_loss)
    losses['validation'].append(val_loss)
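
The loop above draws labels from train_features.index and then selects rows with .iloc, which is positional. The two coincide only while the index is the default 0..n-1. A minimal sketch that samples positions directly, on toy data with a seeded generator (both are assumptions, not part of the project code):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded only so the sketch is repeatable

# Hypothetical features and targets; cnt equals the row position for easy checking.
toy_features = pd.DataFrame({'temp': np.linspace(0, 1, 50),
                             'hum': np.linspace(1, 0, 50)})
toy_targets = pd.DataFrame({'cnt': np.arange(50, dtype=float)})

# Sample row positions with replacement (the default) and index positionally.
positions = rng.choice(len(toy_features), size=8)
X = toy_features.iloc[positions].values
y = toy_targets.iloc[positions]['cnt'].values

print(X.shape, y.shape)
```

Sampling positions keeps the batch selection correct even if the frame has been filtered or re-indexed beforehand.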

XIII. Check the predictions


fig, ax = plt.subplots(figsize=(8,4))

mean, std = scaled_features['cnt']
predictions = network.run(test_features).T*std + mean
ax.plot(predictions[0], label='Prediction')
ax.plot((test_targets['cnt']*std + mean).values, label='Data')
ax.set_xlim(right=len(predictions[0]))
ax.legend()

dates = pd.to_datetime(rides.loc[test_data.index]['dteday'])
dates = dates.apply(lambda d: d.strftime('%b %d'))
ax.set_xticks(np.arange(len(dates))[12::24])
_ = ax.set_xticklabels(dates[12::24], rotation=45)