Building A Deep Learning Model using Keras

Deep learning is an increasingly popular subset of machine learning. Deep learning models are built using neural networks. A neural network takes in inputs, which are then processed in hidden layers using weights that are adjusted during training. Then the model spits out a prediction. The weights are adjusted to find patterns in order to make better predictions. The user does not need to specify what patterns to look for — the neural network learns on its own.

Keras is a user-friendly neural network library written in Python. In this tutorial, I will go over two deep learning models using Keras: one for regression and one for classification. We will build a regression model to predict an employee’s wage per hour, and we will build a classification model to predict whether or not a patient has diabetes.

Note: The datasets we will be using are relatively clean, so we will not perform any data preprocessing in order to get our data ready for modeling. Datasets that you will use in future projects may not be so clean — for example, they may have missing values — so you may need to use data preprocessing techniques to alter your datasets to get more accurate results.

Reading in the training data

For our regression deep learning model, the first step is to read in the data we will use as input. For this example, we are using the ‘hourly wages’ dataset. To start, we will use Pandas to read in the data. I will not go into detail on Pandas, but it is a library you should become familiar with if you’re looking to dive further into data science and machine learning.

‘df’ stands for dataframe. Pandas reads in the csv file as a dataframe. The ‘head()’ function will show the first 5 rows of the dataframe so you can check that the data has been read in properly and can take an initial look at how the data is structured.

import pandas as pd

#read in data using pandas
train_df = pd.read_csv('data/hourly_wages_data.csv')

#check data has been read in properly
train_df.head()

Split up the dataset into inputs and targets

Next, we need to split up our dataset into inputs (train_X) and our target (train_y). Our input will be every column except ‘wage_per_hour’ because ‘wage_per_hour’ is what we will be attempting to predict. Therefore, ‘wage_per_hour’ will be our target.

We will use the pandas 'drop' function to remove the column 'wage_per_hour' from our dataframe and store the resulting dataframe in the variable 'train_X'. This will be our input.

#create a dataframe with all training data except the target column
train_X = train_df.drop(columns=['wage_per_hour'])

#check that the target variable has been removed
train_X.head()

We will insert the column ‘wage_per_hour’ into our target variable (train_y).

#create a dataframe with only the target column
train_y = train_df[['wage_per_hour']]

#view dataframe
train_y.head()

Building the model

Next, we have to build the model. Here is the code:

from keras.models import Sequential
from keras.layers import Dense

#create model
model = Sequential()

#get number of columns in training data
n_cols = train_X.shape[1]

#add model layers
model.add(Dense(10, activation='relu', input_shape=(n_cols,)))
model.add(Dense(10, activation='relu'))
model.add(Dense(1))

The model type that we will be using is Sequential. Sequential is the easiest way to build a model in Keras. It allows you to build a model layer by layer. Each layer has weights that correspond to the layer that follows it.

We use the ‘add()’ function to add layers to our model. We will add two hidden layers and an output layer.

‘Dense’ is the layer type. Dense is a standard layer type that works for most cases. In a dense layer, all nodes in the previous layer connect to the nodes in the current layer.

We have 10 nodes in each of our two hidden layers. This number can also be in the hundreds or thousands. Increasing the number of nodes in each layer increases model capacity. I will go into further detail about the effects of increasing model capacity shortly.

‘Activation’ is the activation function for the layer. An activation function allows models to take into account nonlinear relationships. For example, if you are predicting diabetes in patients, going from age 10 to 11 is different from going from age 60 to 61.

The activation function we will be using is ReLU, or Rectified Linear Activation. Although it consists of just two linear pieces, it has been proven to work well in neural networks.

ReLU Activation Function (image credit)
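To make the "two linear pieces" concrete, here is a minimal sketch of the ReLU function in plain Python. This snippet is for illustration only and is not part of the tutorial code:

#illustration only: ReLU returns x for positive inputs and 0 for everything else
def relu(x):
    return max(0.0, x)

print(relu(-3.2))  #0.0
print(relu(2.5))   #2.5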

The first layer needs an input shape. The input shape specifies the shape of each row of input: the number of columns in our input is stored in ‘n_cols’. There is nothing after the comma, which indicates that the model can accept any number of rows.
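If you want to sanity-check this yourself, you can print the shape of the training data. The comments below are placeholders; the actual numbers depend on your copy of the dataset:

#the shape of the training data is (number of rows, number of columns)
print(train_X.shape)

#only the column count is passed to the first layer as input_shape
print((n_cols,))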

The last layer is the output layer. It only has one node, which is for our prediction.

Compiling the model

Next, we need to compile our model. Compiling the model takes two parameters: optimizer and loss.

The optimizer controls the learning rate. We will be using ‘adam’ as our optimizer. Adam is generally a good optimizer to use for many cases. The adam optimizer adjusts the learning rate throughout training.

The learning rate determines how fast the optimal weights for the model are calculated. A smaller learning rate may lead to more accurate weights (up to a certain point), but the time it takes to compute the weights will be longer.
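As a side note, passing the string ‘adam’ uses Adam’s default settings. If you want to set the learning rate yourself, you can create the optimizer object explicitly and pass it to ‘compile()’ in place of the string. This is a minimal sketch; depending on your Keras version the argument may be named ‘lr’ rather than ‘learning_rate’:

from keras.optimizers import Adam

#create the adam optimizer with an explicit learning rate (0.001 is the usual default)
adam_optimizer = Adam(learning_rate=0.001)

#this object can be passed to model.compile() instead of the string 'adam'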

For our loss function, we will use ‘mean_squared_error’. It is calculated by taking the average squared difference between the predicted and actual values. It is a popular loss function for regression problems. The closer to 0 this is, the better the model performed.

Mean Squared Error (image credit)
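To make the calculation concrete, here is a small sketch of mean squared error computed by hand with NumPy, using made-up numbers rather than the tutorial data:

import numpy as np

#made-up actual and predicted values for illustration
actual = np.array([10.0, 12.0, 7.5])
predicted = np.array([9.0, 13.0, 7.0])

#average of the squared differences between predicted and actual values
mse = np.mean((actual - predicted) ** 2)
print(mse)  #(1 + 1 + 0.25) / 3 = 0.75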
#compile model using mse as a measure of model performance
model.compile(optimizer='adam', loss='mean_squared_error')

Training the model

Now we will train our model. To train, we will use the ‘fit()’ function on our model with the following five parameters: training data (train_X), target data (train_y), validation split, the number of epochs and callbacks.

The validation split will randomly split the data into a portion used for training and a portion used for testing. During training, we will be able to see the validation loss, which gives the mean squared error of our model on the validation set. We will set the validation split at 0.2, which means that 20% of the training data we provide to the model will be set aside for testing model performance.

The number of epochs is the number of times the model will cycle through the data. The more epochs we run, the more the model will improve, up to a certain point. After that point, the model will stop improving during each epoch. In addition, the more epochs, the longer the model will take to run. To monitor this, we will use ‘early stopping’.

Early stopping will stop the model from training before the number of epochs is reached if the model stops improving. We will set our early stopping monitor to 3. This means that after 3 epochs in a row in which the model doesn’t improve, training will stop. Sometimes, the validation loss can stop improving then improve in the next epoch, but after 3 epochs in which the validation loss doesn’t improve, it usually won’t improve again.

from keras.callbacks import EarlyStopping
#set early stopping monitor so the model stops training when it won't improve anymore
early_stopping_monitor = EarlyStopping(patience=3)

#train model
model.fit(train_X, train_y, validation_split=0.2, epochs=30, callbacks=[early_stopping_monitor])

Making predictions on new data

If you want to use this model to make predictions on new data, you would use the ‘predict()’ function, passing in the new data. The output would be ‘wage_per_hour’ predictions.

#example of how to use our newly trained model to make predictions on unseen data
#(we will pretend our new data is saved in a dataframe called 'test_X')
test_y_predictions = model.predict(test_X)

Congrats! You have built a deep learning model in Keras! It is not very accurate yet, but that can improve by using a larger amount of training data and increasing ‘model capacity’.

Model capacity

As you increase the number of nodes and layers in a model, the model capacity increases. Increasing model capacity can lead to a more accurate model, up to a certain point, at which the model will stop improving. Generally, the more training data you provide, the larger the model should be. We are only using a tiny amount of data, so our model is pretty small. The larger the model, the more computational capacity it requires and the longer it will take to train.

Let’s create a new model using the same training data as our previous model. This time, we will add a layer and increase the nodes in each layer to 200. We will train the model to see if increasing the model capacity will improve our validation score.

#training a new model on the same data to show the effect of increasing model capacity

#create model
model_mc = Sequential()

#add model layers
model_mc.add(Dense(200, activation='relu', input_shape=(n_cols,)))
model_mc.add(Dense(200, activation='relu'))
model_mc.add(Dense(200, activation='relu'))
model_mc.add(Dense(1))

#compile model using mse as a measure of model performance
model_mc.compile(optimizer='adam', loss='mean_squared_error')

#train model
model_mc.fit(train_X, train_y, validation_split=0.2, epochs=30, callbacks=[early_stopping_monitor])

We can see that by increasing our model capacity, we have improved our validation loss from 32.63 in our old model to 28.06 in our new model.
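If you want to read these validation losses off directly rather than scanning the training log, note that ‘fit()’ returns a History object. The sketch below assumes you save that return value; the exact numbers will differ from run to run:

#save the History object returned by fit()
history_mc = model_mc.fit(train_X, train_y, validation_split=0.2, epochs=30, callbacks=[early_stopping_monitor])

#best (lowest) validation loss seen during training
print(min(history_mc.history['val_loss']))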