
How to train your Neural Networks in parallel with Keras and Apache Spark

Apache Spark on IBM Watson Studio

Now, we will finally train our Keras model using the experimental Keras2DML API. To be able to execute the following code, you will need to create a free-tier account on IBM Cloud and log in to activate Watson Studio.

(A step-by-step tutorial on setting up Spark on IBM Cloud is available here; more information on Spark with IBM Cloud can be found here.)

Once you have a Watson Studio account with an active Spark plan, you can create a Jupyter notebook on the platform, choose a cloud machine configuration (number of CPUs and RAM) and a Spark plan, and get started!
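Before going further, it is worth confirming that your notebook really is attached to Spark. Here is a minimal sanity check, assuming the preconfigured SparkSession that Watson Studio Spark notebooks expose as the variable spark (the same variable we will hand to Keras2DML later):

# Sanity check: Watson Studio Spark notebooks expose a preconfigured
# SparkSession named `spark`; confirm it is live before training anything.
print(spark.version)                          # Spark version of the attached plan
print(spark.sparkContext.defaultParallelism)  # rough indicator of available parallelism

# SystemML is needed later for Keras2DML; install it if it is missing:
# !pip install systemml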

Watson Studio comes with a free Spark plan that includes 2 Spark workers. While this is enough for demonstration purposes such as ours, for real-world scenarios it is highly advisable to get a paid Spark plan. More Spark workers basically means more threads across which computation can be parallelized, hence less zombie-like waiting in front of your screen for results. Finally, before we get started, I will also note that alternatives such as Deep Cognition, with equally interesting features and illustrative Medium articles, exist and are just as worthy of exploration.

Handwritten digit recognition on SystemML

Ah, MNIST. So iconic, it could be considered the ‘hello world’ of machine learning datasets. In fact, it is even one of the six standard datasets that come with a Keras install. And rest assured, whatever algorithm you have in mind, ranging from linear classifiers to convolutional neural nets, has been tried and tested on this dataset sometime in the past 20 years, all for the task of handwritten digit recognition. Something we humans do so effortlessly ourselves (so much so, that having to do it as a job must surely be arduously depressing).

In fact, this task was the ideal candidate for quite a few machine learning genesis projects, due to the lack of comprehensively large datasets that existed at the time for… well, anything really. That is no longer the case, though, and the Internet is flooded with datasets ranging from avocado prices to volcanoes on Venus. Today, we honor the MNIST tradition by scaling up our handwritten digit recognition project. We do this by training our machine learning algorithm on a computational cluster, potentially decreasing our training time dramatically in doing so.

Convolutional neural network on MNIST dataset

1. We start by importing the libraries we will need:

import keras
from keras.models import Sequential
from keras.layers import Input, Dense, Conv2D
from keras.layers import MaxPooling2D, Dropout, Flatten
from keras import backend as K
from keras.models import Model
import numpy as np
import matplotlib.pyplot as plt

2. Load the data

We can now load the MNIST dataset from Keras, using the simple snippet below.

from keras.datasets import mnist
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# Expect to see a numpy n-dimensional array of shape (60000, 28, 28)
type(X_train), X_train.shape, type(y_train), y_train.shape
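Since we imported matplotlib above, a quick sanity-check plot of the first training image (and its label) is a cheap way to confirm the data loaded correctly:

# Visual sanity check: show the first training image with its label
plt.imshow(X_train[0], cmap='gray')
plt.title('Label: %d' % y_train[0])
plt.show()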

3. Shape your data

Here, we do some reshaping appropriate for our training pipeline: we rearrange each 28 × 28 image into one vector of 784 pixel values.

# Flatten each of our 28 x 28 images to a vector of 784 values
X_train = X_train.reshape(-1, 784)
X_test = X_test.reshape(-1, 784)

# Check the new shapes: expect (60000, 784) and (10000, 784)
X_train.shape, X_test.shape

4. Normalise your data

Then we use Scikit-Learn’s MinMaxScaler to normalise our pixel data, which ranges from 0 to 255. After normalisation, the values will range from 0 to 1, which greatly improves results.

from sklearn.preprocessing import MinMaxScaler

def scaleData(data):
    # Scale each pixel column to the [0, 1] range
    scaler = MinMaxScaler(feature_range=(0, 1))
    return scaler.fit_transform(data)

X_train = scaleData(X_train)
X_test = scaleData(X_test)
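As an aside: because MNIST pixel values always span the fixed range 0–255, a plain division achieves essentially the same scaling without fitting a scaler (the two differ only for pixel positions that are constant across the dataset). A common shortcut, shown here commented out since you should apply one or the other, not both:

# Alternative to the scaler above (apply one or the other, not both):
# X_train = X_train.astype('float32') / 255.0
# X_test = X_test.astype('float32') / 255.0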

5. Build the network

Next, we build our network with Keras, defining an appropriate input shape, then stacking some convolutional, max-pooling, dense and dropout layers, as shown below. (Some neural network basics: do make sure that your last layer has the same number of neurons as you have output classes. Since we are predicting handwritten digits ranging from 0 to 9, we have a Dense layer of 10 neurons as our last layer here.)

input_shape = (1, 28, 28) if K.image_data_format() == 'channels_first' else (28, 28, 1)

keras_model = Sequential()
keras_model.add(Conv2D(32, kernel_size=(5, 5), activation='relu', input_shape=input_shape, padding='same'))
keras_model.add(MaxPooling2D(pool_size=(2, 2)))
keras_model.add(Conv2D(64, (5, 5), activation='relu', padding='same'))
keras_model.add(MaxPooling2D(pool_size=(2, 2)))
keras_model.add(Flatten())
keras_model.add(Dense(512, activation='relu'))
keras_model.add(Dropout(0.5))
keras_model.add(Dense(10, activation='softmax'))
keras_model.summary()

If you see a summary of the Keras model like the one below, you're all good so far.
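For reference, the parameter counts the summary should report (in the channels-last case) follow directly from the layer definitions above:

# Expected parameter counts (channels_last case), derived from the layers above:
#   Conv2D(32, 5x5, 1 input channel):    32*(5*5*1)  + 32  =       832
#   Conv2D(64, 5x5, 32 input channels):  64*(5*5*32) + 64  =    51,264
#   Dense(512) on 7*7*64 = 3136 inputs:  3136*512 + 512    = 1,606,144
#   Dense(10):                           512*10 + 10       =     5,130
#   Total trainable parameters:                              1,663,370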

6. Create a SystemML model

Now we use the Keras2DML wrapper and feed it our freshly built Keras network. This is done by calling the Keras2DML constructor and passing it our Spark session, the Keras model, its input shape, and the predefined variables. The variable ‘epochs’ denotes the number of times the algorithm iterates over the data. Next, we have ‘batch_size’, which indicates the number of training examples our network sees per learning batch. Finally, ‘samples’ simply encodes the number of samples in our training set. We also ask for the training results to be displayed every 10 iterations.

Then we call the fit method on our newly defined SystemML model and pass it the training arrays and labels to initiate our training session.

from systemml.mllearn import Keras2DML
import math  # needed for the max_iter calculation below

epochs = 5
batch_size = 100
samples = 60000
# 5 epochs * ceil(60000 / 100) = 3000 iterations in total
max_iter = int(epochs * math.ceil(samples / batch_size))

sysml_model = Keras2DML(spark, keras_model, input_shape=(1, 28, 28), weights='weights_dir', batch_size=batch_size, max_iter=max_iter, test_interval=0, display=10)

sysml_model.fit(X_train, y_train)

Now, you should see the training output start to appear on your screen:

7. Time to score! We do this by simply calling the score method on our trained SystemML model, like so:

sysml_model.score(X_test, y_test)

Wait for the Spark job to execute, and then, voila! You should see your accuracy on the test set appear. As you can see, we have achieved an accuracy of 98.76%. Not too bad.
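If you want to look beyond the aggregate accuracy, the mllearn estimators follow the scikit-learn style interface, so (assuming that interface) you can also pull out individual predictions:

# Inspect individual predictions (scikit-learn style API)
preds = sysml_model.predict(X_test[:5])
print(preds)       # predicted digit labels for the first five test images
print(y_test[:5])  # ground-truth labels, for comparison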

Note that we were able to deploy a Keras model through SystemML’s Keras2DML wrapper, which essentially serialises your model to a Caffe model, then converts that model to a declarative machine learning script. The same Keras model would otherwise be bound by the resources of a single JVM, had you chosen to train it with Keras without significantly adapting your code for parallel processing. Neat, no? You can now train your neural networks on local GPUs, or use a cloud machine like we did on Watson Studio.
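For contrast, here is a minimal sketch of what the equivalent single-machine run would look like with plain Keras (assuming the channels-last case; note that our flattened, scaled arrays have to be reshaped back into images first):

# Single-machine baseline for comparison: same model, trained locally with Keras
keras_model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
keras_model.fit(X_train.reshape(-1, 28, 28, 1), y_train, batch_size=100, epochs=5)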

While it always feels nice to pack some local firepower in terms of processing, nothing beats the cloud. You can really scale up your projects, and choose appropriate machine configurations and Spark plans at a fraction of the cost of hardware alternatives. The cloud is ideal for dealing with environments and use cases with highly variable demands, ranging from small-scale data visualisation to big data projects requiring real-time analytics of petabytes of data. Maybe you're just trying to analyse a copious amount of IoT data from your distributed warehouse network, like Walmart. Or maybe you're peering into subatomic depths, trying to determine the fabric of our cosmos, like CERN. Any of these widely varying use cases could benefit from migrating their computations to the cloud, and very likely have done so.