Operationalize Deep Learning with Azure ML

Hello World!

So today we are going to do something really awesome: operationalize Keras with Azure Machine Learning.  Why in the world would we want to do this?  Because we can configure deep neural nets and train them on GPU.  In this article, we will train a two-hidden-layer neural network that outputs a linear prediction of the energy efficiency of buildings, and then operationalize that GPU-trained network on Azure Machine Learning for production API usage.

So Why GPU?

Well, on Azure we can get low-level access to a single K80 and 12 cores for only $1.33/hour.  Beyond that, you get a series of VM choices, shown below.

[Image: Azure GPU VM sizes and pricing sheet]

Ok, so what is the realistic performance difference?  Well here is a pretty standard comparison chart.

[Image: NVIDIA K80 vs. CPU performance comparison chart]

Notice that tiny little blue sliver at 1x: per the chart, the K80 is 25x faster than a top-of-the-line CPU.  In Azure you can get 2 physical K80s.  This is fantastic; roughly 50x the performance.

Why Azure Machine Learning?

Azure Machine Learning is a great way to operationalize your deep nets.  Training needs GPU acceleration, but inference does not necessarily.  Azure ML provides an easy way to scale up and out, and it generates security tokens for you.  It also ties directly into other services such as Stream Analytics.  By taking advantage of the Azure PaaS platform, you can connect components quickly and easily because they are designed to plug into each other.  Playing well with this platform makes it significantly easier to add new components, drastically reducing maintenance and time to market.

What is Keras?

Keras is a deep learning framework in which you can choose a TensorFlow or Theano back end.  I definitely prefer the Theano back end; TensorFlow still does not work on Windows…lame.  I like to be able to do my normal day job while I do deep learning, so Keras + Theano it is for single-GPU workloads (CNTK for larger ones).  If you are looking for instructions on how to get Theano + Keras working on Windows 10, here you go.
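
Switching back ends is just configuration.  As a hedged sketch (you can also set "backend": "theano" in ~/.keras/keras.json), you can pin the Theano back end and point it at the GPU via environment variables before Keras is imported:

import os

# Pick the Theano back end and ask Theano for the GPU with 32-bit floats.
# These must be set before the first Keras import.
os.environ['KERAS_BACKEND'] = 'theano'
os.environ['THEANO_FLAGS'] = 'device=gpu,floatx=float32'

import keras  # should print "Using Theano backend." when it loads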

The Keras documentation can be found here.  I found it easier to get up and running with the Sequential model first.  There is a ton of flexibility in the Sequential model and I have yet to need the functional API.

Training The Model

So to train the model, it makes the most sense to start with some data.  You can download the data we are going to use here.

The first thing we need to do is load the data into our environment for training.  Notice that we set y to be the Heating Load column and that we also drop the Cooling Load column.  Heating Load and Cooling Load would normally be predicted together and are strongly predictive of each other, so we drop both from the training features to make sure we aren’t “cheating”.  You will also notice that we load in our testing data; you should always split your data into training and testing sets.

import pandas as pd
import numpy as np
#Train Data
ee_train = pd.read_csv('C:\\data\\EE_Regression_Train.csv')
ee_train.drop('Cooling Load', axis = 1, inplace=True)  # drop the second target so we don't cheat
ee_y = ee_train['Heating Load']                        # target we are predicting
ee_train.drop('Heating Load', axis = 1, inplace=True)  # remove the target from the features

#Test Data
ee_test = pd.read_csv('C:\\data\\EE_Regression_Test.csv')
ee_test.drop('Cooling Load', axis = 1, inplace=True)
ee_test_y = ee_test['Heating Load']
ee_test.drop('Heating Load', axis = 1, inplace=True)

Next we build up the model graph.  This code does not execute immediately; rather, it builds a compute graph which, when compiled, generates all the partial derivatives and CUDA code we need to train the model.  If you have ever done this by hand, you know it is painful, so having a tool that does it for us and works on Windows 10 is fantastic!

from keras.models import Sequential
from keras.layers import Activation, Dropout, Flatten, Dense

dataLen = len(ee_train.columns.values)

model = Sequential()
model.add(Dense(256, input_dim=dataLen))
model.add(Activation('relu'))

model.add(Dense(256, input_dim=dataLen))
model.add(Activation('relu'))

model.add(Dense(1))
model.add(Activation('linear'))

model.compile(optimizer='rmsprop',
              loss='mse')

Without getting too deep here, we are using the Sequential API in Keras.  The first Dense layer declares the input dimension (the number of feature columns) and is a 256-node fully connected layer with a ReLU activation.  We then add a second 256-node fully connected layer, also with ReLU.  The final layer has a single node with a linear activation function, which lets us produce regression-style outputs as opposed to classifier outputs.  Finally, we compile with “rmsprop”, a gradient descent variant, as the optimizer and mean squared error as the loss function.
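
If you want to double-check the wiring, Keras can print the architecture; this just summarizes the graph built above:

# Prints each layer with its output shape and parameter count
model.summary()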

The final step here is to transform the inputs into numpy arrays, transpose the y to be the correct form (it's a linear algebra thing), fit the model and then see what our performance is like.  Notice that we also transform our test data.

# Convert the dataframes to numpy arrays for Keras
train_x = ee_train.as_matrix()
train_y = ee_y.as_matrix().transpose()
test_x = ee_test.as_matrix()
test_y = ee_test_y.as_matrix().transpose()

model.fit(train_x, train_y, validation_data=(test_x, test_y), nb_epoch=10000, batch_size=576)
train_y.var()  # variance of the target, for judging the loss later

Run Some Code

If you are using a GPU with the Theano backend, you should see a message similar to the one below after importing anything from Keras.

[Image: Theano console message indicating the GPU is in use]

As soon as you call “model.fit” you will see numbers begin flying across the screen.  You will notice the loss and val_loss initially decreasing together.  BUT OH NO, val_loss starts increasing while loss continues to decrease!  This means our model has started overfitting, probably around epoch 5,000 or so.  We would need to simplify the model and handle the categorical features better, but this article is not about perfecting a model, so we will move on.  It does demonstrate the need for a validation set, though, and why having a GPU is so beneficial in this phase: the same experimentation on CPU could easily take 100x longer.

[Image: Keras training output showing loss and val_loss per epoch]

Just to show it can be done, here it is with some very minor fine tuning…

[Image: Keras training output after some minor fine tuning]
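
The screenshot does not spell out the exact tweaks, so take this as a hedged sketch of the usual suspects rather than the changes behind that run: adding Dropout between the dense layers and stopping training once val_loss stops improving.  The dropout rate and patience below are illustrative only.

from keras.models import Sequential
from keras.layers import Activation, Dropout, Dense
from keras.callbacks import EarlyStopping

# Same architecture as before, with Dropout for regularization
reg_model = Sequential()
reg_model.add(Dense(256, input_dim=dataLen))
reg_model.add(Activation('relu'))
reg_model.add(Dropout(0.5))   # randomly zero half the activations during training
reg_model.add(Dense(256))
reg_model.add(Activation('relu'))
reg_model.add(Dropout(0.5))
reg_model.add(Dense(1))
reg_model.add(Activation('linear'))
reg_model.compile(optimizer='rmsprop', loss='mse')

# Stop once val_loss has not improved for 100 epochs
stopper = EarlyStopping(monitor='val_loss', patience=100)
reg_model.fit(train_x, train_y, validation_data=(test_x, test_y),
              nb_epoch=10000, batch_size=576, callbacks=[stopper])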

Our target's variance is 102 (a standard deviation of roughly 10), but our loss is mse (not rmse) of 12; so really, our typical error is sqrt(12) ≈ 3.46, FANTASTIC!  This is actually not half bad.  Ok, we have a good one, let's operationalize this thing.
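
If you would rather compute those numbers than read them off the training log, a quick check (using the arrays defined earlier) looks like this; the exact values will depend on your run:

import numpy as np

# Mean squared error on the held-out test set
test_mse = model.evaluate(test_x, test_y, verbose=0)

# Typical prediction error versus the natural spread of the target
print('rmse:', np.sqrt(test_mse))             # sqrt(12) is about 3.46
print('target std:', np.sqrt(train_y.var()))  # sqrt(102) is about 10.1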

Persisting a Keras Model

So, we need to persist this thing a little bit differently than the way you see on the docs page.  I really wish Keras would persist like this out of the box.  We are going to persist the model architecture and the model weights separately.  The reason we do this is to support loading on either GPU or CPU, and to avoid being reliant on the HDF5 (.h5) storage format.  We just want regular ol’ out-of-the-box stuff with minimal dependencies.

So here we go…

##################
# Persist Model  #
##################
from keras.models import model_from_json
import pickle
model_json = model.to_json()
model_weights = model.get_weights()
pickle.dump(model_json, open('C:\\data\\EE_Keras_Model\\model_json.pkl', 'wb'))
pickle.dump(model_weights, open('C:\\data\\EE_Keras_Model\\model_weights.pkl', 'wb'))

So, just to verify it all works locally before we move to the cloud, let's try loading the persisted files into a “verify” model and ensure it predicts the same values our trained model predicts.

#Test Loading Persisted Model
verify_weights = pickle.load(open('C:\\data\\EE_Keras_Model\\model_weights.pkl', 'rb'))
verify_model_json = pickle.load(open('C:\\data\\EE_Keras_Model\\model_json.pkl', 'rb'))
verify_model = model_from_json(verify_model_json)

verify_model.set_weights(verify_weights)
# The two sets of predictions below should match
verify_model.predict(train_x)
model.predict(train_x)

Excellent!  Everything looks good so far.  Time to put this into Azure ML.

Zipping The Files

To get Theano + Keras + our model into Azure ML, we need to zip it all up.  Now, as always, it's not as simple as sticking it all in a folder and zipping that.  Who knows why, but it is what it is.  You need to select all items in a flat view as shown below, zip those together, and name the resulting zip whatever you want.

[Image: flat view of the Keras, Theano, and model files selected for zipping]

I named the resulting zip file: “keras_theano_ee_model.zip”.  I have uploaded some files so you can more easily do this:

  1. keras + theano zip
  2. keras + theano + model zip
  3. model json + weights zip
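
If you would rather script the zip than build it by hand, here is a rough sketch.  The paths are hypothetical and should be adjusted to your environment; the key point is that the package folders and the two pickle files sit at the root of the archive rather than inside a parent folder.

import os
import zipfile

# Hypothetical locations; adjust to your environment.
site_packages = 'C:\\Anaconda3\\Lib\\site-packages'
model_dir = 'C:\\data\\EE_Keras_Model'
packages = ['keras', 'theano']

with zipfile.ZipFile('keras_theano_ee_model.zip', 'w', zipfile.ZIP_DEFLATED) as zf:
    # Add each package folder so that e.g. keras/__init__.py sits at the zip root.
    for pkg in packages:
        pkg_root = os.path.join(site_packages, pkg)
        for folder, _, files in os.walk(pkg_root):
            for name in files:
                full_path = os.path.join(folder, name)
                zf.write(full_path, os.path.relpath(full_path, site_packages))

    # Add the persisted model files at the zip root as well.
    zf.write(os.path.join(model_dir, 'model_json.pkl'), 'model_json.pkl')
    zf.write(os.path.join(model_dir, 'model_weights.pkl'), 'model_weights.pkl')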

Azure ML – Part 1 – Upload Data

I will not be doing an intro to Azure ML; I am assuming you know how to create a blank experiment.  Begin by creating a new dataset and uploading your files: click the NEW button in the bottom-left corner and select Dataset.

[Image: Azure ML new dataset upload dialog]

Select the zip which contains keras + theano + the operationalized model.

[Image: selecting the local zip file to upload as a dataset]

Repeat this process with the .csv file used to train the model.  This will be our data set for identifying the schema.

Azure ML – Part 2 – Create Experiment

  1. Create a new experiment.
  2. Drag and drop the .csv file into the work space
  3. Drag and drop the .zip file into the work space
  4. Drag and drop an “Execute Python Script” into the work space.  Change the Anaconda/Python version to 4.0/3.5.
  5. Inside the Python script, add the following imports (a minimal smoke-test version of the script is sketched just below):
from keras.models import model_from_json
import pickle
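
Keeping the rest of the default module template intact (the Execute Python Script module generates a stub with an azureml_main entry point), the smoke-test version of the script looks roughly like this; it is a sketch, not the final scoring code:

from keras.models import model_from_json
import pickle

# Azure ML calls this entry point with the connected datasets.
# For the smoke test we just pass the first dataframe straight through.
def azureml_main(dataframe1 = None, dataframe2 = None):
    return dataframe1,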

Your experiment should look like the below.

[Image: experiment with the dataset, the zip bundle, and the Execute Python Script module connected]

You should get a green checkbox.  If you did, CONGRATULATIONS!  You have theano + keras + your model in Azure ML.  This is one of the harder parts.  If not, the biggest thing to check is that you zipped your files EXACTLY as described in the zipping section.

Azure ML – Part 3 – Python Code to Operationalize Model

The code is almost exactly the same as we had in our experiment on our GPU box to load the model and test it.  Here is the code within the Azure ML framework.


import pandas as pd
import numpy as np
from keras.models import model_from_json
import pickle

def azureml_main(dataframe1 = None, dataframe2 = None):
    # Drop the target columns so only the features remain
    dataframe1.drop('Cooling Load', axis = 1, inplace=True)
    dataframe1.drop('Heating Load', axis = 1, inplace=True)
    # Load the architecture and weights extracted from the uploaded zip
    weights = pickle.load(open('./Script Bundle/model_weights.pkl', 'rb'))
    json = pickle.load(open('./Script Bundle/model_json.pkl', 'rb'))
    model = model_from_json(json)
    model.set_weights(weights)
    # Score the incoming rows and return the predictions as a dataframe
    x = dataframe1.as_matrix()
    result = pd.DataFrame(model.predict(x))

    return result,

Notice how we drop the Cooling Load as well as the Heating Load in the Python code.  We could have done this with a drag-and-drop module in the Azure ML workspace, which would have been more ideal as it would reduce our API input parameters.  I’ll leave that to you to figure out :D.

Notice that we read from the ./Script Bundle/ folder; all files extracted from the uploaded zip are dropped there.  If a Python library is dropped there, it is importable immediately, just as we did with Keras.  NumPy and pickle come out of the box.  Let's take a look at the visualized results…

[Image: visualized prediction results from the Execute Python Script module]

Producing a Web Endpoint

Drag and drop a Web Service Input and a Web Service Output into the experiment, link them to the input and output of the Python module as shown below, and push the Deploy Web Service button.  You may need to run the experiment after dropping the web service modules into the workspace before the deploy button becomes available.

[Image: experiment with Web Service Input and Web Service Output modules connected to the Python module]

Choose a name for the service and select your pricing plan.

[Image: web service name and pricing plan dialog]

Try out the EndPoint Request Response Tester!

[Image: Request/Response test page for the deployed endpoint]

I used the dummy values and, of course, got back a meaningless prediction.  You now get all the goodness of Azure ML and the developer portals, etc.
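
To call the endpoint from code, the web service's Consume page gives you the URL, the API key, and generated samples in several languages.  As a hedged sketch of the request-response format (the URL, key, and column names below are placeholders; copy the real ones from the portal):

import json
import urllib.request

# Placeholders: copy the real values from the web service's Consume page.
url = 'https://<region>.services.azureml.net/workspaces/<workspace>/services/<service>/execute?api-version=2.0&details=true'
api_key = '<your api key>'

body = {
    "Inputs": {
        "input1": {
            # Column names must match the schema of the .csv you uploaded.
            "ColumnNames": ["<feature 1>", "<feature 2>", "Heating Load", "Cooling Load"],
            "Values": [["0", "0", "0", "0"]]
        }
    },
    "GlobalParameters": {}
}

headers = {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer ' + api_key,
}
request = urllib.request.Request(url, json.dumps(body).encode('utf-8'), headers)
print(urllib.request.urlopen(request).read().decode('utf-8'))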

Suggested Next Steps

Try standing up API Management in front of the Azure ML API and add Stripe payment processing :D.


4 thoughts on “Operationalize Deep Learning with Azure ML”

  1. Hi David, I just used your brilliant post to operationalize a Keras model trained with the TensorFlow backend, and can confirm that this works just as well with TF as it does with Theano. The only thing I had to do differently was to add the “google” package to the zip, and of course replace Theano with the CPU version of TF.

    I wonder how well this approach would work for live scoring in a production scenario.  The Azure ML backend appears to persist the model in memory at least for a little while after making a prediction, and as long as the web service is kept “warm”, it returns predictions within a few seconds.  I would like to get the best of both worlds and take advantage of Keras/TF/XGBoost for modeling and Azure ML for web service endpoints, and it appears to work very well, but I’m a bit concerned about how well it would scale.  What are your thoughts on this?

    • I’ve been working on operationalization via Service Fabric with a stateless C# ASP.NET Core web gateway and a guest executable Python program which listens to the Azure Service Bus.  Versioning of the model weights is done via a remote lookup table.  I’ll have full docs on this later; the example will be posted here as I am building it out: https://github.com/drcrook1/SecureSystems
