Frank Schuler

Build your first SAP Data Intelligence ML Scenario with TensorFlow

Inspired by Andreas Forster's excellent blog SAP Data Intelligence: Create your first ML Scenario and encouraged by Karim Mohraz's incredibly helpful blog Train and Deploy a Tensorflow Pipeline in SAP Data Intelligence, in this blog I combine their approaches to demonstrate how to create an SAP Data Intelligence machine learning scenario with TensorFlow that is as plain vanilla as possible.

To start with, I create a Data Workspace and a corresponding Data Collection in the SAP Data Intelligence ML Data Manager and upload Andreas's training data there:

From the Data Manager in my JupyterLab, I navigate to my Data Collection and copy the code to load my training data:

import pandas as pd
import sapdi

# Access the Data Collection in the workspace created above
ws = sapdi.get_workspace(name='architectSAP')
dc = ws.get_datacollection(name='architectSAP')

# Read the semicolon-separated training data into a DataFrame
with dc.open('RunningTimes.csv').get_reader() as reader:
    df = pd.read_csv(reader, sep=';')
df.head()
   ID  HALFMARATHON_MINUTES  MARATHON_MINUTES
0   1                    73               149
1   2                    74               154
2   3                    78               158
3   4                    73               165
4   5                    74               172
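
Before modeling, a quick sanity check on the loaded data does no harm; this is a minimal sketch of my own, using the df from above (the expected 117 rows match the shapes printed further down):

# Quick sanity check: 117 rows, three columns (ID plus the two time columns)
print(df.shape)
print(df.describe())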

On that basis, I build my data set:

x = df[['HALFMARATHON_MINUTES']]
y_true = df[['MARATHON_MINUTES']]
import tensorflow as tf

# Build a tf.data pipeline of (feature, target) pairs, one example per batch
dataset = tf.data.Dataset.from_tensor_slices((x.values, y_true.values))
dataset = dataset.batch(1)
for feat, targ in dataset.take(5):
    print('Features: {}, Target: {}'.format(feat, targ))
print(x.shape)
print(y_true.shape)
Features: [[73]], Target: [[149]]
Features: [[74]], Target: [[154]]
Features: [[78]], Target: [[158]]
Features: [[73]], Target: [[165]]
Features: [[74]], Target: [[172]]
(117, 1)
(117, 1)

Next, I create, compile and train my model. With batch size 1, the 117 rows yield 117 steps per epoch, and the parameter counts in the summary follow from the layer shapes: the hidden layer has 1 × 4 weights plus 4 biases (8), the output layer 4 × 1 weights plus 1 bias (5):

model = tf.keras.Sequential([
    tf.keras.layers.Dense(4, name='hidden', batch_size=1, input_shape=(1,)),
    tf.keras.layers.Dense(1, name='output')
])
model.compile(tf.keras.optimizers.Adam(), tf.keras.losses.MeanSquaredError())
model.summary()
history = model.fit(dataset, epochs=8)
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
hidden (Dense)               (1, 4)                    8         
_________________________________________________________________
output (Dense)               (1, 1)                    5         
=================================================================
Total params: 13
Trainable params: 13
Non-trainable params: 0
_________________________________________________________________
Epoch 1/8
117/117 [==============================] - 0s 2ms/step - loss: 80740.5312
Epoch 2/8
117/117 [==============================] - 0s 3ms/step - loss: 57319.2930
Epoch 3/8
117/117 [==============================] - 0s 3ms/step - loss: 36689.4297
Epoch 4/8
117/117 [==============================] - 0s 3ms/step - loss: 18188.9629
Epoch 5/8
117/117 [==============================] - 0s 3ms/step - loss: 6463.0679
Epoch 6/8
117/117 [==============================] - 0s 3ms/step - loss: 1668.1143
Epoch 7/8
117/117 [==============================] - 0s 3ms/step - loss: 459.1681
Epoch 8/8
117/117 [==============================] - 0s 4ms/step - loss: 292.9158
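
To visualize the convergence the log shows, one can plot the recorded loss; a minimal sketch of my own, using the history object from above:

import matplotlib.pyplot as plot

# Plot the per-epoch MSE loss recorded by model.fit
plot.plot(history.history['loss']);
plot.xlabel("Epoch");
plot.ylabel("MSE loss");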

The results are, of course, very similar to Andreas's: my model's fit (red) versus the least-squares (MSE) optimum (green), but this time leveraging TensorFlow Keras:

import matplotlib.pyplot as plot
import numpy as np

# Least-squares reference line for comparison with the model fit
m, b = np.polyfit(np.squeeze(x), np.squeeze(y_true), 1)
plot.scatter(x, y_true);
plot.plot(x, model.predict(x), color='red');
plot.plot(x, m*x + b, color='green');
plot.xlabel("Actual Minutes Half-Marathon");
plot.ylabel("Actual Minutes Marathon");

I also check that there is no autocorrelation by scatter-plotting the residuals and verifying their randomness:

plot.scatter(x, y_true - model.predict(x), color="orange");
plot.xlabel("Actual Minutes Half-Marathon");
plot.ylabel("Residuals");
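
As a numeric complement to the visual check (my own addition, reusing x, y_true and model from above), the correlation between the predictor and the residuals should be close to zero:

# Residuals should be uncorrelated with the predictor
residuals = np.squeeze(y_true.values - model.predict(x))
print(np.corrcoef(np.squeeze(x.values), residuals)[0, 1])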

To operationalize these results, I add a Python Producer in the SAP Data Intelligence ML Scenario Manager to create, compile, train and store my model. Since I want to stay as plain vanilla as possible, I only add a few lines to the template and stick with its naming conventions:

import pandas as pd
import tensorflow as tf
import io
import json
import h5py

# Example Python script to perform training on input data & generate Metrics & Model Blob
def on_input(data):
    # to send metrics to the Submit Metrics operator, create a Python dictionary of key-value pairs
    df = pd.read_csv(io.StringIO(data), sep=';')
    x = df[['HALFMARATHON_MINUTES']]
    y_true = df[['MARATHON_MINUTES']]
    dataset = tf.data.Dataset.from_tensor_slices((x.values, y_true.values))
    dataset = dataset.batch(1)
    model = tf.keras.Sequential([tf.keras.layers.Dense(4, batch_size=1, input_shape=(1,)), tf.keras.layers.Dense(1)])
    model.compile(tf.keras.optimizers.Adam(), tf.keras.losses.MeanSquaredError())
    history = model.fit(dataset, epochs=8)
    # metrics_dict = {"kpi1": "1"}
    metrics_dict = json.dumps({'loss': str(history.history['loss'][-1])})

    # send the metrics to the output port - Submit Metrics operator will use this to persist the metrics
    api.send("metrics", api.Message(metrics_dict))

    # create & send the model blob to the output port - Artifact Producer operator will use this to persist the model and create an artifact ID
    f = h5py.File('blob', driver='core', backing_store=False)
    model.save(f)
    f.flush()
    # model_blob = bytes("example", 'utf-8')
    model_blob = f.id.get_file_image()
    api.send("modelBlob", model_blob)
    
api.set_port_callback("input", on_input)
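
Before wiring this into the pipeline, it is worth verifying the in-memory HDF5 round trip in JupyterLab; this is a minimal sketch of my own, reusing the model trained above:

import io
import h5py
import tensorflow as tf

# Serialize the trained Keras model into an in-memory HDF5 file,
# exactly as the Python Producer does before sending the blob downstream
f = h5py.File('blob', driver='core', backing_store=False)
model.save(f)
f.flush()
model_blob = f.id.get_file_image()
f.close()

# Reload the model from the raw bytes, as the Python Consumer will do later
restored = tf.keras.models.load_model(h5py.File(io.BytesIO(model_blob), 'r'))
print(restored.predict([[75]]))  # should match model.predict([[75]])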

Since I use the TensorFlow Python libraries, I need to add a Group with a tag that identifies my Docker image:

# Extend the standard SAP ML Python image with TensorFlow
FROM $com.sap.sles.ml.python
RUN python3.6 -m pip --no-cache-dir install --user --upgrade pip
RUN python3.6 -m pip --no-cache-dir install --user tensorflow

I then execute my Python Producer with these parameters after changing the Connection in the Read File Operator:

To obtain my Metrics, Models and Datasets:

To consume these, in the SAP Data Intelligence ML Scenario Manager, I add a Python Consumer. Since I still want to stay as plain vanilla as possible, I only add a few lines to the template to apply my model and obtain my results:

# apply your model
blob = io.BytesIO(model)
f = h5py.File(blob, 'r')
architectSAP = tf.keras.models.load_model(f)
blob.close() 
# obtain your results
prediction = architectSAP.predict([[json.loads(user_data)['half_marathon_minutes']]])
success = True

As well as pass the successful response to the user (prediction[0] is a NumPy array, hence the str conversion):

# apply carried out successfully, send a response to the user
# msg.body = json.dumps({'Results': 'Model applied to input data successfully.'})
msg.body = json.dumps({'marathon_minutes_prediction': str(prediction[0])})

I then deploy my Python Consumer after adding a Group for my Docker image again and passing in my model:

Once my Deployment is successful:

I retrieve my prediction, for example with Postman:
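
The same call works from any HTTP client; here is a minimal sketch in Python, where the deployment URL, tenant, user and password are placeholders for your own deployment (the path and header follow the standard Python Consumer template):

import requests

# POST the input JSON to the deployed pipeline's REST endpoint
response = requests.post(
    '<DEPLOYMENT_URL>/v1/uploadjson/',
    auth=('TENANT\\USER', '<PASSWORD>'),
    headers={'X-Requested-With': 'XMLHttpRequest'},
    json={'half_marathon_minutes': 75},
)
print(response.json())  # e.g. {'marathon_minutes_prediction': '[...]'}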

I tried to keep this blog as plain vanilla as possible, to help you understand the underlying basic concepts. However, there is of course nothing wrong with a bit more sophistication, such as building a custom TensorFlow Operator to make your Graphs more efficient and easier to read:
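
To give a flavour of that, a custom operator's script is plain Python against the same api object used in the templates above; the port names and caching logic in this sketch are my own assumptions, not the operator shown in the screenshot:

import io
import json
import h5py
import tensorflow as tf

model = None

def on_model(model_blob):
    # Cache the TensorFlow model once the blob arrives
    global model
    model = tf.keras.models.load_model(h5py.File(io.BytesIO(model_blob), 'r'))

def on_input(msg):
    # Apply the cached model and emit the prediction
    minutes = json.loads(msg.body)['half_marathon_minutes']
    prediction = model.predict([[minutes]])
    api.send('output', api.Message(json.dumps(
        {'marathon_minutes_prediction': str(prediction[0])})))

api.set_port_callback('model', on_model)
api.set_port_callback('input', on_input)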

      4 Comments
      Andreas Forster

      Hi Frank Schuler, thank you for this introduction to TensorFlow in DI! Keeping steps to the bare bones helps me get into new topics; I will try it out soon 🙂

      Sergio Peña

      Hi Frank. It is a great blog. TensorFlow and Keras are being used a lot today.

      I have some questions:

      What is the DI version? 3.0.2?

      Can this tutorial be done with the free version of DI?

      Thanks and congratulations

      Frank Schuler
      Blog Post Author

      Thank you, Sergio,

      My blog is based on SAP DI 3.0.3, but 3.0.2 should work as well, as should the SAP Data Intelligence, trial edition 3.0.

      Best regards

      Snehit S Shanbhag

      Hello Frank,

      Firstly, thanks a lot for such a detailed blog, and thanks for adding a new flavor with TensorFlow.

      We see the URL for the deployed ML scenario being consumed in Postman in this blog. I was curious:

      1. How can I use the URL in other SAP tools, so that a user can pass the data and see the output? Which tool can we use?
      2. Can we use this URL in an SAC story?
      3. Can we use this URL in a custom widget in SAC AD (for the end user to communicate with the ML scenario)?
      4. Can we build a UI5 app to consume this URL and enable the end user to communicate with the ML scenario?

      I will be interested to know your perspective, and also any other approach to consume this URL (apart from Postman).

      Thanks in advance.


      Best Regards,

      Snehit Shanbhag