Skip to Content
Technical Articles
Author's profile photo Sumin Lee

AI on mobile: Powering your field force – Part 3 of 5

Building deep learning model within SAP BTP

Dear readers, 

This is a blog series from the Meter Reading project and we attempted to discover the possibilities for ML models in a web browser. This long journey has provided many lessons learned in terms of deep learning and deployment in web browsers. As a beginner in computer vision, I would like to share the lessons I learned from this project. I would like to special thank you for our project members – Aditi Arora, Vriddhi Shetty, Gunter Albrecht, Anoop Mittal, Roopa Prabhu Nagavara,, Naman, Gupta and S, Deepak Chethan.

In the previous AI on mobile: Powering your field force – Part 2 of 5, Anoop Mittal,, we read about the high-level overview on basic design principles.

Then, how can we extract digits from a captured image📷  ?

We can build the deep learning model on the SAP Data Intelligence and write the model to the SAP HANA Cloud. To build the deep learning model for digit recognition, we have five steps as follows. Are you ready to start meter reading journey with SAP BTP?

Understand Scenarios

In this posting, we will build the deep learning model with the meter reading scenario on SAP BTP.

Automatic Meter Reading (AMR) is a technology that automatically collects consumption and status data from a water meter or energy metering devices (gas, electric).

AMR has a two-step approach for object detection and object recognition. Region of interest (ROI) is the detection of digits, and many studies have adopted YOLO, a CNN-based object detection system, to detect digits.

In this scenario, we developed a front-end application with a bounded region that captures a region of interest (ROI), thus we decided to consider only an object recognition approach. For the object recognition, we used a deep learning model, which is part of a broader machine learning model using neural network in the SAP BTP ecosystem.


Check Requirements

python == 3.7.0


opencv-python ==


import os
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.layers import LSTM #python ==3.7 mandatory, (python==3.8.12, not mandatory)
from PIL import ImageFont, ImageOps, Image
import json 
import cv2
import random 
from hdfs import InsecureClient
import io
from io import BytesIO
from PIL import Image
# Converting to TFLite
from tensorflow import lite


Create Training Pipeline

SAP Data Intelligence

We used SAP Data Intelligence for training pipeline and used Python Producer template in SAP Data Intelligence.


Training pipeline


Prepare data

For the data preparation, we create annotated images as captured by mobile application and this is the example images. We stored our image datasets in the data lake in SAP Data Intelligence.


Example of dataset

Split data

In the SAP Data Intelligence, we split the dataset into training and validation sets.

def split_data(images, labels, train_size=0.9, shuffle=True):
    # 1. Get the total size of the dataset
    size = len(images)
    # 2. Make an indices array and shuffle it, if required
    indices = np.arange(size)
    if shuffle:
    # 3. Get the size of training samples
    train_samples = int(size * train_size)
    # 4. Split data into training and validation sets
    x_train, y_train = images[indices[:train_samples]], labels[indices[:train_samples]]
    x_valid, y_valid = images[indices[train_samples:]], labels[indices[train_samples:]]
    return x_train, x_valid, y_train, y_valid



In the SAP Data Intelligence, we get ‘image path’ from the data lake, so we need to and please refer to the example python code. This is different from the local python code!

# to send metrics to the Submit Metrics operator, create a Python dictionary of key-value pairs
client = InsecureClient('http://datalake:50070')


For image preprocessing, we decode and resize the image, convert the dtype and adjust the image brightness, sharpness and quality before training as follows:

def encode_single_sample(img_path, label):
    img_path = tf.get_static_value(img_path).decode('utf-8')
    label = tf.get_static_value(label).decode('utf-8')
    with as reader: 
        # 1. Read image
        img_arr =
        img = np.array(img_arr)
        # 2. Decode or convert to grayscale
        img = tf.image.decode_png(img, channels=3)
        # 3. Convert to float32 in [0, 1] range
        img = tf.image.convert_image_dtype(img, tf.float32)
        # 4. Resize to the desired size
        img = tf.image.resize(img, [img_height, img_width])
        # 4.1. brightness
        img = tf.image.adjust_brightness(img, delta=0.1)
        # 4.2 sharpness 
        img = tf.image.random_contrast(img, 1.2, 1.5)
        # 4.3 increase quality 
        img = tf.image.adjust_jpeg_quality(img, 75)
        # 5. Transpose the image because we want the time
        # dimension to correspond to the width of the image.
        img = tf.transpose(img, perm=[1, 0, 2])
        # 6. Map the characters in label to numbers
        label = char_to_num(tf.strings.unicode_split(label, input_encoding="UTF-8"))
        # 7. Return a dict as our model is expecting two inputs
        return {"image": img, "label": label}


Train dataset

We use the batch training with batch size=16 with for efficient training.

batch_size = 16
train_dataset =['image']), np.array(dataset_train['label'])))
train_dataset = (


Build Model

When capturing the images, recognizing the sequence of digits is important. We use Convolutional Recurrent Neural Network (CRNN) to build deep learning model. Since CRNN models are network architectures specifically designed to recognize sequence-like objects in images, CRNN is suitable for our AMR scenario. CRNN is effective with smaller model and remarkable performance without predefined lexicon, compared to conventional methods such as CNN and RNN based algorithms.



CRNN architecture(reference link to here)


By using the maximum length of the captcha in the meter reading image instead of a predefined dictionary, this approach produces predictions of varying lengths.

# Maximum length of any captcha in the dataset
max_length = max([len(label) for label in train_labels])

# Create characters 
characters = ['0','1', '2', '3', '4', '5', '6', '7', '8', '9']

# Mapping characters to integers
char_to_num = layers.experimental.preprocessing.StringLookup(
    vocabulary=list(characters), num_oov_indices=0, mask_token=None


We define that the size of the input image is 350 X 50 and channel is 3 as a tensor.

Model: "model_v1"
Layer (type)                    Output Shape         Param #     Connected to                     
image (InputLayer)              [(None, 350, 50, 3)] 0                                            
Conv1 (Conv2D)                  (None, 350, 50, 32)  896         image[0][0]                      
pool1 (MaxPooling2D)            (None, 175, 25, 32)  0           Conv1[0][0]                      
Conv2 (Conv2D)                  (None, 175, 25, 64)  18496       pool1[0][0]                      
pool2 (MaxPooling2D)            (None, 87, 12, 64)   0           Conv2[0][0]                      
reshape (Reshape)               (None, 87, 768)      0           pool2[0][0]                      
dense1 (Dense)                  (None, 87, 64)       49216       reshape[0][0]                    
dropout_4 (Dropout)             (None, 87, 64)       0           dense1[0][0]                     
bidirectional_8 (Bidirectional) (None, 87, 256)      197632      dropout_4[0][0]                  
bidirectional_9 (Bidirectional) (None, 87, 128)      164352      bidirectional_8[0][0]            
label (InputLayer)              [(None, None)]       0                                            
dense2 (Dense)                  (None, 87, 11)       1419        bidirectional_9[0][0]            
ctc_loss (CTCLayer)             (None, 87, 11)       0           label[0][0]                      
Total params: 432,011
Trainable params: 432,011
Non-trainable params: 0


Training model

### Get the model
model = build_model()

# Get the prediction model by extracting layers till the output layer
prediction_model = keras.models.Model(
    model.get_layer(name="image").input, model.get_layer(name="dense2").output


Convert to TFLite

We convert to the TFLite to integrate with the PWA app.

#Convert to TFLite 
converter = lite.TFLiteConverter.from_keras_model(prediction_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS
tfmodel = converter.convert()


Save model

This is the step to save artifact to SAP Data Intelligence Data Lake. We send the model blob to the output port and Artifact Producer operator will use this to persist the model and create an artifact ID. The interesting lessons learned was that the TFLite model itself is a byte type, so we don’t need to change it.

metrics_dict = {"kpi1": "1"}
# send the metrics to the output port - Submit Metrics operator will use this to persist the metrics 
api.send("metrics", api.Message(metrics_dict))

# create & send the model blob to the output port
model_blob = tfmodel
#model_blob = bytes(tfmodel, 'utf-8') # error 
api.send("modelBlob", model_blob)


Write to HANA

To read the SAP Data Intelligence artifact, we need pipeline to write to SAP HANA.

Create consumer pipeline in SAP Data Intelligence

To create writing to HANA pipeline, we use Python Consumer template in SAP Data Intelligence.


Writing to HANA pipeline


Lesson’s learned

✅  Check the data type

Checking the data type is really important. Please note the datatypes as each library support different datatypes.

✅  Check the requirements

I’ve been struggling with different types of errors. And I’d like to ask you to check your python version and tensorflow version to minimize errors.

import tensorflow as tf

✅  Collaboration

We tried to find a suitable approach for running an inference TFLite model on the web browser using C++ and WebAssembly backend. Discovering the appropriate architecture with new and innovative technologies is an amazing collaboration, and I have learned to collaborate with other areas.


Blog series

In the AI on mobile: Powering your field force – Part 4 of 5, Aditi Arora will explain how to load a TensorFlow Lite Machine Learning model in C++ and run inference on the web browser with the help of WebAssembly backend.

  1. In Part 1, Vriddhi Shetty describes the context of the meter reading projects
  2. In Part 2, Anoop Mittal, describes the solution backend, particularly how the service was created with CAP & Spring boot
  3. In Part 3Sumin Lee describes the intuition and steps behind creation of the ML model that will be referenced by the backend at run time
  4. In Part 4Aditi Arora describes how she created the web assembly binary and how she loaded the model in browser for inference
  5. In Part 5Naman, Gupta describes how he built the front end that stitches all working parts together, particularly how the PWA app was built with Angular


Shi, B., Bai, X., & Yao, C. (2016). An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition. IEEE transactions on pattern analysis and machine intelligence39(11), 2298-2304.

Laroca, R., Barroso, V., Diniz, M. A., Gonçalves, G. R., Schwartz, W. R., & Menotti, D. (2019). Convolutional neural networks for automatic meter reading. Journal of Electronic Imaging28(1), 013023.

Salomon, G., Laroca, R., & Menotti, D. (2020, July). Deep learning for image-based automatic dial meter reading: Dataset and baselines. In 2020 International Joint Conference on Neural Networks (IJCNN) (pp. 1-8). IEEE.


Assigned Tags

      You must be Logged on to comment or reply to a post.
      Author's profile photo Yu Xuan Lee
      Yu Xuan Lee

      Not sure if you have tried before, but you could use convert your TF model to Tensorflow.js format for web browsers, if your UI framework allows Javascript. Anyway, great work for the post, it was quite interesting to read 👍

      Author's profile photo Sumin Lee
      Sumin Lee
      Blog Post Author

      Hi Yu Xuan! Thanks for your comments! We haven't tried the TF model in Tensorflow.js format, I can try with tensorflow.js for next project. Thanks for sharing your experience!