
SAP Data and Analytics Showcase – Develop a Machine Learning Application on SAP BTP

This series of blog posts is written by Frank Gottfried, Christoph Morgen, and Wei Han.

Overview

In this blog post, we'll describe an end-to-end scenario that demonstrates how to develop an SAP Cloud Application Programming Model (CAP) application in SAP Business Application Studio that leverages the machine learning capabilities (the HANA PAL and APL libraries) of SAP HANA Cloud. Additionally, we'd like to showcase how our data and analytics solutions on SAP Business Technology Platform (SAP BTP) enable collaboration between data scientists and app developers.

Following the upcoming blog posts, you'll be able to:

  • As a data scientist: solve your business problem by exploring training/sample data located in a HANA database (e.g., an HDI container), and train and identify the optimal machine learning models in a Jupyter notebook
  • As a data scientist: use the hana-ml library in the Jupyter notebook to automatically generate the related design-time artefacts, such as the ML training procedure and the grant file
  • As an app developer: reuse the machine learning models from the data scientist by importing all design-time artefacts into your CAP application in Business Application Studio and configuring the database (db) module of your CAP project
  • As an app developer: develop a Node.js script for model inference (prediction procedure) in Business Application Studio and deploy the CAP project into a BTP sub-account for consumption

Use Case and Concept

As you may be aware, we have talked a lot about how to combine prediction results from machine learning into your data models and visualise the findings using dashboards or stories, e.g., in SAP Analytics Cloud. However, the use case described in this article demonstrates another essential application of machine learning: consuming and leveraging prediction results directly in a business application, further enhanced with your own business logic.

Dataset

The dataset used in our case comes from the open data website "Tankerkönig" and includes data on gasoline stations in Germany together with the corresponding historical gasoline prices (E5, E10 and Diesel), all in CSV files. We use the stations and prices data from this website for blog-posting and demonstration purposes only.

Scenario

We'd like to conduct massive time-series forecasting using the "Additive Model Time Series Analysis" algorithm from the HANA PAL library to predict, for instance, the Super E5 prices for the next 7 days for gasoline stations in a specific area of Germany (e.g., the Rhein-Neckar area), based on historical price data from the Tankerkönig repository.
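To make this concrete, here is a minimal hana-ml sketch of such a forecast for a single price series. The connection details, table names (GAS_PRICES_HISTORY, GAS_PRICES_FUTURE) and column names (PRICE_TS, E5) are illustrative placeholders, not the names used in our demo.

```python
from hana_ml.dataframe import ConnectionContext
from hana_ml.algorithms.pal.tsa.additive_model_forecast import AdditiveModelForecast

# Connect to SAP HANA Cloud (placeholder credentials).
conn = ConnectionContext(address='<hana-host>', port=443,
                         user='<user>', password='<password>')

# Historical E5 prices for one station: a timestamp column plus the price.
train = conn.table('GAS_PRICES_HISTORY').select('PRICE_TS', 'E5')

# Fit the additive model on the historical series.
amf = AdditiveModelForecast(growth='linear')
amf.fit(data=train)

# Table of the seven future timestamps to forecast (depending on the
# hana-ml version, a placeholder value column may also be required).
future = conn.table('GAS_PRICES_FUTURE').select('PRICE_TS')
forecast = amf.predict(data=future)
print(forecast.collect())
```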

We have prepared the following two scenarios, depending on where the training data sets are stored and where the CAP application is running. The only difference between them is the configuration of the user-provided service on BTP and the artefacts of the database module of the CAP project (Step D in the solution map). You can refer to our sub blog posts in the implementation part for more details.

Sub-Scenario 1: Training Data and CAP Application Located in the Same HDI Container

The first scenario is a simple one, where your training data for machine learning is located in the same HDI container in which your CAP project is running. In this case, you don't need to access data from another schema of the HANA Cloud instance.

 

Sub-Scenario 2: Training Data and CAP Application Located in Different HDI Containers

The second scenario is more complex. Quite often, the training data used for machine learning is located in a different place from where your business application (with its test data) is running. In our case, the training data for machine learning is stored in one HDI container (HDI-A), whereas the CAP application runs in another container (HDI-B). HDI-A and HDI-B belong to the same HANA Cloud instance.

Solution Map and Implementation

Now let's have a look at the technical architecture and implementation steps. No matter where your training data is located, our approach in general follows five steps (Step A to Step E), grouped into two blog posts:

Blog 1 – Machine Learning (Step A – Step C)

The data scientist accesses and explores the training data and identifies the optimal machine learning models in a Jupyter notebook. The data scientist then uses the hana-ml library to generate the design-time artefacts (e.g., HANA procedures, roles, synonyms) and pushes them to a GitHub repository.

Blog 2 – Application Development (Step D – Step E)

The app developer imports the required artefacts into the CAP project and configures the database module. The app developer then implements the logic to call the trained ML model and retrieve the prediction results in the business application.

Useful Concepts

1. SAP HANA Cloud Predictive Analysis Library (PAL) – Python Client API for Machine Learning (hana-ml)

The data scientist develops Python scripts in a Jupyter notebook and uses the "Additive Model Time Series Analysis" algorithm from the hana-ml library for time-series forecasting. To run the HANA procedure successfully in your CAP application, the roles AFL__SYS_AFL_AFLPAL_EXECUTE and AFL__SYS_AFL_AFLPAL_EXECUTE_WITH_GRANT_OPTION need to be granted to your database user. In our sub-scenario 2, for example, the runtime user of the HDI-B container is able to run PAL procedures once the above-mentioned roles have been granted. Please refer to this document for more information.
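In an HDI-based CAP project, such global roles are typically assigned through a grant file. Below is a minimal .hdbgrants sketch; the service name is a placeholder for the user-provided service bound to your project, and whether the object owner also needs the WITH_GRANT_OPTION role depends on your setup.

```json
{
  "<user-provided-service-name>": {
    "object_owner": {
      "global_roles": [
        { "roles": ["AFL__SYS_AFL_AFLPAL_EXECUTE_WITH_GRANT_OPTION"] }
      ]
    },
    "application_user": {
      "global_roles": [
        { "roles": ["AFL__SYS_AFL_AFLPAL_EXECUTE"] }
      ]
    }
  }
}
```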

The module hana_ml.artifacts.generators.hana within the hana-ml library handles the generation of the required HANA design-time artefacts based on the provided base and consumption layer elements. These artefacts can be incorporated into development projects in Business Application Studio and deployed via the HANA Deployment Infrastructure (HDI) into an SAP HANA system. In our case, the design-time files such as grants, roles, procedures, tables, and synonyms are generated automatically.
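As a minimal sketch of this generation step, assuming the connection context conn and a fitted model from the forecasting example above, the call could look as follows; the project name, grant service, and output directory are illustrative placeholders.

```python
from hana_ml.artifacts.generators.hana import HanaGenerator

# Generate HDI design-time artefacts (procedures, grants, roles, synonyms)
# for the models trained through this connection context.
hg = HanaGenerator(project_name='gas_price_forecast',
                   version='1',
                   grant_service='<user-provided-service>',
                   connection_context=conn,
                   outputdir='./hana_ml_artifacts')
output_path = hg.generate_artifacts()
print(output_path)  # folder with design-time files to push to the Git repository
```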

2. Access a Classic Schema from SAP Business Application Studio

We need to configure the database (db) module of our CAP project (running in the HDI-B container) in SAP Business Application Studio and BTP to access the training data located in the HDI-A container. For this purpose, you need to know:

  • how to create a user-provided service to access another database schema
  • how to grant permissions to the technical users in your HDI container to access the database

This tutorial from the SAP Developers website helps a lot. Please check the steps on user-provided services, granting permissions to technical users, and synonyms in the tutorial.
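As a hedged example of the first step, a user-provided service can be created with the Cloud Foundry CLI; the service name, user, password, and schema below are placeholders for the technical user and source schema (HDI-A in our sub-scenario 2), and the "hana" tag lets the HDI deployer pick the service up.

```
cf create-user-provided-service CROSS_CONTAINER_ACCESS -p '{
  "user": "<technical-user>",
  "password": "<password>",
  "schema": "<hdi-a-schema>",
  "tags": ["hana"]
}'
```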

3. Call HANA Procedure from CAP Node.js Application in SAP Business Application Studio

To use machine learning models in your CAP application, you can expose HANA procedures as a CAP service function via OData and implement the logic for this service function in a Node.js script that calls the stored procedure in SAP HANA Cloud.

Example JavaScript code to call the procedure in a CAP project is sketched below, following the pattern of the tutorial Create HANA Stored Procedure and Expose as CAP Service Function (SAP HANA Cloud). You can check the full tutorial for more details.
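The sketch assumes a CAP service with an unbound function forecastPrices and a deployed prediction procedure PREDICT_E5_PRICES; both names are illustrative, not the ones from our demo.

```javascript
// srv/forecast-service.js — a minimal sketch following the tutorial pattern.
const cds = require('@sap/cds')

module.exports = cds.service.impl(function () {
    this.on('forecastPrices', async () => {
        try {
            const db = await cds.connect.to('db')
            const dbClass = require('sap-hdbext-promisfied')
            const hdbext = require('@sap/hdbext')
            // Open a raw HANA connection with the bound HDI container credentials.
            const dbConn = new dbClass(await dbClass.createConnection(db.options.credentials))
            // Load and call the deployed PAL prediction procedure.
            const sp = await dbConn.loadProcedurePromisified(hdbext, null, 'PREDICT_E5_PRICES')
            const output = await dbConn.callProcedurePromisified(sp, [])
            return output.results
        } catch (error) {
            console.error(error)
            return []
        }
    })
})
```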

Conclusion

We hope this blog post gives you a comprehensive overview of how to develop a machine learning application using the capabilities of SAP HANA Cloud, the SAP Cloud Application Programming Model, SAP Business Application Studio, and SAP Business Technology Platform. Thank you for your time, and please stay tuned and curious about our upcoming blog posts!

Finally, I would like to thank my colleagues Frank Gottfried and Christoph Morgen for bringing in this great approach and making this end-to-end demo story happen together!

We highly appreciate all your feedback and comments! In case you have any questions, please do not hesitate to ask in the Q&A area as well.

6 Comments

SAMUELE BARZAGHI

Hi Wei Han,

Thank you for this blog, very useful; we are interested in implementing sub-scenario 1.

We already have a running Jupyter notebook connecting to a classic schema. Do you have an example of how to prepare an HDI container and how to connect to its schema from the notebook?

      Thanks

      Best Regards,
      Samuele

Wei Han (Blog Post Author)

      Hi Samuele,

Thank you for reaching out, and sorry for the slight delay.

We're almost ready to publish our blog post 1, where you'll get inspiration for your question. In my other blog posts, I used the ConnectionContext from hana-ml to connect to an HDI container. You can look at the example below. Hope it helps.
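(The original example image is not available here; the following is a minimal sketch of such a connection, with placeholder credentials taken from a service key of the HDI container.)

```python
from hana_ml.dataframe import ConnectionContext

# Runtime credentials from a service key of the HDI container (placeholders).
conn = ConnectionContext(address='<hdi-container-host>', port=443,
                         user='<runtime-user>', password='<password>',
                         encrypt='true', sslValidateCertificate='false')
print(conn.hana_version())
```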

      Best regards,

      Wei

SAMUELE BARZAGHI

Hi @Wei Han,

      Thank you for your reply.

I think that your example using ConnectionContext is connecting to the schema "CPM_DEMO_202111#PYTHON", which is a classic schema, not an HDI container. Correct?

We are having difficulty granting a user the DDL privileges for the HDI schema; do you have a help link to address this?

We are eagerly awaiting part 1. Thank you again!

      Best Regards,

      Sam

Wei Han (Blog Post Author)

      Hi Sam,

Part 1 has now been published by my colleague Frank. Could you please check if it helps you further?

      Best regards,

      Wei

SAMUELE BARZAGHI

      Hi Wei Han,

Sorry for the delay, we were doing some tests.

In part 1 the connection is to a classic schema, not to an HDI container schema.

We were able to connect to the HDI schema by manually granting the necessary permissions using the GRANT_CONTAINER_SCHEMA_PRIVILEGES procedure:
      https://help.sap.com/docs/HANA_CLOUD_DATABASE/c2cc2e43458d4abda6788049c58143dc/d75182444361461992bcd331f3a16695.html?locale=en-US

However, we opted for a different scenario: we connect to a classic schema that contains a synonym referring to the data table inside the HDI schema.

This way we only have to grant our user the SELECT privilege rather than "CREATE ANY", and all the objects generated by the Python library are created inside the classic schema, keeping the HDI container free of manually created objects.

      Any note about this scenario is welcome.

      Best regards,

      Sam

Peter Baumann

Hi Wei Han!

Interesting showcase. I will compare it with your scenario from 12/2021 with DWC by re-reading these blogs. Do you see that, in general, the same scenario should be possible with DWC and HANA Cloud?

Am I right that "Blog 1 (Upcoming Soon!) – Machine Learning (Step A – Step C)" is still missing? Or has it already been written by someone else? I would be happy to see the full scenario.

      Thanks!

      Peter