Skip to Content
Personal Insights
Author's profile photo Stojan Maleschlijski

Embedding HANA machine learning models into SAP Data Warehouse Cloud views

In this blog post we describe a simple approach on how data scientists can expose already trained HANA machine learning (ML) models to SAP Data Warehouse Cloud (SAP DWC). Implementing the proposed method allows business analysts to apply ML models on-demand according to their needs (i.e. for generation of reports, dashboards, or other BI activities) in a  “no-code” manner, using the data modeling capabilities of SAP DWC (i.e. the data builder component).

To demonstrate the idea, we train a simple time series forecasting model and expose it for consumption following the proposed approach. The developed model predicts the future number of notifications originating from a shop floor device, based on a dataset containing 2 years of historical data. The model creation is not reviewed in detail here, so if you want to learn more about how to create ML models, make sure to check the blog post of Andreas Forster and also the post of Yannick Schaper.

Preparation steps

To start with, we assume that the data needed for machine learning has already been shared in a space in the SAP DWC tenant, which we will call MLSPACE. In addition, a database user, MLUSER, has been created. This user has read access to the MLSPACE as well as read/write access to the open SQL schema MLSPACE#MLUSER.

As a next step, we will review the process from accessing the data, via data modeling up to model creation and exposure for consumption. The figure below shows the main steps (in yellow) in the order they are described in the following sections:

Process overview

1. Accessing the data

The data scientist can connect to the underlying HANA system using Jupyter Lab and Python with the hana_ml package and access the open SQL schema of the user:

hana_port = 443 
hana_user = 'MLSPACE#MLUSER'
hana_password = [YOUR_PASSWORD]
hana_address = [YOUR_DWC_HOST]

# Instantiate connection object
conn = dataframe.ConnectionContext(address = hana_address,
                                   port = 443, 
                                   user = hana_user, 
                                   password = hana_password)

# Control connection

Also the input data located in the read-only schema can be previewed using the following code snippet:

df_remote = conn.table('AssetForAnalytics', schema = 'MLSPACE')

Input data

2. Creating the ML model

This step usually involves several iterations to identify the optimal data as well as the features needed for modeling. Furthermore, the model selection and tuning require a substantial amount of work to complete. Since the focus of this blog post is on model sharing, rather than on model creation, these steps are skipped.

So, the data scientist decides to go with Exponential Smoothing from the HANA Predictive Analysis Library (PAL) for the time series forecasting. As an example of a simple data preparation an ID column is generated. Finally the model is created and the fit_predict() method is called to predict the amount of device notifications from the shop floor for the next 20 time intervals.

df_remote = conn.table('AssetForAnalytics', schema = 'MLSPACE')
df_remote_id = df_remote.sort('DateTime').add_id()
from hana_ml.algorithms.pal.tsa.exponential_smoothing 
import AutoExponentialSmoothing
autoexpsmooth = AutoExponentialSmoothing(model_selection=True,
df_res = autoexpsmooth.fit_predict(df_remote_id, key='ID')

3. Deploying ML models in a view

To be able to trigger and apply an ML model from SAP DWC the model needs to be embedded into a view. The process to achieve this involves the following two steps:

3.1 Creating a table function

The first step is to create a table function, which wraps the call of the apply function of the PAL ML model (here Exponential Smoothing, i.e. _SYS_AFL.PAL_AUTO_EXPSMOOTH). One can write this call in SQL from scratch or use a handy functionality of the hana_ml package, which returns the SQL execute statement of the model object. The returned string contains already most of the SQL script required for the table function definition.

DECLARE param_name VARCHAR(5000) ARRAY;
DECLARE string_value VARCHAR(5000) ARRAY;
param_name[1] := N'MODELSELECTION';
int_value[1] := 1;
param_name[2] := N'FORECAST_MODEL_NAME';
string_value[2] := N'TESM';
param_name[3] := N'FORECAST_NUM';
int_value[3] := 20;
param_name[4] := N'CYCLE';
int_value[4] := 365;
param_name[5] := N'TRAINING_RATIO';
double_value[5] := 1.0;
params = UNNEST(:param_name, :int_value, :double_value, :string_value);
in_0 = SELECT "ID", "Notifications" FROM (SELECT "ID", "Notifications", "DateTime" FROM (SELECT "ID", "AssetID", "Notifications", "DateTime" FROM (SELECT CAST(ROW_NUMBER() OVER() AS INTEGER) + 0 AS "ID", * FROM (SELECT * FROM (SELECT * FROM "MLSPACE"."AssetForAnalytics") AS "DT_24" ORDER BY "DateTime" ASC)) AS "DT_26") AS "DT_27") AS "DT_28";
CALL _SYS_AFL.PAL_AUTO_EXPSMOOTH(:in_0, :params, out_0, out_1);

As mentioned, the generated string can be used as it is, except for the last two lines, which create temporary tables to hold the results and statistical properties of the ML model. Those lines need to be substituted with a statement to return the results directly. Finally the table function definition header is added. The following SQL code shows the changes required:

-- Add the function definition
                    LANGUAGE SQLSCRIPT
DECLARE param_name VARCHAR(5000) ARRAY;
-- See previous SQL box for details
CALL _SYS_AFL.PAL_AUTO_EXPSMOOTH(:in_0, :params, out_0, out_1);
-- Add the return statement
RETURN select TIMESTAMP, VALUE from :out_0; 

3.2 Creating a view out of the table function

The second step is to create a view, which queries the table function. Through the view the table function is visible in the SAP DWC space. The view can be created with the following SQL code:

as ( 
select * 

The data scientist can use here directly the following helper function in Python to perform the two steps, thus avoiding the SQL scripting part. It is important to mention that this helper function will only execute without any errors for this type of ML model. Of course the logic of the function can be adapted to support also more complex models (e.g. when they require access to model repository).

def deployExpSmoothToDWCView(sql_cmd, func_name, view_name, 
                             schema_name, hana_address, 
                             hana_user, hana_password, append=False):
    sql_interm = sql_cmd.split('\n')
    del sql_interm[-4:]
    del sql_interm[0]
    sql_interm.append('RETURN select TIMESTAMP, VALUE from :out_0; \n END')
    sql_string = '\n'.join(sql_interm)
    command = 'CREATE'
    if append:
        command= 'ALTER'
    sql_view = ''' %s view "%s"."%s" as ( select * from  "%s"."%s"() ) '''%(command, schema_name, view_name, schema_name, func_name)

    sql_t_func = '''%s FUNCTION "%s"."%s"()
                    LANGUAGE SQLSCRIPT
                    '''%(command, schema_name, func_name, sql_string)
    from hdbcli import dbapi
    conndb = dbapi.connect(
    cursor = conndb.cursor()
    stat_func = cursor.execute(sql_t_func)
    stat_view = cursor.execute(sql_view)
    return stat_func, stat_view

The function needs to be called directly after the execution of the fit_predict() method, otherwise the returned sql execute statement will be an empty string. An example on how to use the function is shown here:

df_res = autoexpsmooth.fit_predict(df_remote_id, key='ID')
sql_cmd = autoexpsmooth.execute_statement
                         hana_address, hana_user, hana_password, True)

4. Accessing the ML model from SAP DWC

If the previous steps were successful, the data scientist has managed to expose the model for other users in SAP DWC. Switching back to SAP DWC, we see that the view APPLY_TIMESERIES_VIEW can now be imported in the MLSPACE from the remote sources. Once added to the SAP DWC space, the view can be used for any further data modeling tasks in the data builder component.


Importing the view into MLSPACE


This blog post demonstrates an approach on how ML models, developed in the underlying HANA database, can be exposed and applied from the SAP Data Warehouse Cloud layer in an ad-hoc manner via a table function and a view. This is a lightweight scenario, which can be useful for example to allow business analysts to incorporate ML models on-demand directly into their data models and analysis in a self-service, no-code manner.

It is important to mention that there are cases, where this approach is not suitable. For example, when the model needs to be made available for use in a database application. In this case the blog post of Frank Gottfried shows the best practices approach. Also the post of Wei Han presents a different approach, where the model is applied in the database directly on the input data and the results of this application are saved in a table, which is imported into the SAP DWC space.

I want to thank Dimitrios KostakisKostas for providing his data engineer’s view during the discussions related to this blog post!

Assigned Tags

      You must be Logged on to comment or reply to a post.
      Author's profile photo Farooq Azam
      Farooq Azam

      Greatly informative and very well written article. Thank you for sharing.

      Author's profile photo Stojan Maleschlijski
      Stojan Maleschlijski
      Blog Post Author

      Thanks a lot, Farooq!