Andreas Forster

Hands-On Tutorial: Score your APL model in stand-alone JavaScript

Do you have business processes that require predictions at high speed, possibly without using SAP HANA at prediction time? For example, you might have IoT data streaming in on Kafka or MQTT, to which you must respond with immediate, tailored predictions in real time. A new feature of the Automated Predictive Library (APL) now makes this possible.

The APL is a highly automated framework to train and score Machine Learning models in SAP HANA. It scales the use of Machine Learning for operational processes: no data extraction or duplication is required, the architecture stays lean and the data remains securely in place.

Over time, the APL has evolved. Earlier versions were able to provide a scoring equation to obtain predictions in different programming languages such as Java, C++ or SQL. Newer versions included various improvements (e.g. Gradient Boosting, SHAP values for global and local explainability, multi-class classification), but models trained with the latest framework could not be exported to such programming languages.

With the latest release of APL this has changed. Models trained with the new Gradient Boosting APL framework can now be scored in pure JavaScript, wherever JavaScript can be executed. This opens up many new deployment possibilities!

This blog was written by Marc DANIAU and Andreas Forster with kudos and thanks to Nai Minh QUACH, who has kindly provided an extremely helpful function to simplify this workflow.

Prerequisites

You must go through and implement the steps in the blog Hands-On Tutorial: Automated Predictive (APL) in SAP HANA Cloud. By the end of it, you have used Python in Jupyter Notebooks to load data to SAP HANA Cloud and you are able to train an APL classification model in SAP HANA. This model predicts whether a person is interested in purchasing a specific investment product from their bank.

To summarise, with the help of the “Python Client API for machine learning algorithms” (often called the “HANA ML wrapper”) the model was trained as follows.

The hana_ml library has been installed. At the time of writing, the latest version is 2.6.20110600.

import hana_ml
print(hana_ml.__version__)

 

You can connect to your SAP HANA system. This blog is using SAP HANA Cloud, but the whole scenario also works with on-premise SAP HANA.

import hana_ml.dataframe as dataframe
conn = dataframe.ConnectionContext(userkey = 'MYHANACLOUD',
                                   encrypt = 'true')
# Send basic SELECT statement and display the result
sql = 'SELECT 12345 FROM DUMMY'
df_remote = conn.sql(sql)
print(df_remote.collect())

 

Training data is stored in the table BANKMARKETING.

df_remote = conn.table(table = 'BANKMARKETING', schema = 'ML').sort('CUSTOMER_ID', desc = False)
df_remote.head(5).collect()

 

Create and configure the GradientBoostingBinaryClassifier object. For details on the configuration please see the previous blog.

from hana_ml.algorithms.apl.gradient_boosting_classification import GradientBoostingBinaryClassifier
gbapl_model = GradientBoostingBinaryClassifier()
col_target = 'PURCHASE'
target_value = 'yes'
col_id = 'CUSTOMER_ID'
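# Use every column except the target and the ID as a predictor,
# without modifying the column list held by the DataFrame itself
col_predictors = [col for col in df_remote.columns if col not in (col_target, col_id)]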
gbapl_model.set_params(eval_metric = 'AUC') # Metric used to evaluate the model performance
gbapl_model.set_params(cutting_strategy = 'random with no test') # Internal splitting strategy
gbapl_model.set_params(other_train_apl_aliases={'APL/VariableAutoSelection': 'true', 
                                                'APL/Interactions': 'true',
                                                'APL/InteractionsMaxKept': 10, 
                                                'APL/TargetKey': target_value})

 

And the model has been trained successfully.

gbapl_model.fit(data = df_remote, 
                key = col_id, 
                features = col_predictors, 
                label = col_target)

 

Trained APL model to JSON

The APL model has been trained. Now export the trained model’s logic into a JSON equation. This equation will be used in the next step to produce new predictions.

The hana_ml wrapper that was used to train the model does not provide a function to quickly obtain the JSON logic. It has to be obtained through SQL syntax. To simplify this step, Nai Minh Quach and Marc DANIAU created the following function, which takes care of everything that’s needed.

def export_apply_code(model, **other_params):
    conn = model.conn_context.connection
    cursor = conn.cursor()

    # -- Header
    try:
        cursor.execute('drop table #FUNC_HEADER')
    except Exception:  # the temporary table did not exist yet
        pass
    cursor.execute('create local temporary table #FUNC_HEADER like "SAP_PA_APL"."sap.pa.apl.base::BASE.T.FUNCTION_HEADER"')

    # -- Export Parameters
    try:
        cursor.execute('drop table #EXPORT_CONFIG')
    except Exception:  # the temporary table did not exist yet
        pass
    cursor.execute('create local temporary table #EXPORT_CONFIG like "SAP_PA_APL"."sap.pa.apl.base::BASE.T.OPERATION_CONFIG_EXTENDED"')
    cursor.execute('insert into #EXPORT_CONFIG values (?, ?, NULL)', ['APL/CodeType', 'JSON'])
    cursor.execute('insert into #EXPORT_CONFIG values (?, ?, NULL)', ['APL/ApplyExtraMode', 'Advanced Apply Settings'])

    # -- Output table
    try:
        cursor.execute('drop table #APPLY_CODE_OUTPUT')
    except Exception:  # the temporary table did not exist yet
        pass
    cursor.execute('create local temporary table #APPLY_CODE_OUTPUT like "SAP_PA_APL"."sap.pa.apl.base::BASE.T.RESULT"')

    # Call APL SQL function
    sql =  """
    DO (
      IN header "SAP_PA_APL"."sap.pa.apl.base::BASE.T.FUNCTION_HEADER" => #FUNC_HEADER,
      IN config "SAP_PA_APL"."sap.pa.apl.base::BASE.T.OPERATION_CONFIG_EXTENDED" => #EXPORT_CONFIG,
      IN model  "SAP_PA_APL"."sap.pa.apl.base::BASE.T.MODEL_BIN_OID" => {model_table} )
    BEGIN
      "SAP_PA_APL"."sap.pa.apl.base::EXPORT_APPLY_CODE"(:header, :model, :config, out_code);
      EXEC 'insert into #APPLY_CODE_OUTPUT  select * from :out_code' USING out_code;
    END;
    """
    model_table_name = model.model_table_.name  # the temp table where the model is saved
    sql = sql.format(model_table=model_table_name)
    cursor.execute(sql)
    
    # Get the code generated
    cursor.execute('select to_char(VALUE) value from  #APPLY_CODE_OUTPUT')
    apply_code = cursor.fetchone()[0]
    return apply_code

 

Just pass the trained model into the function.

model_equation = export_apply_code(model=gbapl_model)

 

Then write the JSON logic to a file.

# The with-statement closes the file even if the write fails
with open("./bank_marketing_model.json", "w") as text_file:
    text_file.write(model_equation)
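
Before handing the file to the JavaScript runtime, it can be worth checking that the exported string really is well-formed JSON. The following is only a minimal sketch: the helper name save_model_json is hypothetical (it is not part of hana_ml or the APL), and the sample string merely stands in for the model_equation returned by export_apply_code.

```python
import json
import tempfile
from pathlib import Path


def save_model_json(equation: str, path) -> int:
    """Validate that `equation` is well-formed JSON, then write it to `path`.

    Hypothetical helper for illustration; not part of hana_ml or the APL.
    Returns the number of characters written.
    """
    json.loads(equation)  # raises json.JSONDecodeError if the export is malformed
    return Path(path).write_text(equation, encoding="utf-8")


# Stand-in for the string returned by export_apply_code(model=gbapl_model)
sample_equation = '{"placeholder": "model definition"}'
target = Path(tempfile.gettempdir()) / "sample_model.json"
written = save_model_json(sample_equation, target)
print(f"Wrote {written} characters to {target}")
```

In the notebook you would presumably call it as save_model_json(model_equation, "./bank_marketing_model.json"). The check only confirms the JSON syntax; it says nothing about whether the JavaScript runtime accepts the model definition.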

 

The file is not meant to be human-readable, but of course you can have a look.

 

By the way, this blog is all about using Python to leverage the Machine Learning in SAP HANA. It is also possible though to obtain the above JSON representation of the model through SQL. See EXPORT_APPLY_CODE in the documentation.

 

JavaScript scoring

The above JSON logic is designed to be used by a JavaScript scoring runtime. Follow these two steps to get everything that is needed.

  1. Download the APL’s JavaScript runtime, which ships with the APL download beginning from version 2018.2. For this blog we downloaded APL version 2101 for SAP HANA 2.0 SPS03 and beyond (Linux on x86_64). In the samples folder of that download you find the files autoRuntime.js and dateCoder.js. In the File Browser of your JupyterLab, create a folder called “lib” and copy these 2 files in there.
  2. Install the JavaScript package “amdefine” as explained in the readme.txt that is located in the same folder with the two .js files.

 

Install the package from the Notebook.

!npm install amdefine

 

Now copy and save this JavaScript code into a file called score_json_light.js. This file is very much simplified. It’s loading both the APL’s runtime as well as the JSON file that represents the trained model. A very simple new observation is then scored. Notice that it is only passing 3 predictors, even though the model contains additional predictor variables. The scoring equation is robust though and can deal with such missing values.

var runtime = require("./lib/autoRuntime");

// Load the model in JSON format
const fs = require("fs");
let rawdata = fs.readFileSync("./bank_marketing_model.json");
let modelDefinition = JSON.parse(rawdata);

// Create scoring engine based on the model's JSON format
const autoEngine = runtime.createEngine(modelDefinition);

// New observation to score
const row = [
  {
    variable: "AGE",
    value: 40,
  },
  {
    variable: "JOB",
    value: "entrepreneur",
  },
  {
    variable: "MARITAL",
    value: "married",
  },
];

// Score new observation
var t0 = new Date().getTime(); // timer start
const prediction = autoEngine.getScore(row);
var t1 = new Date().getTime(); // timer end
console.log("Prediction: " + prediction["proba"]);
console.log("Inference took " + (t1 - t0) + " milliseconds.");

 

And execute the file.

!node score_json_light.js

 

The JavaScript needed just 22 milliseconds to create the prediction.

 

Even though the code was executed within a Jupyter Notebook, the prediction was carried out purely in JavaScript, without any need to be online or connected to any other infrastructure.
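
The observation passed to getScore in score_json_light.js is a list of variable/value pairs. If your new data starts out as a plain record, for example a Python dictionary, that structure can be generated rather than typed by hand. This is a minimal sketch; the helper name to_observation and the EDUCATION entry are hypothetical and only serve as illustration. Leaving out missing values mirrors the example above, where absent predictors are simply not passed; how the runtime treats an explicit null has not been verified here.

```python
import json


def to_observation(record: dict) -> list:
    """Convert {"AGE": 40, ...} into [{"variable": "AGE", "value": 40}, ...].

    Entries whose value is None are left out, so that missing predictors
    are simply not passed, as in the tutorial's three-predictor example.
    """
    return [
        {"variable": name, "value": value}
        for name, value in record.items()
        if value is not None
    ]


# EDUCATION is a hypothetical predictor with a missing value
record = {"AGE": 40, "JOB": "entrepreneur", "MARITAL": "married", "EDUCATION": None}
row = to_observation(record)
print(json.dumps(row, indent=2))
```

The printed JSON has the same shape as the row array in the JavaScript file, so it could be written to a file or sent over a message queue for the scoring script to read.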

Try it out in your preferred JavaScript environment. Here is an example to score new predictions in Node.js. Copy the files bank_marketing_model.json, score_json_light.js and the lib folder into an empty folder anywhere on your computer.

Install the amdefine package.

npm install amdefine

 

And obtain the prediction.

node score_json_light.js

 

In pure Node.js, without the Jupyter Notebook framework, the same prediction took only 17 milliseconds.
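
If you want to compare several runs, or pick up the prediction in Python, the two lines printed by score_json_light.js can be parsed. The sketch below assumes the script prints exactly the "Prediction: ..." and "Inference took ... milliseconds." lines shown earlier; the function names are hypothetical, and the sample text uses made-up numbers so that the example runs without Node.js installed.

```python
import re
import subprocess


def parse_scoring_output(stdout: str) -> dict:
    """Extract probability and timing from the text printed by score_json_light.js."""
    proba = re.search(r"Prediction: ([-+0-9.eE]+)", stdout)
    millis = re.search(r"Inference took (\d+) milliseconds", stdout)
    return {
        "proba": float(proba.group(1)) if proba else None,
        "milliseconds": int(millis.group(1)) if millis else None,
    }


def run_scoring(script: str = "score_json_light.js") -> dict:
    """Run the script with Node.js and parse its output (requires node on the PATH)."""
    completed = subprocess.run(
        ["node", script], capture_output=True, text=True, check=True
    )
    return parse_scoring_output(completed.stdout)


# Made-up output text, used here instead of a real Node.js run
example = parse_scoring_output("Prediction: 0.25\nInference took 17 milliseconds.\n")
print(example)
```

With Node.js, the model file and the lib folder in place, run_scoring() would start the script once and return the parsed values. Note that each call starts a new Node.js process, so the elapsed time seen from Python includes process start-up and model loading, not just the inference time reported by the script.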

Next steps

You might already have ideas for how this JavaScript scoring can be used in a business process. Stojan Maleschlijski gives an excellent and very comprehensive example in his blog about a project that used this concept to improve B2C marketing communications: MLOps in practice: Applying and updating Machine Learning models in real-time and at scale

At TechEd 2020, Stojan Maleschlijski and Andreas Forster presented and demoed this project, showing how the JavaScript scoring can be embedded with SAP Data Intelligence and Kafka to personalise a website for more targeted marketing. The recording is available on YouTube.

Just let us know if you have any questions!
Marc DANIAU, Andreas Forster

      1 Comment
      Dirk Kemper

      Hi Andreas, Marc,

      thank you for providing this great alternative to in-database model apply using APL. This was exactly the solution to a problem I was facing on a project recently. The Javascript engine brought down model apply time from several hours (for 3.5 million apply actions) to several minutes. Multithreading within NodeJS is also very helpful in this regard.

      It may be worthwhile to point out that EXPORT_APPLY_CODE can also be called from SQLScript directly if you are not using the hana_ml Python interface.

      Regards, Dirk