Skip to Content

This blog page discusses how to use a PMML model exported from SAP Predictive Analysis or from some PAL functions via a stored procedure generated by afl_wrapper_generator for a PAL function. We will use multiple linear regression (MLR) as the driver for this topic.

Walk Through

It does appear that some PAL functions can export PMML according to SAP HANA Predictive Analysis Library PAL Reference.  Using this reference, we can see that SAP Predictive Analysis is using LRREGRESSION as the PAL function to train MLR against. Then when we export the model as a stored procedure, we get a wrapper stored procedure that calls the FORECASTWITHLR generated stored procedure (ie, maybe called something like _SYS_AFL.I838604_PAS00AMYWGCT0Y_ZE4LISJ2MWMY_MLR_PMML_FORECASTWITHLR). By default, the stored procedure that is generated is set up to deal with a coefficient input table; however, we can see from the aforementioned reference, that there is an alternative option:

/wp-content/uploads/2014/02/pal_input_table_398092.png

From the above picture, we can see that second input table for FORECASTWITHLR can take a PMML model as an input for the coefficient input table. Also, the third input table, would need to be changed to set the MODEL_FORMAT row equal to “1” to tell it to use a PMML model.

/wp-content/uploads/2014/02/pal_mlr_parameter_input_table_398102.png

These differences mean we need to regenerate the stored procedure to the PAL FORECASTWITHLR function unfortunately. Thus, this means using the stored procedure generated by SAP Predictive Analysis will do us no good if we want to use a PMML model. In other words, we must manually run SQL to get to the point of using afl_wrapper_generator.

The following is the SQL that I have used to test this, which is mostly self contained. Make sure your user has appropriate privileges (ie, see Error: Insufficient privilege). One of the main changes to use the PMML is to change the second input table’s second column to a VARCHAR (5000) column (a CLOB is an option as well, though SAP Predictive Analysis has an NCLOB used by the output table type).


/*
 * Create wrapper procedure for PAL's FORECASTWITHLR function.
 */
/* CREATE TABLE TYPE FOR MY INPUT DATA */
DROP TYPE PAL_DATA_INP_T;
CREATE TYPE PAL_DATA_INP_T AS TABLE(
"row_id" INT,
"CONSENSUS_QTY_DBL" DOUBLE
);
/* CREATE TABLE TYPE FOR COEFFICIENT INPUT */
DROP TYPE PAL_COEFFICIENT_INP_T;
CREATE TYPE PAL_COEFFICIENT_INP_T AS TABLE(
"row_id" INT,
"Pmml" VARCHAR (5000)
);

/* CREATE TABLE TYPE FOR THE TABLE THAT WILL CONTAIN THE INPUT PARAMETERS */
DROP TYPE PAL_CONTROL_T;
CREATE TYPE PAL_CONTROL_T AS TABLE(
"Name" VARCHAR (50),
"intArgs" INTEGER,
"doubleArgs" DOUBLE,
"strArgs" VARCHAR (100)
);
/* CREATE TABLE TYPE FOR THE OUTPUT TABLE */
DROP TYPE PAL_RESULT_T;
CREATE TYPE PAL_RESULT_T AS TABLE(
"row_id" INTEGER,
"PredictedValues" DOUBLE
);

/* CREATE TABLE THAT WILL POINT TO THE DIFFERENT TYPES I'M USING TO RUN THE ALGORITHM */
DROP TABLE PDATA;
CREATE COLUMN TABLE PDATA(
"ID" INT,
"TYPENAME" VARCHAR(100),
"DIRECTION" VARCHAR(100) );

/* FILL THE TABLE */
INSERT INTO PDATA VALUES (1, 'I838604.PAL_DATA_INP_T', 'in');
INSERT INTO PDATA VALUES (2, 'I838604.PAL_COEFFICIENT_INP_T', 'in');
INSERT INTO PDATA VALUES (3, 'I838604.PAL_CONTROL_T', 'in');
INSERT INTO PDATA VALUES (4, 'I838604.PAL_RESULT_T', 'out');
/* GENERATE THE Multiple Linear Regression PROCEDURE */
/* Grant SELECT */
GRANT SELECT ON I838604.PDATA TO SYSTEM;
/* Generate PROCEDURE */
call SYSTEM.afl_wrapper_eraser('PAL_MLR_PMML_PROC');
call SYSTEM.afl_wrapper_generator('PAL_MLR_PMML_PROC', 'AFLPAL', 'FORECASTWITHLR', PDATA);
--------------------------------------------------------------------------------------
/*
 * Use the newly created stored procedure
 */
DROP TABLE coef_input_table_1;
CREATE COLUMN TABLE coef_input_table_1("row_id" Integer, "Pmml" VARCHAR (5000));
insert into coef_input_table_1 values (0, null);
update coef_input_table_1 set "Pmml" =
'<PMML version="4.0" xmlns="http://www.dmg.org/PMML-4_0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" >
<Header copyright="SAP" >
<Application name="PAL" version="1.0" />
</Header>
<DataDictionary numberOfFields="2" >
<DataField name="SALES_FORECAST_QTY_DBL" optype="continuous" dataType="double" />
<DataField name="CONSENSUS_QTY_DBL" optype="continuous" dataType="double" />
</DataDictionary>
<RegressionModel modelName="Instance for regression" functionName="regression" algorithmName="LinearRegression" targetFieldName="SALES_FORECAST_QTY_DBL" >
<MiningSchema>
<MiningField name="SALES_FORECAST_QTY_DBL" usageType="predicted" />
<MiningField name="CONSENSUS_QTY_DBL" usageType="active" />
</MiningSchema>
<ModelExplanation>
<PredictiveModelQuality targetField="SALES_FORECAST_QTY_DBL" dataUsage="training" r-squared="0.00788359" >
</PredictiveModelQuality>
</ModelExplanation>
<RegressionTable intercept="1329.88">
<NumericPredictor name="CONSENSUS_QTY_DBL" exponent="1" coefficient="-0.0834675"/>
</RegressionTable>
</RegressionModel>
</PMML>'
;
select * from coef_input_table_1;
drop table control_input_table_1;
CREATE COLUMN TABLE control_input_table_1("Name" VARCHAR (50),
"intArgs" INTEGER,
"doubleArgs" DOUBLE,
"strArgs" VARCHAR (100)
);
insert into control_input_table_1 ("Name", "intArgs") values ('THREAD_NUMBER', 1);
insert into control_input_table_1 ("Name", "intArgs") values ('MODEL_FORMAT', 1); -- set to 1 for PMML
select * from control_input_table_1;

/* CREATE TABLE TYPE FOR MY INPUT DATA */
DROP TYPE PAL_WRAPPER_DATA_INP_T;
CREATE TYPE PAL_WRAPPER_DATA_INP_T AS TABLE(
"CONSENSUS_QTY_DBL" DOUBLE
);
/* CREATE TABLE TYPE FOR THE OUTPUT TABLE */
DROP TYPE PAL_WRAPPER_RESULT_T;
CREATE TYPE PAL_WRAPPER_RESULT_T AS TABLE(
"CONSENSUS_QTY_DBL" DOUBLE,
"PredictedValues" DOUBLE
);

drop PROCEDURE "I838604"."MY_PAL_MLR_PMML_WRAPPER";
CREATE PROCEDURE "I838604"."MY_PAL_MLR_PMML_WRAPPER"(IN data "PAL_WRAPPER_DATA_INP_T", OUT result "PAL_WRAPPER_RESULT_T")
 READS SQL DATA AS BEGIN
 data_inp = CE_PROJECTION(:data,[CE_CALC('rownum()', INTEGER) as "row_id","CONSENSUS_QTY_DBL"]);
 model_tab = CE_COLUMN_TABLE(coef_input_table_1);
 control_tab = CE_COLUMN_TABLE(control_input_table_1);
 call _SYS_AFL.PAL_MLR_PMML_PROC(:data_inp,:model_tab,:control_tab,result_pred);
 result = select "CONSENSUS_QTY_DBL","B"."PredictedValues" from :data_inp as "A" LEFT JOIN :result_pred AS "B" ON "A"."row_id" = "B"."row_id";
END
DROP TABLE my_test_input;
CREATE COLUMN TABLE my_test_input AS ( select * from "SAPSOPG"."pdmpoc::T66_DATA_SORTED" );
DROP TABLE #wrapper_output_table;
CREATE LOCAL TEMPORARY COLUMN TABLE #wrapper_output_table(CONSENSUS_QTY_DBL Double, PredictedValues Double);

-- Call Stored Procedure Wrapper for the 'FORECASTWITHLR' PAL function stored procedure
call "I838604"."MY_PAL_MLR_PMML_WRAPPER"(my_test_input, #wrapper_output_table) with OVERVIEW;
select * from #wrapper_output_table;

In short, with the above SQL script, you can modify certain parts of the PMML model (which is XML) to see the results table change:


update coef_input_table_1 set "Pmml" =
'<PMML version="4.0" xmlns="http://www.dmg.org/PMML-4_0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" >
<Header copyright="SAP" >
<Application name="PAL" version="1.0" />
</Header>
<DataDictionary numberOfFields="2" >
<DataField name="SALES_FORECAST_QTY_DBL" optype="continuous" dataType="double" />
<DataField name="CONSENSUS_QTY_DBL" optype="continuous" dataType="double" />
</DataDictionary>
<RegressionModel modelName="Instance for regression" functionName="regression" algorithmName="LinearRegression" targetFieldName="SALES_FORECAST_QTY_DBL" >
<MiningSchema>
<MiningField name="SALES_FORECAST_QTY_DBL" usageType="predicted" />
<MiningField name="CONSENSUS_QTY_DBL" usageType="active" />
</MiningSchema>
<ModelExplanation>
<PredictiveModelQuality targetField="SALES_FORECAST_QTY_DBL" dataUsage="training" r-squared="0.00788359" >
</PredictiveModelQuality>
</ModelExplanation>
<RegressionTable intercept="1329.88">
<NumericPredictor name="CONSENSUS_QTY_DBL" exponent="1" coefficient="-0.0834675"/>
</RegressionTable>
</RegressionModel>
</PMML>'
;





So, running the above script as is, would generate output like the following:

/wp-content/uploads/2014/02/trained_output_results_398103.png

Changing the coefficient to coefficient=“-0.1834675”, yields:

/wp-content/uploads/2014/02/modified_pmml_output_results_398104.png

Functions with PMML Export Option  → Import Option

  • GEOREGRESSION → FORECASTWITHGEOR
  • LNREGRESSION → FORECASTWITHLNR
  • CREATEDT → PREDICTWITHDT
  • CREATEDTWITHCHAID → PREDICTWITHDT
  • EXPREGRESSION → FORECASTWITHEXPR
  • LOGISTICREGRESSION
  • LRREGRESSION → FORECASTWITHLR
  • POLYNOMIALREGRESSION → FORECASTWITHPOLYNOMIALR
  • APRIORIRULE
  • LITEAPRIORIRULE

Functions without PMML EXport

  • KNN
  • NBCTRAIN
  • NBCPREDICT
  • SINGLESMOOTH
  • DOUBLESMOOTH
  • TRIPLESMOOTH

Conclusions

In conclusion, we can see from this blog page that it is possible to use a PMML model against a PAL function. There are some issues however. We would have to regenerate the wrapper stored procedure to the PAL function, so that it knows to use a PMML model and not a straight coefficient (Ai). So, this means we probably wouldn’t want to export from SAP Predictive Analysis as a stored procedure if we want to use PMML. We can export as PMML and then we would need to have a script, similar to the one included on this page, to run the afl_wrapper_generator against. We then would need to copy and paste the PMML into the script and make any desired changes to the XML (before or after Copy + Paste). This could be a brittle solution. Of course, exporting a stored procedure from SAP PA is probably not the best option if you are expecting to have multiple end users which may or may not be on the same HANA instances. Using the PMML also self contains the settings needed; however, it may be nice to have a GUI to import, validate and update the database with PMML changes.

References

Points of Contact

To report this post you need to login first.

1 Comment

You must be Logged on to comment or reply to a post.

  1. Ingo Peter

    Hi Kevin,

    Is the other way round possible?: I got the pmml model for a decision tree (PAL function CART) as a result in HANA. Now I’d like to import it into Predictive Analysis in order to visualize it.

    Is it possible to import pmml models to PA?

    Thanks,

    Ingo

    (0) 

Leave a Reply