Technology Blogs by SAP
Learn how to extend and personalize SAP applications. Follow the SAP technology blog for insights into SAP BTP, ABAP, SAP Analytics Cloud, SAP HANA, and more.
cancel
Showing results for 
Search instead for 
Did you mean: 
Ian_Henry
Product and Topic Expert
Product and Topic Expert
In this blog post I will share my experiences developing a fully connected RESTful web service.

A common request I receive, is asking how to expose SAP data as an API. With SAP there's always multiple ways to achieve things.  Here I have used SAP Data Intelligence and the OpenAPI, operator, connected with live data and processed by an open source engine.

Similar techniques can be applied to the wide variety of sources that SAP Data Intelligence supports, we could be combining data from S4/HANA, SAP BW, SAP ECC, cloud storage, database or data lakes.

Use Case


A number of customers have asked me how best to provide and process SAP data in an interactive way.

Meaning they wanted to connect with SAP data, set parameters and enrich the output with some intelligence.  The output could then be consumed via a web app or a native (iOS or Android) mobile app.

Just to be clear, if the requirement is expose a single source with no further processing then SAP Data Intelligence would not be needed, SAP Gateway could more suitable.

This scenario required data to be retrieved from the SAP system, processed by another operator, such as R, Python, Node, Go or other dockers with C++. The results would then exposed as an API.

Initially I thought how that  Figure 1 represents this concept.



Next step was to colour in the boxes, to understand what I could implement.



 

SAP Data Intelligence Implementation


Initially I came up with a pipeline that reflected Figure 2, you may think that my work was done here.  Actually it was just beginning.



If you look very closely at the arrows in the above diagram you can see they only flow one way.  This is from left to right, meaning the data only moves this direction.  The pipeline would therefore not receive input from the OpenAPI Servlow, any user parameters would therefore be ignored. Incoming requests would only reach the OpenAPI Servlow, and the response would be static.

Previously I had read jens.rannacher  great blog post on Building a RESTful Service with SAP Data Hub Pipelines, which I re-read to help get me started.

Using the example pipeline referenced there, OpenAPI Greeter Demo.  We can see a loop where the request is fed back into the server to create the response.



This example pipeline can be immediately executed, Jen's describes how to call this with Postman and curl.

It's a good start, but clearly it is not reflecting Figure 1, there is no external systems and no processing with Python or R.

  • SAP System = SAP HANA

  • Processing = R Client

  • OpenAPI Server = OpenAPI Servlow

  • Client = Postman

  • Controller = JavaScript


I took this pipeline and added Wiretaps to all ports so that I could understand exactly what is being passed where.  I deliberately numbered the Wiretaps, in my expected sequence, this makes tracing more intuitive. I added the SAP System (SAP HANA Client) to provide an interactive data source. My idea was, the HANA Client will receive a SQL statement that has been dynamically generated by the Controller.



After inspecting the Wiretaps, I learnt how important the port type, message is.  The message allows separation of data (body) from metadata (header).  Notice above how all output ports and connections are pink, meaning they are all messages. This allows the message header information to transfer between operators. The OpenAPI operator requires this header information.

The message header, shows the following.
{"message.commit.token":"bced6483-2c2a-45bf-b351-7d1f2b94f89f",
"message.request.id":"bced6483-2c2a-45bf-b351-7d1f2b94f89f",
"openapi.graph.id":"bee2ab56e9a64d46904e238a264916bc",
"openapi.header.content-type":"application/json",
"openapi.host":"100.64.177.110:8090",
"openapi.method":"POST",
"openapi.remote_addr":"123.123.123.123",
"openapi.request_uri":"/dev/v1/hana",
"openapi.scheme":"https",
"openapi.subgraph.id":"default"}

The OpenAPI Servlow uses the "message.request.id" to uniquely identify each request.

Simple routing of requests is performed by the JavaScript Controller.  I have kept the /ping request for convenience of testing.

The code below receives a JSON input {"Rows":"2"} and in turn returns that number of rows from the source system (SAP HANA).
$.setPortCallback("input",onInput);
$.setPortCallback("hanain",hanaInput);

// ping count
var count = 0;

function isByteArray(data) {
return (typeof data === 'object' && Array.isArray(data)
&& data.length > 0 && typeof data[0] === 'number')
}

function hanaInput(ctx, m) {
// directly send the message from HANA as Response
sendResponse(m, m, null);
}

function sendResponse(s, m, e) {
if ($.output == null) {
// invoke the callback directly
$.sendResponse(s, m, e);
} else {
// let the subsequent operator decide what to do
if (e !== null) {
m.Attributes["message.response.error"] = e;
}
$.output(m);
}
}

function onInput(ctx,s) {
var msg = {};
var inbody = s.Body;
var inattributes = s.Attributes;

// convert the body into string if it is bytes
if (isByteArray(inbody)) {
inbody = String.fromCharCode.apply(null, inbody);
}

// response message
msg.Attributes = {};
for (var key in inattributes) {
// only copy the headers that won't interfere with the recieving operators
if (key.indexOf("openapi.header") < 0 || key === "openapi.header.x-request-key") {
msg.Attributes[key] = inattributes[key];
}
}

// get the request path
var reqpath = inattributes["openapi.request_uri"];
// set header content-type
msg.Attributes["openapi.header.content-type"] = "application/json";

switch (reqpath) {
case "/dev/v1/ping":
msg.Body = {"pong": count++};
sendResponse(s, msg, null);
break;
case "/dev/v1/hana":
data = JSON.parse(inbody);
// check Rows has been received
if ((data["Rows"]) === undefined) {
sendResponse(s, msg, Error("Invalid Input, please specify Rows"));
}
// check if the value Rows is not a number
else if ((isNaN(data["Rows"]) === true)) {
sendResponse(s, msg, Error("Invalid Number "+ data["Rows"].toString()));
}
else {
// build SQL statement
sqlstatement = "SELECT TOP " + data["Rows"].toString() + " * FROM HDB.XRATE_GBP_EUR";
msg.Body = sqlstatement
// send message to sql port for HANA
$.sql(msg);
}
break;
default:
sendResponse(s, msg, Error("Unexpected operation at " + reqpath));
break;
}
}

I tested the pipeline shown in Figure 5 with Postman to verify it behaves as expected.

The next challenge is to pass this data through R, actually this is easy, but retaining the message headers was more challenging.  The R operator needs to be configured to receive a message as input and send a message as output.  To understand how R is working in Data Intelligence I added a number print(paste()) statements to my R code.  The output of these statements can be found in the DI Debug trace to verify what is happening.



Reviewing the R Client Operator I learnt how the native Data Intelligence Go message is interpreted by R.  The R representation a list, with the following fields: Body, Attributes and Encoding.
msg2df <- function(msg) {
print(paste("msg2df Body Names ", names(msg$Body)))
print(paste("msg2df Body Dollar ", msg$Body))
body = msg$Body
df=data.frame(t(sapply(body,c)))
print(paste("msg2df Body DataFrame", df))
return (df)
}

msgheader <- function(msg) {
print(paste("msg2hed Attributes Names ", names(msg$Attributes)))
print(paste("msg2hed Attributes Dollar ", msg$Attributes))
header = list(msg$Attributes)
}

df2msg <- function(df, header) {
attributes = header[[1]]
print(paste("df2msg Attributes ", attributes))
print(paste("df2msg Attributes Names ", names(attributes)))
dfaslist <- setNames(split(df, seq(nrow(df))), rownames(df))
msg <- list(Body=list(dfaslist), Attributes=attributes, Encoding="bar")
}

onInput <- function(msg) {
df <- msg2df(msg)
header <- msgheader(msg)
outMsg <- df2msg(df,header)
print(paste("Output Print ", outMsg))
list(outData=outMsg)
}

api$setPortCallback(c("inData"), c("outData"), "onInput")

With the wiretaps removed, the pipeline looks a lot more simple, but you loose the ability to verify what the internals are doing.


Performance Benchmarking using Command Line


After multiple tests with Postman, I wanted to validate the performance and scalability of the pipeline using other methods.

I tried 3 command line utilities, and all worked as expected.

  • vctl

  • curl

  • ab


gianlucadelorenzo published a great blog on vctl.

vctl is a Data Intelligence System Management command line client, that used to support http calls. Http is officially deprecated, but still works.
 vctl util http /app/pipeline-modeler/openapi/service/dev/v1/hana \
-H Content-Type=application/json \
-d '{"Rows":2}' \
-X POST

 

To use curl we can  export the required command from postman, using code, export.
curl -X POST \
https://<SAP-Data-Intelligence-Cluster.com>/app/pipeline-modeler/openapi/service/dev/v1/hana \
-H 'Authorization: Basic ZGVmYXVsdFx3ZGYwMTpXZWxjb21lMDE=' \
-H 'Content-Type: application/json' \
-H 'X-Requested-With: XMLHttpRequest' \
-H 'cache-control: no-cache' \
-d '{"Rows":"3"}'

Using Apache Bench (ab) has multiple options including number of requests and concurrency. It also provdes statistics and a summary of response times.. Googling for ab and performance was the most difficult part.
ab -n 20 -c4 -s 10 -v3 \
-H 'Authorization: Basic ZGVmYXVsdFx3ZGYwMTpXZWxjb21lMDE=' \
-H 'X-Requested-With: XMLHttpRequest' \
-T 'application/json' \
-H 'cache-control: no-cache' \
-p rows.json \
https://<SAP-Data-Intelligence-Cluster.com>/app/pipeline-modeler/openapi/service/dev/v1/hana

Asking for the verbose output from (-v) displays the complete information including the headers, one of which, Server-Timing gives us the vflow(pipeline execution time).
Server-Timing →vflow;dur=90.69

 

Conclusion


In this blog posted, I described how I processed SAP data via open source R and exposed it via a Restful webservice.

Having connected data is now a common requirement coming from applications and mobile apps.  These data driven apps often require data surfaced via APIs.  SAP Data Intelligence provides a relatively easy way to implement such APIs.
4 Comments