Using Zones in ML Functional Service OCR API – sta...

former_member104848 · ‎03-10-2020

This blog post is a continuation of my previous blog post, where I described using zones with the SAP ML Functional Services OCR API to extract data from documents in a meaningful way. As mentioned there, these services are deprecated and will be decommissioned soon. They are now available with the SAP Data Intelligence solution. The new Document Information Extraction service from SAP AI Business Services can be used for specific document types such as invoices or payment advice. For other kinds of templates, we can recreate the solution I described in part 1 with SAP Data Intelligence.

In this blog, I will take you through the details in 3 steps. Let's get started:

1. Use the OCR Functional Service inside Data Intelligence.

2. Call the same parser hosted on my SCP account within the pipeline I created in Data Intelligence.

3. Expose as a service.

There are several excellent blogs that you can refer to in order to understand Data Intelligence and the different features and applications in SAP Data Intelligence. I will get straight into the details of the pipeline that I created using the ML Scenario Manager tile.

Let's take a look at what the final pipeline looks like.

Step 1. OCR Functional Service inside Data Intelligence.

1. Functional Services operator.

Drag and drop the Functional Services operator. This operator gives us access to the pre-trained machine learning functional services that we previously saw as APIs in SAP Machine Learning Foundation Functional Services on SAP Cloud Platform.

Supported services are:

Image Classification

Image Feature Extraction

Image OCR

Similarity Scoring

Text Classifier

Topic Detection

Translation

Of course, as you must know by now, we will use Image OCR. So, let's take a look at the configuration for this operator in the screenshot below.

No points for guessing that the operator has an input port to accept input parameters and an output port for returning results. Both the input and the output port in this case are of type message. A message is akin to a request/response object, with a structure containing a header and a body. We will see how the input is passed to the Functional Service later, in Step 3.4. For a moment, let's assume the document has been passed to the service and we have the XML/hOCR output. As in part 1 of this blog, we will pass this output to the parser along with the zone information. (If you have not seen part 1 of the blog yet, you should do it now 😉 )
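To make the message idea a bit more concrete, here is a minimal Python sketch of a header-plus-body structure. The Message class below is a hypothetical stand-in that I wrote for illustration, not the class that Data Intelligence provides, and the file name and body values are made up.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Message:
    """Hypothetical stand-in for a pipeline message: a body plus header attributes."""
    body: Any = None
    attributes: Dict[str, Any] = field(default_factory=dict)


# A request could carry the document bytes in the body and
# metadata such as the file name in the attributes (the header part).
request = Message(body=b"raw document bytes", attributes={"storage.filename": "coa.png"})

# A response has the same shape; here the attributes are copied over
# and the body holds a (made-up) OCR result.
response = Message(
    body={"predictions": ["hOCR markup would go here"]},
    attributes=dict(request.attributes),
)

print(response.attributes["storage.filename"])
```

The only point of the sketch is the shape: every operator in this pipeline reads and writes objects with these two parts.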

Step 2. Call the python parser hosted on SCP

1. Prepare request object using JS message operator

Let's drag and drop a JS Message operator, which again has input and output ports of type message. Rename it to, let's say, OCR Parser. Connect the output of the Functional Services operator to the input of the JS Message operator.

We are using the JS operator to create the request object which will be passed as input to the parser API. We do have the option of using Node / Python etc as well to write the intermediate business/operation logic. In this case we are using the JS operator.

Open the script editor to edit the script for the JS operator.

I'll add the code snippet here for reference.

$.setPortCallback("input", onInput);

// Returns true if data is a byte array (typed array or plain array of numbers).
function isByteArray(data) {
    switch (Object.prototype.toString.call(data)) {
        case "[object Int8Array]":
        case "[object Uint8Array]":
            return true;
        case "[object Array]":
        case "[object GoArray]":
            return data.length > 0 && typeof data[0] === 'number';
    }
    return false;
}

function onInput(ctx, s) {
    var msg = {};
    // zones of interest in the form of bounding boxes, hardcoded
    var sealantsBbox = {
        "regions": [{
            "id": "product",
            "key": {
                "id": "product_key",
                "boundingBox": {"posLeft": 279, "posTop": 1101, "posRight": 792, "posBottom": 1218}
            },
            "value": {
                "id": "product_value",
                "boundingBox": {"posLeft": 796, "posTop": 1103, "posRight": 1295, "posBottom": 1219}
            }
        }, {
            "id": "customer",
            "key": {
                "id": "customer_key",
                "boundingBox": {"posLeft": 1290, "posTop": 1102, "posRight": 1799, "posBottom": 1226}
            },
            "value": {
                "id": "customer_value",
                "boundingBox": {"posLeft": 1796, "posTop": 1105, "posRight": 2299, "posBottom": 1226}
            }
        }, {
            "id": "productiondate",
            "key": {
                "id": "productiondate_key",
                "boundingBox": {"posLeft": 1295, "posTop": 1333, "posRight": 1795, "posBottom": 1446}
            },
            "value": {
                "id": "productiondate_value",
                "boundingBox": {"posLeft": 1816, "posTop": 1330, "posRight": 2291, "posBottom": 1464}
            }
        }, {
            "id": "springball",
            "key": {
                "id": "springball_key",
                "boundingBox": {"posLeft": 289, "posTop": 1556, "posRight": 689, "posBottom": 1690}
            },
            "value": {
                "id": "springball_value",
                "boundingBox": {"posLeft": 698, "posTop": 1562, "posRight": 1095, "posBottom": 1678}
            }
        }, {
            "id": "penetration",
            "key": {
                "id": "penetration_key",
                "boundingBox": {"posLeft": 290, "posTop": 1678, "posRight": 690, "posBottom": 1792}
            },
            "value": {
                "id": "penetration_value",
                "boundingBox": {"posLeft": 694, "posTop": 1671, "posRight": 1092, "posBottom": 1789}
            }
        }, {
            "id": "cocflash",
            "key": {
                "id": "cocflash_key",
                "boundingBox": {"posLeft": 287, "posTop": 1791, "posRight": 683, "posBottom": 1908}
            },
            "value": {
                "id": "cocflash_value",
                "boundingBox": {"posLeft": 694, "posTop": 1788, "posRight": 1098, "posBottom": 1909}
            }
        }]
    };

    var inbody = s.Body;
    var inattributes = s.Attributes;

    // convert the body into string if it is bytes
    if (isByteArray(inbody)) {
        inbody = String.fromCharCode.apply(null, inbody);
    }

    // copy the incoming attributes and add the ones needed for the parser call
    msg.Attributes = {};
    for (var key in inattributes) {
        msg.Attributes[key] = inattributes[key];
    }
    msg.Attributes["openapi.method"] = "POST";
    msg.Attributes["openapi.consumes"] = "multipart/form-data";
    msg.Attributes["openapi.produces"] = "application/json";
    msg.Attributes["openapi.header_params.X-Requested-With"] = "Fetch";

    // msg.Attributes["http.url"] = 'ocrparser.cfapps.us10.hana.ondemand.com';
    // msg.Attributes["http.method"] = 'POST';

    if (typeof inbody === 'string') {
        // if the body is a string (e.g., a plain text or json string),
        // uppercase the text and indicate it in attribute js.action.
        msg.Body = inbody.toUpperCase();
        msg.Attributes["js.action"] = "toupper";
    } else {
        // if the body is an object (e.g., a json object),
        // forward the body and indicate it in attribute js.action
        console.log("Ignore for " + (typeof inbody));
        msg.Body = {};
        // msg.Attributes["js.action"] = "noop";
        // if(inbody.Body.id !== undefined){
        //      msg.Attributes["id.found"] = "true";
        // }
        if (inbody && inbody.predictions !== undefined) {
            msg.Body.bbox = JSON.stringify(sealantsBbox);
            msg.Body.hocr = JSON.stringify(inbody.predictions[0]);
        }
    }

    // form fields for the multipart post to the parser
    msg.Attributes['openapi.form_params.bbox'] = JSON.stringify(sealantsBbox);
    // only set the hOCR field when the OCR response actually contains predictions;
    // accessing predictions[0] on a string or empty body would throw a TypeError
    if (inbody && inbody.predictions !== undefined) {
        msg.Attributes['openapi.form_params.hocr'] = inbody.predictions[0];
    }

    $.output(msg);
}

With the above code, we prepare a message (the request object) with the necessary headers and the content of a multipart form post: the zone info (currently hardcoded) and the hOCR output from the OCR service.
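Since the same logic could also be written in a Python operator, here is a rough Python sketch of the attribute preparation done by the script above. The function name build_parser_attributes and the shortened zone definition are my own for illustration; the attribute keys are the ones used in the JS snippet. It is plain Python and does not depend on the Data Intelligence api object.

```python
import json
from typing import Any, Dict, Optional


def build_parser_attributes(ocr_body: Any,
                            zones: Dict[str, Any],
                            incoming: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build the message attributes for the parser call (hypothetical helper)."""
    attributes = dict(incoming or {})  # keep whatever attributes came in
    attributes["openapi.method"] = "POST"
    attributes["openapi.consumes"] = "multipart/form-data"
    attributes["openapi.produces"] = "application/json"
    attributes["openapi.header_params.X-Requested-With"] = "Fetch"
    # Zone definitions travel as a JSON string in a form field.
    attributes["openapi.form_params.bbox"] = json.dumps(zones)
    # Only add the hOCR field when the OCR response really has predictions.
    predictions = ocr_body.get("predictions") if isinstance(ocr_body, dict) else None
    if predictions:
        attributes["openapi.form_params.hocr"] = predictions[0]
    return attributes


# Shortened zone definition with a single key box, for illustration only.
zones = {"regions": [{"id": "product", "key": {"id": "product_key", "boundingBox": {
    "posLeft": 279, "posTop": 1101, "posRight": 792, "posBottom": 1218}}}]}

attrs = build_parser_attributes({"predictions": ["hocr-sample"]}, zones)
print(sorted(attrs))
```

Keeping the zones outside the function, as in this sketch, would also make it easier to stop hardcoding them later.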

2. Make the HTTP request to the parser hosted on SCP

Let's drag and drop an OpenAPI Client operator, which has input and output ports of type message. Connect the output of the OCR Parser JS operator to the input of the OpenAPI Client operator.

OpenAPI Client : This operator can be used to invoke services.

Here's the configuration for the Open API Client operator to call the parser

Step 3. Expose as service

1. Convert the response from the parser API call to string

Drag and drop the ToString Converter operator to convert the response from the OCR parser into a string. Recall that the response from the API call in the last step provides the data from the document in the defined zones. Connect the output of the OpenAPI Client operator to the input of the ToString Converter.

2. Prepare the response for the service

Drag and drop the Python3 operator and use it to create a response object of message type.

This operator does not have input and output ports by default. Add an input port of type message for the body, an input port of type message for the headers, and an output port of type message (named body, headers, and output in the script below).

Then open the script editor to add the logic that creates the response.

Here is the actual code snippet

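def on_input(headers, body):
    # Carry over the request ID so the response can be matched to the original request
    attributes = {}
    attributes['message.request.id'] = headers.attributes['message.request.id']
    # The body is the parser result (already converted to a string)
    result = api.Message(body.body, attributes)
    api.logger.info("Final Message: " + str(result))
    api.send("output", result)

# Call on_input once a message is available on both input ports
api.set_port_callback(["headers", "body"], on_input)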
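If you want to check this logic outside of the pipeline, a small stand-alone sketch like the one below can help. The Message class and the build_response function are hypothetical stand-ins that I wrote for this purpose; inside Data Intelligence you would keep using api.Message and api.send as shown above.

```python
class Message:
    """Hypothetical stand-in for api.Message: a body plus an attributes dict."""

    def __init__(self, body=None, attributes=None):
        self.body = body
        self.attributes = attributes or {}


def build_response(headers, body):
    """Same logic as on_input above, but returning the message instead of sending it."""
    attributes = {"message.request.id": headers.attributes["message.request.id"]}
    return Message(body.body, attributes)


# Made-up request headers and parser result for a dry run.
headers = Message(attributes={"message.request.id": "req-1", "openapi.method": "POST"})
body = Message(body='{"product": "sample"}')

result = build_response(headers, body)
print(result.attributes, result.body)
```

Note that only the request ID is carried over; the other request attributes are deliberately left out of the response, just like in the operator script.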

3. Expose the whole pipeline execution as a service

Drag and drop an Open API Servflow operator. This operator can be used to provide services as a REST endpoint. The details of the ports for this operator:

Input port of type message called ret. This defines the response message produced by processing the request. This message should contain the original request ID attribute message.request.id. Note how the Python code in the last step sets message.request.id.

Output port of type message. When a request is made to the end point of the service exposed using this operator, the request is used to generate an output message at this port.

Configure the service endpoint using OpenAPI/Swagger specification

Let's take a closer look at the swagger spec defined for this service

Note that the endpoint is served at a relative path of /getCoAData and takes a document as input
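For readers who cannot see the screenshot, here is a sketch of what a spec of this kind could look like, written as a Python dictionary and printed as JSON. Treat all of it as an assumption: I am assuming the Swagger 2.0 format, a POST operation, and a file parameter called files (derived from the attribute name openapi.file_params.files used later in this post). Only the path /getCoAData is taken from the text, and the title, version, and descriptions are placeholders.

```python
import json

# Hypothetical reconstruction, not the actual spec used in the pipeline.
spec = {
    "swagger": "2.0",
    "info": {"title": "Zone OCR service (placeholder title)", "version": "1.0.0"},
    "paths": {
        "/getCoAData": {
            "post": {
                "consumes": ["multipart/form-data"],
                "produces": ["application/json"],
                "parameters": [
                    {
                        # Assumed parameter name, see the note above.
                        "name": "files",
                        "in": "formData",
                        "type": "file",
                        "required": True,
                        "description": "Document to run through OCR",
                    }
                ],
                "responses": {"200": {"description": "Data extracted from the zones"}},
            }
        }
    },
}

print(json.dumps(spec, indent=2))
```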

4. Pass the document received as input for this service to the Image OCR Functional Service

Recall at Step 1.1, we mentioned we will talk about how the input document is passed to the Image OCR service. Let's take a look at that now.

Drag and drop a 1:2 Multiplexer operator. This operator has one input port and two output ports. The input is forwarded to each of its output ports.

Connect the output port of the OpenAPI Servflow to the input port of the 1:2 Multiplexer operator. And with that the request object of the service is applied to both the output ports.

5. Extract the document received in the request and pass it to the Image OCR Functional Service

Drag and drop the Python3 operator and connect one of the output ports of the multiplexer to the input of the Python operator.

Here is the code executed by the Python operator

import base64

def on_input(data):
    api.logger.info(data.attributes)

    # Pass the uploaded file name on to the OCR service
    attributes = {'storage.filename': data.attributes["openapi.file_params.files#name"]}

    # The uploaded file arrives base64-encoded in the request attributes
    content = data.attributes["openapi.file_params.files"]
    m = api.Message(base64.b64decode(content), attributes)

    api.send("output", m)

api.set_port_callback("request", on_input)
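The decoding step is easy to try on its own. Below is a small sketch that pulls the file name and the decoded bytes out of an attributes dictionary; the function name extract_document and the sample values are made up for illustration, while the attribute keys are the ones used in the operator code above.

```python
import base64


def extract_document(attributes):
    """Return (filename, raw bytes) from the request attributes (hypothetical helper)."""
    filename = attributes["openapi.file_params.files#name"]
    content = base64.b64decode(attributes["openapi.file_params.files"])
    return filename, content


# Made-up request attributes carrying a tiny base64-encoded payload.
sample_attributes = {
    "openapi.file_params.files#name": "coa.png",
    "openapi.file_params.files": base64.b64encode(b"\x89PNG").decode("ascii"),
}

name, data = extract_document(sample_attributes)
print(name, len(data))
```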

6. Provide header attributes to the service exposed using Open API Servflow

Lastly, just connect the second output port of the multiplexer to the headers input port we created in Step 3.2.

Let's deploy this pipeline and view the API specification

Save the pipeline and deploy.

Let's look at the service spec after deployment. Click on the 'Open UI' button highlighted below.

Note the URL is https://<DI URL>/app/pipeline-modeler/openapi/service/<deployment>/swagger.json

Finally, let's execute this service in Postman and check the results

Authentication to this service needs a Bearer token. The token can be obtained by calling the following URL with basic authentication:

https://<host>:<port>/auth/login

The Bearer token can be obtained from the response headers.
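If you prefer a script over Postman, the basic authentication part could look roughly like this in Python. The host di.example.com, the port, and the credentials are placeholders, and I am assuming a plain GET request here. The sketch only builds the request; the actual call is left commented out because it needs a real system.

```python
import base64
import urllib.request


def basic_auth_header(user: str, password: str) -> str:
    """Build the value of the Authorization header for HTTP basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return "Basic " + token


# Placeholder host, port and credentials; replace with your own.
login_url = "https://di.example.com:443/auth/login"
login_request = urllib.request.Request(
    login_url,
    headers={"Authorization": basic_auth_header("user", "pass")},
)

# Sending it would look like this:
# with urllib.request.urlopen(login_request) as response:
#     print(response.headers)  # look for the Bearer token in these headers
```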

2. Call the service using Postman and view the results

https://<DI URL>/app/pipeline-modeler/openapi/service/<deployment>/<path>

https://<DI URL>/app/pipeline-modeler/openapi/service/zoneocr/getCoAData
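The same call can be sketched in Python. The snippet below builds a multipart/form-data body by hand using only the standard library. The form field name files is my assumption, based on the attribute openapi.file_params.files read by the Python operator earlier; the boundary handling, file name, and token value are placeholders.

```python
import uuid


def build_multipart(field, filename, content, boundary=None):
    """Return (body, content_type) for a single-file multipart/form-data upload."""
    boundary = boundary or uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


# Made-up file content; in practice read the document from disk.
payload, content_type = build_multipart("files", "coa.png", b"document bytes")

# Headers for the service call; the token is a placeholder for the Bearer token
# obtained during authentication.
request_headers = {
    "Authorization": "Bearer " + "token-goes-here",
    "Content-Type": content_type,
}
print(request_headers["Content-Type"], len(payload))
```

The payload and request_headers can then be posted to the service URL shown above with any HTTP client.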

Summary

This brings us to the end of the blog. We saw how we consumed the Image OCR Functional service in SAP Data Intelligence, added some custom parsing and exposed a service. Cheers!