
SAP Data Hub – Running a Java Application in a Data Pipeline

In this blog post, I describe how to run a Java application using a Process Executor in SAP Data Hub Pipelines. This tutorial assumes that you have basic knowledge of the SAP Data Hub Modeler. All code snippets can be found on GitHub.

Some Context…

 

SAP Data Hub Pipelines provide a very flexible technique for creating flow-based applications consisting of reusable and configurable Operators. Among the built-in Operators, there is a category of Processing Operators that execute programs written in different programming languages such as JavaScript, Python, Go and R. However, it does not stop there. Applications written in programming languages that are not supported out of the box can also be executed using the Command Executor or the Process Executor operators:

 

  • The Command Executor executes a given command each time a message arrives on the input port. The arriving message is provided on standard input. The standard output of the command is available on the output port after the command completes. The Command Executor is suitable for lightweight applications such as ping, but not for rich applications with a long start-up time, since a new process is started for every message arriving at the input port of the operator.

 

  • The Process Executor starts a process and feeds the incoming continuous streams to it. The operator finishes when the forked process terminates. Data written to standard output or standard error is available on the respective output ports. The Process Executor is suitable for rich applications such as a Java application, since the process is started only once and each message is then passed via standard input.
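To make the start-up trade-off between the two operators concrete, here is a rough cost model. It is only an illustration: the class ExecutorCostModel, its method names, and all numbers are assumptions for this sketch, not measurements from SAP Data Hub.

```java
// Illustrative cost model only; every number below is an assumption, not a measurement.
class ExecutorCostModel {

    // Command Executor style: one process start for every message.
    static long commandExecutorMillis(long messages, long startupMillis, long perMessageMillis) {
        return messages * (startupMillis + perMessageMillis);
    }

    // Process Executor style: one process start in total, then messages stream in.
    static long processExecutorMillis(long messages, long startupMillis, long perMessageMillis) {
        return startupMillis + messages * perMessageMillis;
    }

    public static void main(String[] args) {
        long messages = 1_000;   // assumed number of messages
        long startup = 300;      // assumed JVM start-up time in ms
        long perMessage = 1;     // assumed processing time per message in ms

        System.out.println("Per-message start: "
                + commandExecutorMillis(messages, startup, perMessage) + " ms");
        System.out.println("Single start:      "
                + processExecutorMillis(messages, startup, perMessage) + " ms");
    }
}
```

With these assumed values, starting a process per message costs 301000 ms in total, while starting it once costs 1300 ms. The larger the start-up time relative to the per-message work, the stronger the case for the Process Executor.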

 

Let’s get started…

 

In this tutorial, I use a Java application as an example. The same concepts, however, can be applied to other types of applications as well. All you need is an executable that continuously reads from standard input and writes to stdout or stderr, plus a Dockerfile that describes the runtime environment the executable requires.

 

1. Create the Java Application

 

In the following, we create a simple Java application that continuously reads strings from stdin. Depending on a mode that is passed as a command-line argument, it either counts the characters in each string or converts the string to uppercase. The result of each operation is written to stdout, or to stderr in case of an error.

src/com/sap/javaapplication/Main.java:

package com.sap.javaapplication;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class Main
{
    public static void main( String[] args )
    {
        InputStreamReader inputStreamReader = new InputStreamReader(System.in);
        BufferedReader bufferedReader = new BufferedReader(inputStreamReader);

        String line = null;
                       
        try {
            
            while ((line = bufferedReader.readLine()) != null ) {
                                
                if ( args.length == 0 ) {
                    System.err.println( "No Mode specified." );
                    continue;
                } 
                if ( line.length() == 0 ) { 
                    System.err.println( "Received empty value." );
                    continue;
                } 
                
                if ( args[0].toUpperCase().equals( "LENGTH" ) ) {
                    System.out.println( "String \"" + line + "\" has length " + line.length() );
                } else if ( args[0].toUpperCase().equals( "UPPER" ) ) {
                    System.out.println( line + " -> " + line.toUpperCase() );                
                } else {
                    System.err.println( "Mode " + args[0] + " is unknown." );
                }
            }
    
        } catch (IOException e) {
            System.err.println(e.toString());
        }
    }
}
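
If you want to check the behavior locally before uploading the JAR, it can help to separate the per-line logic from the I/O loop. The following is a minimal sketch under that assumption: the class LineProcessor and its methods process and run are hypothetical names that are not part of the original example, and the logic is only meant to mirror Main above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintStream;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper for local testing; not part of the original example.
class LineProcessor {

    // Applies the given mode to one input line and returns the result text.
    // Throws IllegalArgumentException for empty input or an unknown mode.
    static String process(String mode, String line) {
        if (line.isEmpty()) {
            throw new IllegalArgumentException("Received empty value.");
        }
        if (mode.equalsIgnoreCase("LENGTH")) {
            return "String \"" + line + "\" has length " + line.length();
        }
        if (mode.equalsIgnoreCase("UPPER")) {
            return line + " -> " + line.toUpperCase();
        }
        throw new IllegalArgumentException("Mode " + mode + " is unknown.");
    }

    // Reads lines until end of input; results go to out, errors go to err.
    static void run(String mode, Reader in, PrintStream out, PrintStream err) {
        BufferedReader reader = new BufferedReader(in);
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                try {
                    out.println(process(mode, line));
                } catch (IllegalArgumentException e) {
                    err.println(e.getMessage());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Simulates three messages arriving on stdin, the second one empty.
        run("upper", new StringReader("sometext\n\nmore\n"), System.out, System.err);
    }
}
```

Because run takes its streams as parameters, the same loop can be driven by System.in in the pipeline and by an in-memory StringReader in a test.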

 

Build a JAR with all dependencies, e.g. with Maven using the project's pom.xml:

$ mvn clean package -f src/JavaApplication/pom.xml

 

2. Create a Dockerfile

 

Create a new Dockerfile by clicking on Repository > Create Docker File:

Define a Dockerfile that provides a Java Runtime Environment, e.g. by using an official Open JDK image as basis:

Dockerfile:

FROM openjdk:11-jre-slim

 

Tag the Dockerfile with “java” and version “11”. The tags are used later to match this Dockerfile when you reference it in an Operator:

 

3. Create a custom Process Executor Operator

 

Create a new custom Operator by clicking on Repository > Create Operator:

Define a Name and Display name of your choice and choose “Process Executor” as Base Operator:

Choose the previously defined Tag name and Tag version to ensure that the Operator is executed in a Docker container with a Java Runtime:

  • Add a new Parameter with name “mode” and default value “length”
  • Provide the command to start the Java application in the cmdLine parameter value and add the placeholder ${mode} as an argument. The placeholder is replaced at runtime by the value specified for the parameter mode:

java -jar /vrep/vflow/operators/examples/JavaProcessExecutor/JavaApplication.jar ${mode}
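
The effect of the ${mode} placeholder can be illustrated with a small sketch. The Modeler performs the substitution itself and its implementation is not shown in this post, so the class PlaceholderDemo and the method expand below are hypothetical and only demonstrate what the resulting command line looks like.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of ${name} substitution; not SAP Data Hub code.
class PlaceholderDemo {

    // Matches placeholders of the form ${name}.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(\\w+)\\}");

    // Replaces each ${name} with its value from params.
    // Placeholders without a matching parameter are left unchanged (an assumption of this sketch).
    static String expand(String template, Map<String, String> params) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuffer result = new StringBuffer();
        while (matcher.find()) {
            String value = params.get(matcher.group(1));
            String replacement = (value != null) ? value : matcher.group();
            matcher.appendReplacement(result, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args) {
        String cmdLine = "java -jar /vrep/vflow/operators/examples/JavaProcessExecutor/JavaApplication.jar ${mode}";
        System.out.println(expand(cmdLine, Map.of("mode", "upper")));
    }
}
```

With mode set to “upper”, the command line ends in the argument upper, which the Java application receives as args[0].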

Scroll up to the Operator Configuration section and click on Auto Propose to generate a configuration for the operator from the available parameters:

Click the edit button next to the generated Operator Configuration:

Click on the mode parameter and configure it as follows:

  • Add the display Title “Mode”
  • Switch the mode to “Required”
  • Set Value Help to “Pre-defined Values”
  • Add two Values “Length” and “Upper” by entering the values and pressing Enter:

Click on OK.

Upload the compiled Java application via Upload Auxiliary File:

Choose the file by clicking on Browse… and click on Send afterward:

The file will be uploaded and placed in the Repository inside the folder of the Operator:

Click on Save to save the Operator:

 

4. Create a Graph using the Process Executor Operator

 

Create a new Graph:

Add and connect the following types of Operators:

  • Data Generator
  • Wiretap
  • StringToStream Converter
  • StreamToString Converter
  • JavaProcessExecutor

Modify the Script of the Data Generator Operator:

getRandomInt = function(min, max) {
    return Math.floor(Math.random() * (max - min + 1)) + min;
};

generateData = function()
{
    var payload = 'sometext';
    
    return payload;
};

$.addTimer("500ms",doTick);

function doTick(ctx) {
    $.output(generateData());
}

 

Run the pipeline and open the UI of the Wiretap operator connected indirectly to the stdout port of the JavaProcessExecutor:

The UI should show the string transformation of the Java application:

That’s it.

You can find the complete code example on GitHub at https://github.com/SAP/datahub-integration-examples/tree/master/JavaProcessExecutor.

 
