Push Data from SAP Cloud Foundry to Data Hub or Data Intelligence

Introduction:

In my previous blog, I covered exposing on-premise data to SAP Cloud Foundry using a Python-based application built with the Flask library and no authentication.

Our primary goal is to expose data that lives in an on-premise environment to the open internet. The consumer can be a Data Hub instance hosted on a Kubernetes service at a cloud provider that is not on the same network as the on-premise system, or a Data Intelligence suite hosted on an SCP CF instance.

In case you haven’t read my previous blog, please have a look at it first, since it is the basis for our next step.

In case you already have data on SCP, let’s move on to our next step.

Proxy APP

If you recall from our previous blog, the app hosted on SAP Cloud Foundry was a Python app that serves our raw data. If you check the details of this app, you will see that it is served over the HTTPS protocol.

Please note that, unless explicitly specified otherwise, the port for an HTTPS application is 443.

Next comes the host name for this application. The host name is the whole URL of the app without the leading https:// and the trailing /.

I know, too much text is boring and not self-explanatory. Let’s take a look at an example.

The app from my previous blog looks like the screenshot below when I navigate to my space -> Applications -> application name (no auth in this case).

 

As you can see, this app has a URL like the one below:

https://noauth-fluent-leopard.cfapps.eu10.hana.ondemand.com/

 

Of course, as the name suggests, this is an application with no authorization, and basically anyone who has access to SAP CF and the internet can reach it. So by the time you read this line, or by the time this thought crosses your mind, the app will have lived its life and will no longer be accessible on the open internet 😉.

But to give an example of host and port from this URL:

Host would be: noauth-fluent-leopard.cfapps.eu10.hana.ondemand.com

Port would be: 443

And that is all we need to push this data to Data Hub / Data Intelligence.
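If you prefer not to strip the URL by hand, the same host and port can be derived with Python’s standard library. This is only a small sketch; the helper name host_and_port is my own and not part of any SAP tooling.

```python
from typing import Tuple
from urllib.parse import urlsplit


def host_and_port(url: str) -> Tuple[str, int]:
    """Split an app URL into the host name and port an HTTP client needs."""
    parts = urlsplit(url)
    # Fall back to the scheme's default port when the URL does not name one.
    default_ports = {"https": 443, "http": 80}
    port = parts.port or default_ports[parts.scheme]
    return parts.hostname, port


host, port = host_and_port(
    "https://noauth-fluent-leopard.cfapps.eu10.hana.ondemand.com/"
)
print(host, port)
# prints: noauth-fluent-leopard.cfapps.eu10.hana.ondemand.com 443
```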

Graph in Data Hub

The graph needs an operator called ‘HTTP Client’. As per the official documentation of this operator, it is: An HTTP client operator capable of sending arbitrary HTTP requests, polling a URL and POSTing JSON data.

In our case, since we are trying to pull the data, this will be a GET request. The configuration of this GET request, as per our example above, would look like below:

Please note that this is a GET request, and the polling period has to be adjusted to the needs and design of the proxy app.
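To make the idea of polling with a GET request concrete, here is a minimal sketch in plain Python. It is not how the HTTP Client operator is implemented; the function names http_get and poll, and the parameters period_s and max_polls, are assumptions of mine for illustration only.

```python
import time
import urllib.request
from typing import Callable, Iterator, Optional


def http_get(url: str, timeout_s: float = 30.0) -> bytes:
    """Send one GET request and return the raw response body."""
    with urllib.request.urlopen(url, timeout=timeout_s) as response:
        return response.read()


def poll(
    url: str,
    period_s: float,
    max_polls: Optional[int] = None,
    fetch: Callable[[str], bytes] = http_get,
) -> Iterator[bytes]:
    """Yield one response body per poll, waiting `period_s` seconds in between.

    With max_polls=None the loop runs until the caller stops iterating.
    """
    count = 0
    while max_polls is None or count < max_polls:
        yield fetch(url)
        count += 1
        # Only sleep when another request is going to follow.
        if max_polls is None or count < max_polls:
            time.sleep(period_s)
```

Calling poll(url, period_s=60, max_polls=1) sends exactly one request and then stops, while leaving max_polls unset keeps requesting the URL once per period. The fetch parameter is there so that the request function can be swapped out, for example in a test.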

The next step of the graph could be anything, but for now let’s write the output to a sample .txt file and inspect it.

A sample graph should look like below:

 

Please note that in this graph, since we pull data from the app, the data has to be converted to the type message, because that is the input type of the ‘Write File’ operator. This is already taken care of by a predefined operator called ‘ToMessage Converter’, which takes an input of type any and gives an output of type message.

The Write File operator then dumps the whole output to a dedicated file as per our needs. After the graph has executed, the file can be found under System Management -> Files tab, at the location given in the Write File operator.
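As a rough analogy for these two steps, the sketch below wraps an arbitrary value in a message and then writes its body to a file. This is plain Python and not the Data Hub API: the Message class, to_message and write_file are hypothetical stand-ins I made up to illustrate the conversion and the write.

```python
import json
import tempfile
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Dict


@dataclass
class Message:
    """Hypothetical stand-in for the message type: a body plus attributes."""
    body: Any
    attributes: Dict[str, Any] = field(default_factory=dict)


def to_message(value: Any) -> Message:
    """Wrap a value of any type in a Message; existing messages pass through."""
    if isinstance(value, Message):
        return value
    return Message(body=value)


def write_file(message: Message, path: Path) -> None:
    """Dump the message body to `path`, overwriting any existing file."""
    body = message.body
    if isinstance(body, bytes):
        path.write_bytes(body)
    elif isinstance(body, str):
        path.write_text(body, encoding="utf-8")
    else:
        # Anything else is serialized as JSON text.
        path.write_text(json.dumps(body), encoding="utf-8")


# Example: write a made-up payload to noauth.txt in a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    target = Path(folder) / "noauth.txt"
    write_file(to_message(b'{"example": "payload"}'), target)
    print(target.read_text(encoding="utf-8"))
```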

Now let’s look at the Write File operator:

 

Output

Now let’s save and execute this graph.

 

As you can see from the output, the graph was executed and the file was saved. Now let’s go to the file location and check for a file named noauth.txt.

It should look somewhat like the above image. As you may notice, this is the output from our application, which I discussed at length in my previous blog.

The data carries its exact metadata and meets our end goal: data from a source system in an on-premise environment is pushed through a cloud connector to an application hosted on SAP Cloud Foundry, and from there it is pushed to Data Hub / DI.

From an end-to-end perspective this is mission accomplished, but there are still several gaps we need to fix:

  • The application still doesn’t have authorization.
  • The data format doesn’t make much sense yet: the JSON response is still a key-to-value reference. In case this data needs to be pushed to some other target, it still requires cleansing.
  • The scalability of the proxy app is not yet known, and the limit on the number of records isn’t clear.
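To give a feel for what the cleansing in the second point could involve, here is a small sketch that flattens a key-to-value JSON response into CSV rows. The exact shape of the response is not shown in this post, so the column-oriented layout below ({"column": {"row key": value}}) is purely an assumption, and the function name column_json_to_csv is hypothetical.

```python
import csv
import io
import json


def column_json_to_csv(payload: str) -> str:
    """Turn {"column": {"row key": value}} JSON into CSV text with a header."""
    columns = json.loads(payload)
    names = list(columns)

    # Collect the row keys in first-seen order across all columns.
    row_keys = []
    for values in columns.values():
        for key in values:
            if key not in row_keys:
                row_keys.append(key)

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(names)
    for key in row_keys:
        # A missing value becomes an empty cell.
        writer.writerow([columns[name].get(key, "") for name in names])
    return buffer.getvalue()


# Made-up sample in the assumed shape, not real output from the app.
sample = '{"id": {"0": 1, "1": 2}, "name": {"0": "a", "1": "b"}}'
print(column_json_to_csv(sample))
```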

I will cover all these topics in my next blog to make this use case more meaningful and complete.

Thanks for reading, and I hope you have a great rest of the day ahead.

2 Comments
  • Hi,

     

    Great write-up. This is very helpful. I would like to know: is it possible to make just a single GET request instead of polling the URL?

    I would like to call my service endpoint, retrieve the response, store it in a file and stop the flow.

     

    Regards,

    Boudhayan Dev

  • Hi Boudhayan ,

     

    You can increase the polling period so that a single GET request has time to pull your whole data set; how long that takes depends on the number of records. In case you are looking for a more generic solution, you can create a Python operator that uses the Requests module to pull the data with a GET request tailored to your specific needs. Hope this helps.

     

    Best Regards,

    Pranav