Ankit Soni

Integrate Large Data Sets from Flat File to SAP HANA Cloud using SAP IS JDBC Adapter


This blog post details a scenario in which data from flat files produced by a demand and supply planning system needs to be loaded into SAP HANA Cloud DB to build BI reports. The integration process should achieve the following high-level requirements:

  1. A tab-delimited file with close to 2 million rows will be received via SFTP from the source system.
  2. The staging table in SAP HANA Cloud DB should be truncated before loading the data.
  3. The data should be loaded from the Tab-delimited flat file into the HANA table.
  4. After the data is loaded, a stored procedure call should be initiated to take a snapshot of the data into another table that will be used for reporting.
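To make requirement 4 concrete, the snapshot step can be pictured as a small SQLScript procedure along the lines of the sketch below. The procedure, schema, and table names (PLANNING, FORECAST_STAGING, FORECAST_SNAPSHOT, TAKE_FORECAST_SNAPSHOT) are hypothetical placeholders, not the actual objects used in this scenario.

    -- Illustrative sketch only: copy the freshly loaded staging rows into a
    -- snapshot table used for reporting, stamping each row with the load date.
    -- All object names are placeholders.
    CREATE PROCEDURE "PLANNING"."TAKE_FORECAST_SNAPSHOT"()
    LANGUAGE SQLSCRIPT SQL SECURITY INVOKER AS
    BEGIN
        INSERT INTO "PLANNING"."FORECAST_SNAPSHOT"
            SELECT s.*, CURRENT_DATE AS SNAPSHOT_DATE
            FROM "PLANNING"."FORECAST_STAGING" AS s;
    END;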

While designing this integration process, there are a few things to consider. First, since the incoming file contains a high number of rows, it is essential to split them into batches and run them in parallel to prevent the DB insert through the JDBC receiver from timing out or taking too long to respond. Second, the procedure call should be serialized and triggered only once all the data in the table is loaded. Finally, if the process takes longer than 15 minutes, the default lock timeout for the sender SFTP must be increased to avoid a duplicate run.


Prerequisites:

  1. Create a JDBC user in SAP HANA Cloud through SAP HANA Cockpit -> User & Role Management -> User Management.
  2. Grant the required roles for the schema under which the tables exist, and Execute object privileges for the stored procedure, also through SAP HANA Cockpit -> User & Role Management -> Privilege Management.
  3. Create a JDBC data source in SAP Integration Suite -> Manage Security -> JDBC Material, using the JDBC user created above. JDBC URL: jdbc:sap://<SAP HANA Cloud host>:443/?encrypt=true. Refer to JDBC for SAP HANA Cloud | SAP Help Portal (do not pass the databaseName property, otherwise the connection to SAP HANA Cloud fails). See JDBC Connection Properties | SAP Help Portal for the full set of JDBC connection properties.
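If you prefer SQL over the Cockpit UI, the user and privilege setup from steps 1 and 2 can be expressed roughly as in the sketch below; the user, schema, and procedure names are placeholders for illustration, not the actual objects.

    -- Technical user for the JDBC data source (all names are placeholders)
    CREATE USER CPI_JDBC_USER PASSWORD "Init1alPassw0rd" NO FORCE_FIRST_PASSWORD_CHANGE;

    -- DELETE is needed for TRUNCATE, INSERT for the data load
    GRANT SELECT, INSERT, DELETE ON SCHEMA "PLANNING" TO CPI_JDBC_USER;

    -- Execute privilege for the snapshot procedure called after the load
    GRANT EXECUTE ON "PLANNING"."TAKE_FORECAST_SNAPSHOT" TO CPI_JDBC_USER;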

Integration Process Flow:

[Figure: Integration Process]

  1. SFTP sender channel polls for the file and picks it up for processing when available.
  2. Content Modifier: backs up the incoming file name to be retrieved later in case of any exception.
  3. Parallel Multicast: used to archive the incoming file on the internal server in one branch and to load the data to HANA in another branch.
  4. Content Modifier: generates the SQL for truncating the staging table and backs up the incoming dataset in a property variable to be retrieved later.
  5. Request-Reply: calls SAP HANA Cloud using the JDBC receiver adapter to execute the SQL statement constructed in the Content Modifier (see the example after this list).
  6. Content Modifier: sets a dummy payload to create two serialized processes once a success message is received from the JDBC call
  7. Iterating Splitter: configured with Line Break and parallel processing turned OFF to create two serialized sub-processes
  8. Content Modifier: stores the body of the splitter's dummy payload (DBInsert or ProcedureCall) in a property. The body is stored in a property because the Router does not allow the ${in.body} expression in a routing condition, and a condition is needed to route the two serialized processes to different branches.
  9. Router: branches between the DBInsert and ProcedureCall sub-processes based on that property.
  10. Content Modifier: restores the original incoming file data (tab-delimited) from the property variable that was backed up in an earlier step.
  11. General Splitter: splits the rows of the tab-delimited file into groups of 10,000 records, which are executed in parallel using the maximum number of concurrent processes supported (50).
  12. CSV to XML converter: converts data in each split process into XML, which is used to map to insert structure for the database.
  13. Message Mapping: maps the incoming data to the structure required on the receiver side for the data insert. Refer to Batch Payload and Operation | SAP Help Portal (a sketch of this payload follows the list).
  14. JDBC receiver: configured in batch mode to insert multiple records in a single call. Once all records from the concurrent split processes have been inserted into the table, the second branch for the procedure call is started.
  15. Content Modifier in branch 2 (Procedure Call): sets the body to the stored procedure call that needs to be executed after the table load is complete (see the example after this list).
  16. Request-Reply: JDBC receiver adapter sends the procedure call to SAP HANA Cloud.
  17. Send (branch 2 of the Parallel Multicast): archives the incoming file on the SFTP server using an SFTP receiver adapter.
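For reference, the message bodies sent to the JDBC receiver in steps 5 and 16 are plain SQL statements built in the respective Content Modifiers; the schema, table, and procedure names below are the same illustrative placeholders used earlier.

    -- Step 5: truncate the staging table before the load
    TRUNCATE TABLE "PLANNING"."FORECAST_STAGING";

    -- Step 16: trigger the snapshot procedure once the load has finished
    CALL "PLANNING"."TAKE_FORECAST_SNAPSHOT"();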
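The message mapping in steps 13 and 14 produces a payload in the JDBC adapter's XML SQL format, with one access element per row to be inserted. The sketch below only illustrates the shape, with placeholder table and column names; check the Batch Payload and Operation page linked above for the exact structure expected by the adapter.

    <!-- Illustrative batch insert payload; each <access> element is one row -->
    <root>
      <StatementName>
        <dbTableName action="INSERT">
          <table>FORECAST_STAGING</table>
          <access>
            <MATERIAL>100001</MATERIAL>
            <LOCATION>US01</LOCATION>
            <FORECAST_QTY>250</FORECAST_QTY>
          </access>
          <access>
            <MATERIAL>100002</MATERIAL>
            <LOCATION>US01</LOCATION>
            <FORECAST_QTY>75</FORECAST_QTY>
          </access>
        </dbTableName>
      </StatementName>
    </root>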


I opted to use a Splitter and Router combination instead of sequential multicast to ensure that the split process call is completed before the next call, instead of just ensuring that the step is executed first without waiting for it to complete.

In my testing on our SAP IS tenant, the integration flow takes about 5 minutes to finish for an incoming file with approximately 1.7 million rows, which I consider to be decent performance.

Hopefully, this blog will help SAP IS developers & architects in designing integration solutions for loading data to SAP HANA Cloud, even for larger data sets.

For further learning on using the JDBC adapter with MS SQL server, refer to the blog post at

Additionally, another blog post on SAP HANA as a service is available for reference, although please note that the database name is no longer required in the JDBC connection property for SAP HANA Cloud, and the connection will fail if it is specified. This post can be found at

Furthermore, there are alternative approaches to loading data from files to SAP HANA Cloud, including the SAP SDI File adapter. Refer to SDI Configuration: Connect Data Provision Agent (SDI) to Cloud Foundry environment and enable the Adapters (File Adapter, ABAP Adapter and HANA Adapter) | SAP Blogs.

      Gautam Purohit

      That's an excellent write-up Ankit Soni.

      It will definitely be useful for many integration colleagues working with high-volume JDBC scenarios.




      Ankit Soni
      Blog Post Author

      Hi Gautam,

      Thank you for taking the time to share your positive feedback. I'm glad to know that you found the write-up to be useful for colleagues working with high-volume JDBC scenarios.



      Saurabh Kabra

      Hi Ankit,


      Nice blog with a very precise explanation of each step in the iFlow.

      I have a question though... Why did you take a snapshot of the table inserted in the last step (branch 2 --> procedure call) into another table? Why can't you report based on the inserted records in the source table itself in HANA Cloud? Is there a limitation in HANA Cloud, or was it a business requirement not to expose the source table for reporting?




      Ankit Soni
      Blog Post Author

      Hi Saurabh,

      Thanks for the feedback. I don't think there is any restriction on using the same table for reporting. However, as this is mostly demand planning/forecast data, the procedure also provides the flexibility to add additional aggregate measures needed to report how the forecast changes every week.

      Additionally, in our case, there are existing Power BI reports, and this approach gave us the flexibility to switch from Excel-based forecast models to the better forecast data from the new planning tool without impacting the current reports.



      Nicolas Binet

      Hi Ankit

      Thank you for this interesting article.

      One thing I don't fully get:

      What is the advantage of the splitter vs. a sequential multicast in this case? You write:

      "I opted to use a Splitter and Router combination instead of sequential multicast to ensure that the split process call is completed before the next call, instead of just ensuring that the step is executed first without waiting for it to complete."

      But using a sequential multicast, it would also wait for the first branch to complete.

      Am I missing something?


      Thank you


      Ankit Soni
      Blog Post Author

      Hi Nicolas,

      Thanks for taking the time to read and provide feedback. In my experience, I have noticed that sequential multicast doesn't wait for a branch to finish completely if the branch takes a long time.

      I am not completely sure about the reason behind it, but my reckoning is that the Apache Camel Multicast component has a timeout option as well. You can read about it here: Multicast :: Apache Camel

      "Sets a total timeout specified in millis. If the Multicast hasn't been able to send and process all replies within the given timeframe, then the timeout triggers and the Multicast breaks out and continues. Notice if you provide a TimeoutAwareAggregationStrategy then the timeout method is invoked before breaking out."

      Currently, the sequential multicast component in CPI doesn't offer any option to control the timeout settings. Also, I couldn't find a reference to the default timeout for sequential multicast in the CPI documentation. Hence, I opted for the above approach to avoid issues if the data load branch takes even longer as the record count increases further.