Reducing Runtime of CPI-DS Tasks by Avoiding Self ...

Technology Blogs by Members

Explore a vibrant mix of technical expertise, industry insights, and tech buzz in member blogs covering SAP products, technology, and events. Get in the mix!

Hello everyone,

I have been working in CPI-DS team for more than a year. My company is one of the renowned names in oil and gas industry and we use SAP-IBP for Sales and Operations Planning with CPI-DS being used for integrating data from SAP HANA and legacy systems. My CPI-DS team comprises of experienced CPI-DS professionals with average work experience of more than 10 years.

When I joined the team, I got to learn a lot about technical aspects of running CPI-DS but, I found that joins were extensively used even in applications where it can be avoided. The problem with joins is that as the input data increased by X times, the data points in the joins increased by X square times where two joins were used and X cube times where 3 joins were used and so on. Avoiding joins helped me to reduce the runtime of CPI-DS jobs from 5 hours to 5 minutes in one of the applications. I have explained below one of the methods of avoiding joins:

APPLICATION

Input Data consists of the following attributes:

LocationID

ProductID

Unit Cost

Other Attributes

We need to get maximum of Unit Cost for every LocationID and ProductID combination and other attributes should equal to the one for whichever entry is selected (eg Supplier ID)

Join Logic

Logic Using Joins

Here, 'MaxPrice' filters out all the entries except the one with maximum unit cost for any location and product combination. An Inner join is then used in the next transform to get the other attributes from the source for the selected entry.

Logic Without Joins

Logic Without Joins

The 'prdlocgeomktkey' transform orders all the records by the unit cost in descending order so that for a particular location and product combination, the maximum unit cost always comes at the top. The 'genrownum' transform then generates unique numbers for same combinations of location and product. The first entry in a location product combination will be numbered as 1 and the subsequent duplicate entries will be numbered from 2 onwards. As the maximum unit cost for a particular location and product combination will always have the row number generated as 1, it can then be filtered out in the next transform using the logic that row number should be one.

RESULTS

The below image shows how the runtime reduced by 98% by merely changing a small logic.

CONCLUSION

CPI-DS servers are not very proficient in handling huge amounts of data and calculations. Therefore, joins should be kept specific to small datasets and should be avoided as much as possible. Most of the times self joins can be avoided by using multiple functions such as ordering and generating row number by groups.

I would also like to learn from the community how you have eliminated joins or used any kind of innovative logic in transforms to reduce runtime. Please comment if you have done anything similar in your field of work.

SAP Managed Tags:
Cloud Integration,
SAP Cloud Integration for data services,
SAP Integrated Business Planning for Supply Chain

2 Comments

You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.

Comment

Count

Reducing Runtime of CPI-DS Tasks by Avoiding Self Joins

SAP PI for Beginners

ABAP 7.40 Quick Reference

Fiori: technical installation and configuration of one app from A - Z