Skip to Content
Personal Insights
Author's profile photo Dmitrii Sharshatkin

Optimize ABAP CDS Performance – Double-Query Approach

Hello all !

Quite some time ago, I found a nice way to overcome some performance and logical issues when using ABAP CDS views.

Scenario Description

I was using standard SADL framework to request the data for my OData services from ABAP CDS views, but I was constantly running into either performance or functional issues.

Regarding performance, I guess, it’s clear – all was slow, even GET_ENTITY.

Regarding functional ones, it’s a bit more complicated. But I will try to explain. OData v2 (especially in SAP’s realization) does not really support complex queries through dependent objects.

For instance you have a document header and then a document item with product, a partner, dates etc, and all with 1:N cardinality to the header. Now imagine you need to search trough those products + partners + dates + some header fields simultaneously. If you never programmed any OData, just believe me, it won’t be so easy, if possible at all…

Idea

And so, I came to the idea to split the queries into two:

  • First, we read distinct CDS Entity Key, based on the provided search criteria;
  • Second, we read the required CDS data, based on the selected keys.

As you may notice, this will allow us to use two different CDS views. One is Search View and another one is Data View. Data View represents CDS Entity and by definition has one line per Entity. Search View is needed to get keys, so that it may have whatever you want, any fields, from any entities (product/partner/dates) and must not be 1:1 to the entity.

Note: this approach may work, but also may not. Or even sometimes it will be working (fast), sometimes not. It depends…

 

My Example

Let’s come to one of my examples….

I have a CDS view, containing 33 very big tables.

My CDS View Definition

After adding date information (marked red box), the performance got just terrible.

So, I’ve changed the key of the entity from Object ID (NUM10) to Object GUID (RAW16), which is then used everywhere as JOIN condition, but also a part of primary database indexes.

Then I had to split the query into two as described above.

 

How Was It Originally

So, imagine you have your SADL implementation and request one entity (GET_ENTITY).

The following query will be executed:

Simulation of SADL Query

HANA Studio shows about 15 seconds of runtime. Not bad just for one line…

Plan Viz:

 

Result: One query is executed in 12 seconds, 45+ million rows read, 28 GB RAM allocated. Not bad!

P.S. Obviously our keys are not passed to the views, and they are read fully.

 

New Implementation

Ok, now let’s try to split the queries…

First Query (Getting CDS Key):

Result: Query is executed in 10 milliseconds, 1 row was read, 1,2 MB RAM allocated.

 

Second Query (Getting Data by CDS Key):

 

Result: Query is executed in ~100 milliseconds, single rows selected, 43 MB RAM allocated.

 

Just to sum it up…

Way to get data Response time # Rows RAM Consumption
Single query ~ 12 seconds ~ 50.000.000 ~ 29 GB
Double query ~ 100 milliseconds ~ 30 ~ 45 MB

 

My OData Framework

Writing complicated GET_ENTITY or GET_ENTITYSET methods might become boring, so that I’ve developed a framework, which implement my idea. With that the implementation of the GET_ENTITYSET methods looks like below:

Example of GET_ENTITYSET method

Looks quite elegant. Feel the same, give it a LIKE.

All the best!

Dima

 

P.S. Code sample is here.

Assigned Tags

      6 Comments
      You must be Logged on to comment or reply to a post.
      Author's profile photo David Henn
      David Henn

      What happens if you join your two CDS Views? 🙂

      Author's profile photo Mehmet Hangisi
      Mehmet Hangisi

      Why didn't you use associations instead of outer joins? In your model the data will always be retrieved for the outer joins, if you'd have used associations you could have avoided the retrieval of the data from the associations unless it was explicitly requested.

      If you've done the same in the other custom CDS views which you're using in ZS_CV_MY_ACTIVITIES, meaning also using (left outer) joins instead of associations, switching over to associations undoubtedly will achieve performance gains for you.

      Author's profile photo Dmitrii Sharshatkin
      Dmitrii Sharshatkin
      Blog Post Author

      Hi Mehmet, with LEFT OUTE JOIN, the data is never retrieved unless it's requested, so that in this regard LEFT OUTER JOIN is working exactly the same way as just ASSOCIATION.

      "When a CDS view is activated with path expressions, every association specified here is transformed to a join expression." + "the category of the join is determined by where the path expression is used: After FROM, it is an inner join (INNER JOIN), In all other locations, it is a left outer join (LEFT OUTER JOIN)"

      https://help.sap.com/doc/abapdocu_750_index_htm/7.50/en-US/abencds_f1_association.htm

      BR, Dima

      Author's profile photo Mehmet Hangisi
      Mehmet Hangisi

      Hi Dmitrii,

       

      I was referring to the "JOIN-ON-DEMAND" aspect of associations meaning the association in a CDS view will be converted to a join at runtime when data originating from the association has been requested by the consumer of the CDS.

      The outer joins you have created will always be executed, it doesn't matter whether the data retrieved by these joins have been requested or not.

      Had you used an association, those outers joins would not have been executed by default, only in those cases where data is explicitly requested from the association a join will be peformed at runtime. This should lead to performance gains on database layer since you would eliminate "redundant" joins.

      Author's profile photo Dmitrii Sharshatkin
      Dmitrii Sharshatkin
      Blog Post Author

      Hi Mehmet,

      Oh, now I’ve got your point. So, if we substitute LOJ by ASSOCIATIONs and then build a dynamic SELECT and WHERE statements through associations and their fields, we can potentially simplify the query (less joins generated) and potentially improve performance. Yes, you are right.

      But unfortunately, in my case I need to read lots of fields, usually almost all, from all tables. And such a dynamic generation of SELECT and WHERE statements seems overcomplicated to me. But I will consider your proposal for underlying CDS. Probably there are some use-cases, where it’s beneficial.

      Thanks, Dima

      Author's profile photo Dmitrii Sharshatkin
      Dmitrii Sharshatkin
      Blog Post Author

      Coding update: added GET_EXPANDED_ENTITYSET with optimized performance.