
Many SAP Data Services (DS) applications use SAP ECC data as a source. DS supports multiple mechanisms to extract data from SAP ECC. Which one should you choose? There is often a preferred method, depending on the functionality required and on the capabilities available in the actual context of the source system. In this blog, I discuss the different options.

Start by configuring a data store for the SAP source system:

Figure 1: SAP Data store definition

 

Mandatory settings are the User Name and Password, the name (or IP address) of the SAP Application server, and the Client and System (instance) numbers.

1/. “Regular” data flow

When extracting from a single table, do so in a standard data flow. Import the table definition from the Metadata Repository in the Data store Explorer and use it as a source in the data flow.

An extraction from the KNA1 customer master data table looks like this:

Figure 2: Data flow – extract from KNA1

 

Note the little red triangle in the source table icon. It’s an indication of a direct extract from the SAP data layer.

This is the simplest method. No special measures are necessary at the level of the source system. The extraction will run as an SAP dialog job. Note that there is a time-out setting (system-configurable, typically 10, 30, or 60 minutes) for dialog jobs. The DS job will fail if the extraction takes longer than that.

The where-conditions are pushed down to the underlying database, which is beneficial for job performance. Also, make sure to extract only the columns that are really required. The duration of the extraction process is linearly related to the data volume, which equals the number of records * the average record size. The less data is transferred, the faster the DS job runs.

Figure 3: Query transform – extract from KNA1, excluding obsolete records

Figure 4: Generated SQL code – the where-clause is pushed down
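The generated statement looks roughly like the sketch below. The column list and the deletion-flag filter are illustrative assumptions; the actual SQL depends on the mappings and the where-clause of your Query transform.

  -- Sketch of the pushed-down SQL: only the mapped columns are selected,
  -- and the filter on obsolete records is evaluated by the database.
  SELECT KNA1.KUNNR, KNA1.NAME1, KNA1.LAND1, KNA1.ORT01
  FROM KNA1
  WHERE KNA1.LOEVM <> 'X'   -- exclude customers flagged for deletion (assumption)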

 

This approach is especially beneficial when implementing incremental loads. When the source table contains a column with a last-modification timestamp, you can easily implement source-based changed-data capture (CDC). Keep track of the timestamps you used in the previous incremental extraction (use a control table for that), initialize global variables with those values and use them in the where-clause of your Query transform.

Figure 5: Query transform – extract recently created or modified records from MARA

Figure 6: Generated SQL code – source-based CDC
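The mechanics behind this pattern, sketched in SQL (the control table, its columns and the variable name are hypothetical; ERSDA and LAEDA are MARA's creation and last-change dates):

  -- 1. Before the data flow: read the timestamp of the previous successful run
  --    from a hypothetical control table and store it in a global variable.
  SELECT LAST_RUN_DATE
  FROM CDC_CONTROL
  WHERE JOB_NAME = 'JOB_MARA_DELTA'

  -- 2. Where-clause of the Query transform, pushed down to the database
  --    ($GV_LAST_RUN_DATE stands for the substituted global variable value).
  SELECT MATNR, MTART, MATKL, ERSDA, LAEDA
  FROM MARA
  WHERE ERSDA >= $GV_LAST_RUN_DATE   -- recently created
     OR LAEDA >= $GV_LAST_RUN_DATE   -- recently modified

  -- 3. After a successful load: update CDC_CONTROL with the new run timestamp.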

 

2/. ABAP data flow

Although where-conditions are pushed down to the underlying database from a normal data flow, the joins are not (sort and group-by operations aren’t either!), often leading to abominable performance, especially when dealing with larger data volumes.

When you want to extract material master data from MARA and supplement each record with the English material description from MAKT, you can build a data flow like this:

Figure 7: Data flow – extract from MARA and MAKT

Figure 8: Query transform – join MARA and MAKT on MANDT and MATNR columns

Figure 9: Generated SQL code – no join

DS generates two SQL statements. It first extracts all current records from MARA. Then, for each individual record, it retrieves the corresponding English description (MATNR = AIVariable_1 and MANDT = AIVariable_2). That approach leads to as many round-trips to the underlying database as there are records in the MARA table! It is only valid when dealing with smaller data sets.
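In SQL terms, the generated statements follow this pattern (a sketch; the actual column lists depend on the Query transform, and AIVariable_1/AIVariable_2 are the per-row placeholders DS fills in):

  -- Outer statement: all material records are streamed from MARA.
  SELECT MANDT, MATNR, MTART, MATKL
  FROM MARA

  -- Inner statement: executed once per MARA row, with that row's key values
  -- substituted for the AIVariable placeholders.
  SELECT MAKTX
  FROM MAKT
  WHERE SPRAS = 'E'
    AND MATNR = AIVariable_1
    AND MANDT = AIVariable_2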

You may improve performance by changing the properties of the source tables.

The default settings are:

Figure 10: Default source table properties

 

Making MARA the driving table (by giving it a higher Join rank than MAKT) and caching MAKT in memory leads to completely different SQL code, without individual round-trips to the database. The MARA table streams through the data flow, the MAKT table is cached, and the join is resolved in DS memory.

Figure 11: Modified source table properties

Figure 12: Generated SQL code – cache MAKT
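Roughly, DS now issues two independent full selects instead of one lookup per MARA row (again a sketch with illustrative column lists):

  -- MAKT is read once and cached in DS memory.
  SELECT MANDT, MATNR, MAKTX
  FROM MAKT
  WHERE SPRAS = 'E'

  -- MARA streams through the data flow; the join against the cached MAKT rows
  -- is resolved inside the DS engine, not in the database.
  SELECT MANDT, MATNR, MTART, MATKL
  FROM MARA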

 

The feasibility of this approach is impacted by 2 parameters:

  • The amount of memory available for caching
  • The time it takes to cache the MAKT table

This works perfectly for smaller tables. But it is not a well-performing solution either when MAKT and MARA are too large.

The recommended solution for extracting from a join of SAP tables is to use an ABAP data flow.

Figure 13: ABAP Data flow – extract from MARA and MAKT

 

DS generates ABAP code corresponding to the properties of the source tables and the logic of the data flow. The table with the highest Join Rank becomes the driving table. In this case, too, the duration of the extraction process is linearly related to the data volume: the less data is transferred, the faster the DS job runs.
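Logically, the generated program resolves the join inside the SAP application server. Expressed as SQL it corresponds to the sketch below (the real artifact is ABAP, and the column list is illustrative):

  SELECT MARA.MATNR, MARA.MTART, MARA.MATKL, MAKT.MAKTX
  FROM MARA                        -- driving table: highest Join Rank
    INNER JOIN MAKT
      ON MAKT.MANDT = MARA.MANDT
     AND MAKT.MATNR = MARA.MATNR
  WHERE MAKT.SPRAS = 'E'           -- English descriptions only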

Figure 14: Generated ABAP code

 

The ABAP code is pushed to the SAP system and executed there. Only the program results are sent back to DS for further processing. This approach only works when the SAP system is open for development! Also, make sure that:

  • The ABAP execution option is set to Generate and Execute.
  • The Execute in Background (batch) property is set to Yes to avoid the time-out in SAP dialog jobs.
  • The Data transfer method is set to RFC. The RFC destination must be defined in the SAP system. The other Data transfer methods exist for backward-compatibility reasons only and must not be used anymore; they all lead to inferior performance.

Figure 15: SAP Data store definition

 

To execute the same DS job in non-development tiers of your landscape, transport the ABAP program from DEV to TST, PRD… first. Set the ABAP execution option to Execute Preloaded and run the DS job. It will not generate the ABAP code again, but execute the transported code.

ABAP data flows are also a convenient solution for implementing incremental loads for ECC tables that do not contain a last-modification timestamp column. Inserts and modifications in KNA1 are registered in the CDHDR and CDPOS change log tables. Use an ABAP data flow to join KNA1 to those tables. Make sure CDHDR gets the highest Join Rank and KNA1 the lowest, in order to get the most efficient code generated. And include where conditions to (a sketch of the resulting join follows the list):

  • Select current customers only
  • Get only the recently created or modified records
  • Read from the correct entries in the change log tables
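In SQL terms, the join and the where conditions amount to something like the sketch below. The change-document object class, the key mapping between CDPOS and KNA1, the date variable and the column lists are assumptions for illustration; verify them against your own system.

  SELECT KNA1.KUNNR, KNA1.NAME1, KNA1.LAND1, CDHDR.UDATE
  FROM CDHDR                                      -- driving table: highest Join Rank
    INNER JOIN CDPOS
      ON CDPOS.OBJECTCLAS = CDHDR.OBJECTCLAS
     AND CDPOS.OBJECTID   = CDHDR.OBJECTID
     AND CDPOS.CHANGENR   = CDHDR.CHANGENR
    INNER JOIN KNA1
      ON KNA1.KUNNR = CDPOS.OBJECTID              -- OBJECTID carries the customer key (assumption)
  WHERE CDHDR.OBJECTCLAS = 'DEBI'                 -- change documents for customer master (assumption)
    AND CDHDR.UDATE >= $GV_LAST_RUN_DATE          -- recently created or modified
    AND KNA1.LOEVM <> 'X'                         -- current customers only (assumption)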

Figure 16: ABAP Data flow – extract from KNA1

Figure 17: Query transform – join KNA1 with CDHDR and CDPOS

Figure 18: Query transform – extract recently created or modified records from KNA1

3/. SAP Extractor

SAP Extractors are tailored for BW Business Content. They already contain the logic for common business transformations and possible aggregations, and they know how to identify changes, so they are very suitable for implementing incremental loads.

The DS Extractor feature is built on top of the Operational Data Provisioning (ODP) API. DS supports all types of ODP sources supported by the ODP API, including CDC functionality.

In a regular data flow, DS uses RFC to call the ODP data replication API. The where-conditions are not pushed down to the extractor. That means all data are pulled from the extractor and filtering is done afterwards in DS. Import the ODP object definition from the Metadata Repository in the Data store Explorer.

Figure 19: Import extractor 0CUSTOMER_ATTR

 

Make sure to set the Extraction mode to Query. Then use it as source in a data flow. An extraction from the 0CUSTOMER_ATTR ODP object looks like this:

Figure 20: Data flow – extract from 0CUSTOMER_ATTR

 

If you want to extract only a small subset of the data, use the ODP object as a source in an ABAP data flow.

Figure 21: ABAP data flow – extract from 0CUSTOMER_ATTR

 

Add the where-clause to the Query transform.

Figure 22: Query transform – extract current American customers from 0CUSTOMER_ATTR

 

DS generates the ABAP that calls the ODP data replication API. The generated code contains the logic for filtering out unnecessary records.

Figure 23: Generated ABAP code

 

Implementing CDC is really a piece of cake for extractors that are “delta-enabled”. Make sure to set the Extraction mode to Changed-data capture (CDC) when importing the ODP object.

Figure 24: Import extractor 0PROJECT_ATTR

 

Then use it as source in a data flow. No need to add any time-based condition in the where-clause. The extractor logic will guarantee that only new and modified records are passed to DS. Just make sure that the Initial load property of the ODP object is set to No. Only set it to Yes when you want the target table to be re-initialized.

Figure 25: ODP object properties

 

Include a Map_CDC_Operation transform to automatically synchronize the target table with the source object. The transform will translate the Row Operation value into the corresponding DS row type:

  • I > Insert
  • B & U > before- and after-image of an Update
  • D > Delete

Figure 26: Data flow – delta extract from 0PROJECT_ATTR

 

4/. ECC function

DS can also call RFC-enabled ECC functions that return tables and use them as data flow sources. If a standard ECC function is not RFC-enabled, you need an RFC-enabled wrapper function that passes the parameters to the standard function, calls it, and forwards the results to DS.

You can only import a function’s metadata by name. Call it from a query transform by selecting New Function Call… from the pop-up menu in its output schema. Select the function from the ECC data store. Define Input Parameter(s) and select Output Parameter. The function call is added to the output schema.

Figure 27: Query transform – call an ECC function

 

Then unnest the returned results in a subsequent Query transform before writing them to a target table.

Figure 28: Query transform – unnest the function output schema

5/. SAP LT (SLT) Replication Server

DS can use SLT Replication Server as a source. This is a very neat and elegant way to build CDC jobs in DS. Working with SLT objects is similar to the way DS works with SAP extractors.

Define the SLT data store like any other SAP data store. Just make sure you select the right ODP context in the definition.

Figure 29: SLT Data store definition

 

You can then import the tables you need to extract data from.

Figure 30: SLT table metadata

 

Use the ODP object as a source in a data flow. The first time the data flow is executed, it will do a full extract of the underlying table. All subsequent runs will automatically perform an incremental extract and forward only the delta.

Figure 31: Data flow – delta extract from MARA through SLT

 

6/. IDOC

DS real-time jobs can read from IDOC messages and IDOC files. DS batch jobs can only read from IDOC files.

Import the IDOC metadata definition by name from the SAP data store. Use the IDOC as a file source in a data flow within a batch job.

Figure 32: Data flow – delta extract from IDOC File Source

 

Double-click on the IDOC icon to open its definition and specify the name of the IDOC file. You can use wildcards (? and *) or list multiple file names separated by commas if you want to process multiple files in a single data flow.

Figure 33: IDOC properties

 

Generate the IDOC file(s) from SAP and run your DS job.


30 Comments


  1. Roman Hess

    Hello Dirk,

    How did you create the DEBMAD06.txt in “Figure 33: IDOC properties”?

    I tried it with .xml files downloaded from SAP GUI, but no success here.

    Thank you!

    1. Dirk Venken Post author

      I didn’t, I received it :).

      Google “how to generate an idoc file from sap”, and you’ll find literally thousands of hits.

      1. Sudip Banerjee

        Hi Dirk,

        Does this IDOC file contain actual data, or only the structure? I tried with an IDOC (ORDERS05) file generated from SAP (tcode WE63) and used the same, but no success. I also tried with an XML source file aligned with the same IDOC structure, but no success there either. The error message is below:

        Processing the IDoc from the file <G:/BODS/Batch_files/89d9448_1754HTML000003.xml> failed 
        because Data Services no longer supports its format. Regenerate the IDoc file in SAP and 
        replace the file.

        Kindly suggest if I am missing anything here. Thanks in advance.

        Thanks & Regards

        Sudip

  2. Nazeer Mohammed

    Can you please explain a bit about the connection details between SAP ECC and BODS?

    We have a requirement for connecting BODS to SAP ECC for ODP extraction.

    I would be very thankful if you could share the connection details, any prerequisites required, and what sort of authorisation the service RFC user should have.

    Please explain a bit.

     

    Thanks

    1. Dirk Venken Post author

      Make sure that your input schema contains the attributes and/or sub-schemas to map to the input parameters of the function.

  3. somu sridevi

    Hi Dirk,

    I am trying to extract data from an extractor (delta-enabled). The full load is working fine, whereas the delta load is not working properly. I selected the Changed-data capture option while importing the extractor and passed the value ‘No’ to the Initial load property. Any inputs on this error?

    The delta extraction keeps on running without any error in the BODS console.

     

  4. Liju Stanley

    Hi Dirk,

    I have been trying to implement the combination of SLT with DS. I have read through your blog and followed the below document.

    https://assets.cdn.sap.com/sapcom/docs/2015/07/62c8dbf4-5a7c-0010-82c7-eda71af511fa.pdf

    I have used a separate SLT server to fetch data from SAP system. I am facing some issues and request you to throw some light on it.

    – I am seeing the tables of the SLT system in the DS system. Shouldn’t I be seeing only the ODP objects?

    – If I search for a table in the external metadata of the datastore, it won’t show any results even though the table is actually present under ODP objects. I imported the metadata of a table and tried viewing data; it shows the status “fetching data” but never shows any.

    I can see entries in ODQMON in SLT. Tried creating a job in DS to fetch data and write it to a target system. The table gets created in the target system without data.

    On further investigation it showed that in SLT, ODQ jobs are getting released but not running. Do you see any reason why this happens?

    Regards,

    Liju

  5. Rubal Taneja

     

    Hi ,

    I am working on a “BODS reliable asynchronous web service call” but am getting an “error while communicating with server”.

    BODS Error log :-

    1012410508XML-Unknown error communicating with server.

    Error at target system –

    ” SRT: plain soap: reliable messaging (rm) configured, but no message id and no wsrm assertion provided. ”

    Is there any way to populate the Header node (with Action, Addressing, and a unique Message ID) while calling a web service from BODS?

    On the target system I found that some extra elements get added to the Request message when we trigger that web service from SOAP UI with the WS-A elements selected, in the case of async reliable communication.

    Usually the Request structure starts from Body, but in this case we have an extra Header node when we connect to this web service using SOAP UI.

    But it only works when we enable the WS-A checkbox in SOAP UI; it does not work from the BODS Designer.

    Any help on this? Please let me know if you have any documents related to reliable asynchronous web service calls.

    Regards,

    Rubal Taneja

  6. Liju Stanley

    Hi DIrk,

    Thanks for the reply. The issue was the SLT work process and it is resolved.

    I have another query

    Every time I schedule a job from DataService, I am able to extract data from ODQ of SLT.

    Will I be able to extract data from ODQ on a real-time basis, i.e. as soon as SLT writes to ODQ, Data Services extracts it from ODQ and writes it to a target?

     

    Regards,

    Liju

     

    1. Dirk Venken Post author

       

      Almost. You can make a DS job poll for changes. Embed your data flows in an eternal loop, reading from the queues and writing to the target.

  7. bindu duvvuru

    Please answer the question below:

    If I want to extract only the delta records into HANA from SAP or non-SAP sources through BODS, what is the process and which transform do we need to use in BODS?

    1. Dirk Venken Post author

      In general, you can consider the following five options:

      • Use last-modification timestamps in the source tables. Keep track of the timestamp(s) that you used in the previous incremental extraction (use a control table for that), initialise global variables and use them in the where-clause of your Query transforms.
      • Use a log table. Many systems log modifications in so-called log tables. The actual table modified does not contain the timestamp, but the log table does. Join the tables and proceed as above.
      • Use the built-in CDC mechanisms of the source database.
      • Use specific CDC software, like SLT Replication Server.
      • If none of those apply, do a full load, keep a copy of the previous run, and calculate the delta yourself. Then apply transformations to the delta only.

      With SAP as a source, there’s a 6th option:

      • Leverage delta-enabled Business Content Extractors
    1. Dirk Venken Post author

       

      As per the official documentation, SAP Data Services Supplement for SAP, section 10.1.1.4 Source:

      In an ABAP data flow, a source can be another ABAP data flow, an SAP application table, a file located on the SAP application server, or an SAP ODP source.

       

  8. Marc Mclaughlin

    Hi Dirk,

    Thanks for this write up, very helpful.  I’m trying to use the “Regular” method above to pull delta information from the faglflexa table.  As I understand it from our functional team, the only “last modified” type dates I can rely on for this table live at the header level – bkpf.  What I’m hoping to do is inner join the two tables directly in the data flow and use the where clause to filter on the dates I need, but what I’m seeing is that it generates two queries that pull everything from both tables.  I’m wondering if what I’m trying to do is even possible?  If so, how can I get it to push down the join and date filtering condition?

     

    Thanks,

     

    Marc

    1. Dirk Venken Post author

       

      As I have tried to explain

      Although where-conditions are pushed down to the underlying database from a normal data flow, the joins are not (sort and group-by operations aren’t either!)

      joins are only pushed down from ABAP data flows.

  9. Juraj Danko

    hi Dirk, great article

    I’m working on something similar at the moment to what you described in this blog. I’m trying to consume an ODP extractor and save the data in a table, but it’s not working as expected.

    My Z-datasource is table-based, with delta enabled on a numeric pointer (table MDKP); I’ve imported it with extraction method CDC. When I execute the job with Initial load set to Yes, I see the init request in ODQMON and the data are correctly loaded. Unfortunately, when I switch Initial load to No (I want to receive the delta), although there are new units in the monitor, the job doesn’t raise a new request and it seems the data are loaded from the pageable cache.
    I’ve switched off all statistics.

    do you know what can be the problem?

    thanks

    Juraj

     

  10. Jayasree R

    Hi Dirk,

    We are using ABAP data flows in our project, and we have applied some filters on a date column.

    To load some missing records, we removed the filters in the ABAP data flow and ran the job without regenerating the ABAP code (the data store is set to Execute Preloaded), but strangely the records that were missing because of the filters were loaded into the target. We need to know: is ABAP code generation not required if we change anything in an ABAP data flow?
