Design option to delete the records from Info prov...

Former Member · ‎12-23-2014

Applies to: SAP BI 7.X

Summary:

Normally we use flat file loads to SAP BW loading either to DSO and/or info Provider with an assumption that the file sent is always with the delta changes. Otherwise, we will drop and reload the DSO/Info Providers every time we receive the files. These approaches are fine if the file size is reasonably small which does not take lot of the processing time to complete the data loads. In case, we have huge volume of data and have additional upward layers to feed the same data, this approach is time consuming and process centric. Here we are going to discuss the design option to overcome data issues in the file such as nullified/deleted records in the source that need to be deleted in the upward Info Providers as well.

Author(s): Anil Kumar Puranam and Peesu Sudhakar.

Company: Deloitte

Created on: 23rd December, 2014

Authors Bio:

Sudhakar Reddy Peesu is an SAP Technology professional with more than 15 years of experience leading SAP BW/ABAP implementations, developing custom applications, and developing IT strategy. He is a PMP certified manager, has led BW/ABAP teams with hardware and landscape planning, design, installation and support, and development teams through all phases of the SDLC using the onsite/offshore delivery model. He has strong communication, project management, and organization skills. His industry project experience includes Banking, Consumer/Products, and Public Sector industries

Anil Kumar. Puranam is working as a Senior BW/BI Developer in Deloitte consulting. He has more than 9 years of SAP BW/BI/BO experience. He has worked on various support/implementation projects while working in consulting companies like Deloitte, IBM India, and TCS.

Business Scenario:

In BW implementations, there is a lot of importance to the data extraction from legacy systems through flat files. We extract the master data and transaction data in structured format from ECC systems to BW. However when we are moving the operational systems from legacy tools to ECC, we can see lot of data extraction from external systems.

In phase of moving our applications from legacy to ECC systems, we need to extract historical as well as every day transactions to BW through flatfiles. Most of the time, we need to bring the data from legacy applications to BW as full loads until and unless the source applications are capable to generate deltas. When the tools are not capable to provide the deltas as like as ECC systems, we build the DSO layer as the staging layer. And also we can use Info cube as the target to load the data directly from the cube. If we choose the later approach, then we can build a custom logic to build the before images with reversal values.

Here is the solution that helps us to generate the delta records while loading the data into the cube directly without any intermediate the DSO layer.

The below picture gives the data flow overview:

Sample data :

In order to understand the business scenario better, the below example is discussed with some sample data and behavior of delta mechanism.

Note: Sample data provided is only for requirement understanding; it does not contain any data from the client (Actual Data).

Sample Transaction Data:

Important points of the data:

Job id and Start time makes the unique combination.
End time is not available for running job.
End time is available for the job if it is either failed or completed.
Failed job can be marked as successful later.

Technical Challenge:

The file content that we receive from source is not having the information about the before image and it always provide the latest information for the combination of job and Start time. We see this sort of challenge more often for the flat file loads from legacy systems into BW.

We can choose DSO layer as intermediate layer to generate the delta records to ultimate target Info cube. However the possibility to use DSO layer is really challenging if we have below situations:

If we need to load multiple files to same DSO but at different times. So we will not have option to load the data from one file and activate the DSO at the same time while we are processing other file. So we need to wait and need to develop the mechanism not to process one file while other file is loading. This will delay the whole data load timings.
And also we cannot develop number of DSOs and Data sources if we need to handle hundreds of files.
Loading, Activation and further loading will become sequential.

We can choose the option of loading to cube directly from data source, but the challenge is how to manage the changes for already loaded records into the cube.
For ex: in the above screen, we can see that job ID “A15GID” with start time stamp “20140228 193454” has been changed from Running to Failed and to Success in three files that we have received.

In order to address the above technical challenges, below approach is followed.

Approach to overcome the Challenge:

We can use below methodology to overcome the challenge:

Read the Target Info cube while loading the data, only for the records which needs to replace the existing records.
For ex: by looking at first and second file, we can say that job ID “A15GID” with start time stamp “20140228 193454” has been changed from Running to Failed. So while loading the second file, we need to overwrite or nullify the record with status is equal to Running, so that we can show only one truth to the users.

Add the records in the end routine exactly as equal as old record of the cube , which we want to nullify . But here we add the record with reversal value for the Key Figures, so that the aggregated value for the combination of Characteristics will be zero.

Detail steps:

Step 1: Created an Info cube with below structure. Here we have used the Info object 0TCTDURTION to calculate the duration of the job. The value is calculated only for the jobs which are either failed or successfully completed. We do not calculate the duration for the jobs which are running.

Step 2: Created a Data source to the load the flat file. Here we have used the ZTIME conversion for converting the timestamp value that we receive from source into BW internal format. And also we have developed custom conversion Function modules like CONVERSION_EXIT_ZTIME_INPUT and CONVERSION_EXIT_ZTIME_OUTPUT. The codes of the function module are also attached at the end.

We have used new Z conversion routine, Since TIMES conversion routine is displaying the values in UTC time zone, which we do not want to implement and want to show the values as it is how we receive from source. And also we have used the ZTIME conversion in the 2 Time stamp Info objects.

Step 3: In the transformation from Data source to Cube, we have called a function module ZINFOPROV_READ_STATS, which in turns uses the RSDRI_INFOPROV_READ Function module.

Highlights of the code

While calling the Function module ZINFOPROV_READ_STATS, we have given Cube name also as input parameter. Here we have common FM ZINFOPROV_READ_STATS to use it for different cubes which are same structure as like as the cube that we are referring in this example.
Apart from cube name, we can see that Job name and Start time are as input parameters , since they are the important characteristics to identify the previous records from the cube.
Few lines of code are written to derive the target cube name in the end routine to pass to the FM call ZINFOPROV_READ_STATS.
Reverting the value for Keyfigure values in the Function module
We have created and used one ‘Line type’ and ‘Table Type’ dictionary objects in the coding.

Both codes used in the End routine and Function module are attached at the end of the document.

The below screen highlights Key points of the code used in the End routine.

Data in the Target cube after each load:

For the records with Start Time Stamp equal to "20140228 193454", we can see the final records in the cube as below.

Here we can see that first time load has inserted only one record. The record has status as "RU".
Second time load has inserted 2 records: one record inserted in second load is reversal of record loaded in first load. i.e Running status record has got zero value in the cube by end of the second load.
Third time load has inserted total 3 records. one record inserted in third load is reversal of record loaded in Second load with the status of "FA". i.e "FA" status record has got zero value in the cube by end of the third load. Here the Third load also brought reversal of record with Staus "RU", but it has brought the reversal for the summed value till second load. So we can see one record with status "RU" . but the value is Zero, since the summed value for RU status record is Zero by end of second load.

Cube Manage Screen:

Report out put :

Appendix A : code used in the End routine:

DATA IT_CUBE TYPE ZGL_ZGL_C110_TT.
DATA WA_CUBE TYPE ZGL_ZGL_C110_LT.
DATA: WA_RESULT_FLD TYPE _TY_S_TG_1.
DATA: RESULT_PACKAGE2 TYPE _TY_T_TG_1.

DATA: LP_CUBE_NAME TYPE RSINFOPROV.
DATA: LCL_REQUEST TYPE REF TO CL_RSBK_REQUEST_PROXY.
LCL_REQUEST ?= P_R_REQUEST.
CALL METHOD LCL_REQUEST->IF_RSBK_REQUEST_ADMINTAB_VIEW~GET_TGT
RECEIVING
R_TGT = LP_CUBE_NAME.

DELETE RESULT_PACKAGE WHERE /BIC/ZSTARTTS IS INITIAL AND /BIC/ZJOB_ID IS INITIAL.

IF NOT RESULT_PACKAGE IS INITIAL.

LOOP AT RESULT_PACKAGE ASSIGNING <RESULT_FIELDS> WHERE
/BIC/ZJBSTATUS <>   'RU'.
    CLEAR IT_CUBE.
    CALL FUNCTION 'ZINFOPROV_READ_STATS'
      EXPORTING
        ZCUBE      = LP_CUBE_NAME
        ZSTARTTS   = <RESULT_FIELDS>-/BIC/ZSTARTTS
        ZJOB_ID    = <RESULT_FIELDS>-/BIC/ZJOB_ID
      CHANGING
        CT_GL_CUBE = IT_CUBE.
    IF IT_CUBE IS NOT INITIAL.
      LOOP AT IT_CUBE INTO WA_CUBE.
        CLEAR WA_RESULT_FLD.
        WA_RESULT_FLD-/BIC/ZBWSYS = WA_CUBE-ZBWSYS.
        WA_RESULT_FLD-CALDAY = WA_CUBE-0CALDAY.
        WA_RESULT_FLD-CALMONTH = WA_CUBE-0CALMONTH.
        WA_RESULT_FLD-/BIC/ZJOB_ID = WA_CUBE-ZJOB_ID.
        WA_RESULT_FLD-/BIC/ZSTARTTS = WA_CUBE-ZSTARTTS.
        WA_RESULT_FLD-/BIC/ZENDTS = WA_CUBE-ZENDTS.
        WA_RESULT_FLD-/BIC/ZJBSTATUS = WA_CUBE-ZJBSTATUS.
        WA_RESULT_FLD-TCTDURTION = WA_CUBE-0TCTDURTION.
        WA_RESULT_FLD-/BIC/ZCNT_KEY = WA_CUBE-ZCNT_KEY.

        APPEND WA_RESULT_FLD TO RESULT_PACKAGE2.
      ENDLOOP.
    ENDIF.

ENDLOOP.

IF RESULT_PACKAGE2 IS NOT INITIAL.
APPEND LINES OF RESULT_PACKAGE2 TO RESULT_PACKAGE.
ENDIF.
ENDIF.

Appendix B : code used in the Function Module:

FUNCTION zinfoprov_read_stats.
*"----------------------------------------------------------------------
*"*"Local Interface:
*" IMPORTING
*"     REFERENCE(ZCUBE) TYPE RSINFOPROV
*"     REFERENCE(ZSTARTTS) TYPE /BIC/OIZSTARTTS
*"     REFERENCE(ZJOB_ID) TYPE /BIC/OIZJOB_ID
*" CHANGING
*"     REFERENCE(CT_GL_CUBE) TYPE ZGL_ZGL_C110_TT
*"----------------------------------------------------------------------

TYPE-POOLS: rs, rsdrc.

TYPES:
    BEGIN OF gt_s_data,
      ZBWSYS          TYPE /BIC/OIZBWSYS,
      0calday            TYPE /bi0/oicalday,
      0calmonth           TYPE /bi0/oicalmonth,
      zjob_id       TYPE /bic/oizjob_id,
      zstartts         TYPE /bic/oizstartts,
      zendts         TYPE /bic/oizendts,
      zjbstatus       TYPE /bic/oizjbstatus,
      0tctdurtion     TYPE f,
      zcnt_key        TYPE i,
    END OF gt_s_data.

DATA:

g_s_data TYPE gt_s_data,

* G_T_DATA = an internal table that can hold the result set

g_t_data TYPE STANDARD TABLE OF gt_s_data
WITH DEFAULT KEY INITIAL SIZE 10,

* G_S_SFC = description of a characteristic or navigational attribute
*            that is requested by a query
    g_s_sfc         TYPE rsdri_s_sfc,
    g_th_sfc        TYPE rsdri_th_sfc,

* G_S_SFK = description of a key figure that is requested by a query
g_s_sfk TYPE rsdri_s_sfk,
g_th_sfk TYPE rsdri_th_sfk,

* G_S_RANGE = description of a restriction on a characteristic or
*             navigational attribute
    g_s_range       TYPE rsdri_s_range,
    g_t_range       TYPE rsdri_t_range.

* For the following characteristics / navigational attributes values
* should be returned:
*

CLEAR g_th_sfc.

* ZBWSYS
CLEAR g_s_sfc.
* --- name of characteristic
g_s_sfc-chanm = 'ZBWSYS'.
* --- name of corresponding column in G_T_DATA
g_s_sfc-chaalias = 'ZBWSYS'.
* --- no ORDER-BY
g_s_sfc-orderby = 0.
* --- include into list of characteristics
INSERT g_s_sfc INTO TABLE g_th_sfc.

* CALDAY
CLEAR g_s_sfc.
* --- name of characteristic
g_s_sfc-chanm = '0CALDAY'.
* --- name of corresponding column in G_T_DATA
g_s_sfc-chaalias = '0CALDAY'.
* --- no ORDER-BY
g_s_sfc-orderby = 0.
* --- include into list of characteristics
INSERT g_s_sfc INTO TABLE g_th_sfc.

* CALMONTH
CLEAR g_s_sfc.
* --- name of characteristic
g_s_sfc-chanm = '0CALMONTH'.
* --- name of corresponding column in G_T_DATA
g_s_sfc-chaalias = '0CALMONTH'.
* --- no ORDER-BY
g_s_sfc-orderby = 0.
* --- include into list of characteristics
INSERT g_s_sfc INTO TABLE g_th_sfc.

* ZJOB_ID
CLEAR g_s_sfc.
* --- name of characteristic
g_s_sfc-chanm = 'ZJOB_ID'.
* --- name of corresponding column in G_T_DATA
g_s_sfc-chaalias = 'ZJOB_ID'.
* --- no ORDER-BY
g_s_sfc-orderby = 0.
* --- include into list of characteristics
INSERT g_s_sfc INTO TABLE g_th_sfc.

* ZSTARTTS
CLEAR g_s_sfc.
* --- name of characteristic
g_s_sfc-chanm = 'ZSTARTTS'.
* --- name of corresponding column in G_T_DATA
g_s_sfc-chaalias = 'ZSTARTTS'.
* --- no ORDER-BY
g_s_sfc-orderby = 0.
* --- include into list of characteristics
INSERT g_s_sfc INTO TABLE g_th_sfc.

* ZENDTS
CLEAR g_s_sfc.
* --- name of characteristic
g_s_sfc-chanm = 'ZENDTS'.
* --- name of corresponding column in G_T_DATA
g_s_sfc-chaalias = 'ZENDTS'.
* --- no ORDER-BY
g_s_sfc-orderby = 0.
* --- include into list of characteristics
INSERT g_s_sfc INTO TABLE g_th_sfc.

* ZJBSTATUS
CLEAR g_s_sfc.
* --- name of characteristic
g_s_sfc-chanm = 'ZJBSTATUS'.
* --- name of corresponding column in G_T_DATA
g_s_sfc-chaalias = 'ZJBSTATUS'.
* --- no ORDER-BY
g_s_sfc-orderby = 0.
* --- include into list of characteristics
INSERT g_s_sfc INTO TABLE g_th_sfc.

* For the following key figures should be returned:
* 0TCTDURTION, ZCNT_KEY

CLEAR g_th_sfk.

* 0TCTDURTION
CLEAR g_s_sfk.
* --- name of key figure
g_s_sfk-kyfnm = '0TCTDURTION'.
* --- name of corresponding column in G_T_DATA
g_s_sfk-kyfalias = '0TCTDURTION'.
* --- aggregation
g_s_sfk-aggr = 'SUM'.
* --- include into list of key figures
INSERT g_s_sfk INTO TABLE g_th_sfk.

* ZCNT_KEY
CLEAR g_s_sfk.
* --- name of key figure
g_s_sfk-kyfnm = 'ZCNT_KEY'.
* --- name of corresponding column in G_T_DATA
g_s_sfk-kyfalias = 'ZCNT_KEY'.
* --- aggregation
g_s_sfk-aggr = 'SUM'.
* --- include into list of key figures
INSERT g_s_sfk INTO TABLE g_th_sfk.

CLEAR g_t_range.

* /BIC/ZSTARTTS WITH SAME JOB
CLEAR g_s_range.
* --- name of the characteristic
g_s_range-chanm    = 'ZSTARTTS'.
* --- including or excluding condition ?
g_s_range-sign     = rs_c_range_sign-including.
* --- comparison operator
g_s_range-compop   = rs_c_range_opt-equal.
* --- low value
g_s_range-low      = zstartts.

APPEND g_s_range TO g_t_range.

* /BIC/ZSTARTTS WITH SAME JOB
CLEAR g_s_range.
* --- name of the characteristic
g_s_range-chanm    = 'ZJOB_ID'.
* --- including or excluding condition ?
g_s_range-sign     = rs_c_range_sign-including.
* --- comparison operator
g_s_range-compop   = rs_c_range_opt-equal.
* --- low value
g_s_range-low      = ZJOB_ID.
* --- include into list of restrictions
APPEND g_s_range TO g_t_range.

* The reading module is called:
* for demonstration purposes, the result is retrived in packages
* of size 10.

DATA: g_end_of_data TYPE rs_bool,
g_first_call TYPE rs_bool.

* --- this variable will be set to TRUE when the last data
*     package is read
g_end_of_data = rs_c_false.
* --- this variable indicates whether this is an initial
*     call to the reading module or a follow-up call (which
*     simply retrieves already selected data)
g_first_call = rs_c_true.

WHILE g_end_of_data = rs_c_false.

    CALL FUNCTION 'RSDRI_INFOPROV_READ'
      EXPORTING
        i_infoprov             = zcube "'ZGL_C110'
        i_th_sfc               = g_th_sfc
        i_th_sfk               = g_th_sfk
        i_t_range              = g_t_range
        i_reference_date       = sy-datum
        i_save_in_table        = rs_c_false
        i_save_in_file         = rs_c_false
        i_packagesize          = 10
        i_authority_check      = rsdrc_c_authchk-read
      IMPORTING
        e_t_data               = g_t_data
        e_end_of_data          = g_end_of_data
      CHANGING
        c_first_call           = g_first_call
      EXCEPTIONS
        illegal_input          = 1
        illegal_input_sfc      = 2
        illegal_input_sfk      = 3
        illegal_input_range    = 4
        illegal_input_tablesel = 5
        no_authorization       = 6
        illegal_download       = 8
        illegal_tablename      = 9
        OTHERS                 = 11.

    IF sy-subrc <> 0.
      BREAK-POINT.                                         "#EC NOBREAK
      EXIT.
    ENDIF.

* add to result package
    LOOP AT g_t_data INTO g_s_data.
*    WRITE: / g_s_data-ZJOB_ID,
*             g_s_data-ZSTARTTS.
      g_s_data-zcnt_key = g_s_data-zcnt_key * -1.
g_s_data-0TCTDURTION = g_s_data-0TCTDURTION * -1.
*      APPEND g_s_data TO zzgl_zgl_c110_tt.
      APPEND g_s_data TO ct_gl_cube.
    ENDLOOP.

ENDWHILE.

EXIT.

ENDFUNCTION.

Appendix C : Other codes and Table types:

Conversion Exit FM codes is attached.
Screen shots of Table Type and Line Type.