Multisource Universe

Executive summary

Data federation is sometimes described as a next-generation data warehousing approach in which the need for a physical warehouse is bypassed. It is a methodology that takes a layered approach to dealing with data from disparate sources and provides an on-demand alternative to a traditional enterprise data warehouse (EDW). This paper shows how to implement this methodology using SAP BusinessObjects Data Federation, the performance evaluation criteria considered, and the lessons learned.

Scenario

The aim is to unify data from SAP BW and a DB2 legacy system on the fly, making remote data appear as if it resides in a single database, and to evaluate the performance of the SAP BusinessObjects Data Federation query engine.

Approach

  • Maintain the AS-IS tables in SAP BW and the DB2 legacy mainframe.
  • Develop a data foundation and business layer using the SAP BusinessObjects Information Design Tool (IDT).
  • Run queries to evaluate the performance of the Data Federation query engine.
  • Trace query run times from the Data Federation administration tool and the respective databases.

Building a multisource universe

Step 1: Create connections

  • Log in to the repository through IDT
  • Right-click the “Connections” folder and select “Insert Relational Connection” to create a connection to DB2

  • Provide a name for the connection                   

  • Select “JDBC Drivers” for DB2

  • Provide the system details

  • Create a relational connection for SAP BW
  • Provide a name for the connection

  • Select “SAP Java Connector (SAP JCo)”

  • Provide server details

  • Create connection shortcuts in local project folder

Step 2: Create Data Foundation

  • Right click on Local projects → New → Data Foundation

  • Provide a name for Data Foundation

  • Select “Multisource-Enabled” option

  • Click Next

  • Select the connections for the Data Foundation

  • Click Next

  • Click Next

  • Select the tables from respective connections

  • Do the necessary joins and modeling

Intelligent queries for union data sets

Queries can be triggered intelligently against the appropriate source system based on input parameter values. For BusinessObjects to act intelligently, create a derived table that unions the data from both source systems, define a mandatory parameter, and hard-code the parameter's list of values (LOVs) into the WHERE condition of each source system's query.

In this scenario, data for some facilities is available in SAP BW and data for the remaining facilities is available in DB2. For queries to be triggered intelligently, we created a derived table (DB2 view) to union the data sets from both source systems, defined a mandatory parameter, and encoded the LOVs in the WHERE condition of each source system's query.

When the end user enters a facility value, the engine checks both queries and runs only the one whose WHERE condition contains that value. The following conditions were included in the WHERE clauses so that the queries behave intelligently:

DB2: Table__5."CUST_CD" in ('DCT','YRK','ATF','CRP','GTX')

BW: Table__9."Z1A4GSTO" in ('PTI','RTF','VRF','HDF','IVC')
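Putting the pieces together, the derived table might look like the following sketch. The table and measure names (DB2_SALES, BW_SALES, SALES_AMT) and the parameter name (Facility) are illustrative, not the exact ones from this project; @Prompt(Facility) references the mandatory parameter defined in the data foundation.

```sql
-- Each branch hard-codes its facility LOVs, so when the user supplies a
-- facility value, only the branch whose list contains that value returns
-- rows, and the other source system is effectively skipped.
SELECT Table__5."CUST_CD"   AS FACILITY,
       Table__5."SALES_AMT" AS SALES_AMT
FROM   DB2_SALES Table__5
WHERE  Table__5."CUST_CD" IN ('DCT','YRK','ATF','CRP','GTX')
  AND  Table__5."CUST_CD" = @Prompt(Facility)
UNION ALL
SELECT Table__9."Z1A4GSTO" AS FACILITY,
       Table__9."SALES_AMT" AS SALES_AMT
FROM   BW_SALES Table__9
WHERE  Table__9."Z1A4GSTO" IN ('PTI','RTF','VRF','HDF','IVC')
  AND  Table__9."Z1A4GSTO" = @Prompt(Facility)
```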

Step 3: Create business layer

  • Right click on Local projects → New → Business Layer

  • Select Relational Data Foundation and click Next

  • Provide a name for the Business Layer

  • Select the Data Foundation by clicking the button highlighted below

  • Select the Data Foundation

Performance test conditions

Below are the on-demand and scheduled performance test conditions considered:

  • All facilities data for one month duration (A1M)
  • All facilities data for one week duration (A1W)
  • High-volume facility data for one month duration (H1M)
  • High-volume facility data for one week duration (H1W)
  • Medium-volume facility data for one month duration (M1M)
  • Medium-volume facility data for one week duration (M1W)
  • Low-volume facility data for one month duration (L1M)
  • Low-volume facility data for one week duration (L1W)
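Each condition above reduces to a facility filter plus a date window on the federated query. As a hedged sketch of the H1M case (the table, column, and facility names are illustrative, not the actual ones from this project):

```sql
-- H1M: one high-volume facility, one-month window (illustrative names;
-- the real facility codes and date column come from the project model).
SELECT FACILITY, SALES_AMT
FROM   FEDERATED_SALES
WHERE  FACILITY = 'DCT'   -- high-volume facility
  AND  CAL_DAY BETWEEN DATE '2013-01-01' AND DATE '2013-01-31'
```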

  

Testing results

Overall

  • 49% of time spent in SAP BW
  • 39% of time spent in BOBJ
  • 12% of time spent in DB2

[Figure: Overall run-times]

On-demand

  • All facilities, one month of data (2.2 M records): ~18.25 mins (BOBJ 12.5 mins)
  • All facilities, one week of data (0.54 M records): ~7.25 mins (BOBJ 3 mins)

[Figure: On-demand run-times]

Schedule

  • All facilities, one month of data (2.2 M records): ~16 mins (BOBJ 9 mins)
  • All facilities, one week of data (0.54 M records): ~8.25 mins (BOBJ 4.5 mins)

[Figure: Schedule run-times]


Lessons learned

Union

  • To trigger intelligent queries, a pre-defined list of values must be hard-coded in the derived table.
  • The data types from each source system must match; otherwise, a “calculated column” workaround must be implemented.
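For the data-type mismatch, a calculated column in the data foundation can cast one side to the other's type before the union. A minimal sketch, assuming the DB2 code and the BW characteristic have different character types (table and column names are illustrative):

```sql
-- Calculated column: cast the DB2 code so both union branches expose the
-- same data type; without this, the multisource union fails to validate.
SELECT CAST(Table__5."CUST_CD" AS VARCHAR(10)) AS FACILITY
FROM   DB2_SALES Table__5
UNION ALL
SELECT Table__9."Z1A4GSTO" AS FACILITY
FROM   BW_SALES Table__9
```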

Data Modeling

  • Optimize the data model in the multisource universe so that only a single query is processed per source system; this yields better execution times.

Performance

  • Query processing on the respective databases (BW, DB2, etc.) is sequential.
  • Of the various complex queries generated by the SAP BOBJ multisource universe, only simple SQL queries with a WHERE clause are pushed down to the respective databases.
  • SQL queries with complex operations such as GROUP BY, HAVING, and CASE are processed in the SAP BOBJ Data Federation query engine.
  • Leverage background scheduling of Web Intelligence reports for large data volumes; use on-demand report execution when only a small data volume is required.
  • Carry out optimization steps on the respective source systems to improve query and report performance.
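To illustrate the push-down behavior described above (table and column names are illustrative): only the simple filtered projection is shipped to the source database, while the aggregation is evaluated by the federation engine on the returned rows.

```sql
-- Pushed down to DB2: a simple projection with a WHERE clause.
SELECT "CUST_CD", "SALES_AMT"
FROM   DB2_SALES
WHERE  "CUST_CD" IN ('DCT','YRK','ATF','CRP','GTX')

-- Evaluated in the Data Federation query engine: GROUP BY / HAVING over
-- the rows returned by the push-down query above.
SELECT "CUST_CD", SUM("SALES_AMT") AS TOTAL_SALES
FROM   pushed_down_result   -- conceptual: the engine's in-memory row set
GROUP  BY "CUST_CD"
HAVING SUM("SALES_AMT") > 0
```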

Appendix

System configuration

  • Seven VM guests in total, running Red Hat Enterprise Linux (RHEL 5)
  • Web tier — Tomcat — 2 VM guests — 4 x Intel Xeon 5670 @ 2.93 GHz, 16 GB RAM
  • Intelligence tier — 2-node CMS cluster — 2 VM guests — 4 x Intel Xeon 5670 @ 2.93 GHz, 16 GB RAM
  • Processing tier — 3 VM guests — 4 x Intel Xeon 5670 @ 2.93 GHz, 24 GB RAM
  • Disk storage — 200 GB shared Input/Output File Repository Servers (IFRS/OFRS)

[Figure: BOBJ landscape]

Data Volumes

System    Records (millions)
SAP BW    79.12
DB2       95.13

Acknowledgements

Santosh Taware has co-authored this paper. He is a Manager with Deloitte Consulting LLP. He has performed several full lifecycle implementations. His project experience highlights his leadership and solution architect skill as well as his technical skills in data modeling, back-end design, administrative workbench, and SAP data extraction. Santosh can be reached at staware@deloitte.com.

Comments

Hi Vishal,

Could you please tell how the performance testing was done? Was it via the Data Federator admin tool?

Thanks,
Karthik

Hi Santosh,

While creating connections with SQL Server and Oracle as the databases for a multisource universe using Data Federation, I am unable to see the connection. I am getting an error like: "This data source is not supported in multi source-enabled data foundations."