Skip to Content

This blog looks at the situation of a customer who migrated his hand-crafted Oracle data warehouse to SAP BW. It is a tangible, real-world case that contradicts many of the alleged pros and cons of the differing approaches to data warehousing. Beside TCO, the SQL Access to BW via Data Federator allows even for a surprising 1:1 comparison regarding query performance.

There is a long running, almost religious debate on how to build up a data warehouse (DW), either using a DW management application – like SAP BW – or using a set of loosely coupled tools and manually crafting the management of that DW with an RDBMS as the central infrastructure piece. The latter often dilutes the difference between an RDBMS software suitable for high data volumes and a DW. Examples are the traditional RDBMS vendors like Oracle, Microsoft, IBM and Sybase but also the upcoming and/or specialized vendors like Teradata, Greenplum, Netezza or Vertica. Fundamentally, they offer an RDBMS. In order to build up a DW this RDBMS needs to be supplemented by a management software that is based on ETL and other tools (e.g. for data enrichment, data quality etc.). That software manages challenges like

  • Semantically correct extraction: Just moving the contents of a table in the operational system to the DW is often not sufficient; more tables need to be involved and not even a big join does the trick as data gets semantically enriched through application code; SAP ERP is a (typical) example of that – just consider the many infosets.
  • Coordination and scalability of data movement processes: That includes necessary transformations and locking. Just consider such trivial examples like normalizing dates such as 14.4.2009, 04/14/09, 14 Apr 09, … to a uniform representation like 20090414. Locking is necessary to protect that mass data movements expose or create inconsistent states in the warehouse.
  • Physical layout: Data needs to be stored somewhere and someone has to decide and manage tables or sets of tables, degree of normalization, organization and isolation of dimensions (which might then lead to stars or snowflakes) etc.
  • Consistency of the data: This is typically achieved by a transactional concept built on top of the DW entities which – in turn – maps to DB transactions.

In summary: the DW is the combination of an RDBMS with a DW management software. The latter can be created proprietarily (via consultant projects and/or via evolution over time) or a standard package such as SAP BW can be used. It is obvious that non-standard, hand-crafted approaches are likely to imply a high TCO as every change (to a data source or to a model) requires expert programmers to adjust the software. On the bonus side, it is often argued that a manually managed DW can be tailored to the individual needs and that there is so many more options that one can use, especially for query performance tuning.

One of SAP’s customers ran such a hand-crafted DW based on an Oracle RDBMS. Many of source systems were operational SAP systems. Extraction was proprietary. The system started small and grew over the years and with that grew the management complexity and consequently the TCO. In order to tackle the latter, BW was considered, particularly for the DW management advantages. End users accessed the RDBMS via Business Objects’s WebI using SQL as the query language. With the switch to BW the customer wanted to keep WebI as the frontend tool for his end users.

Considering SQL Access to BW via Data Federator there was suddenly a 1:1 comparison between (productive) queries running on an Oracle RDBMS versus the same queries running on a BW system that Understanding Query Performance in NW BI and BIA. The result can be seen in the chart below; each query is represented by an individual color. While the overall superior performance of BW and BWA is evident in the chart, it is important to note that fast queries under Oracle remain fast under BW/BWA and that slow queries under Oracle become fast under BW/BWA. The overall variance of query runtimes is minimal under BW/BWA leading to a stable performance pattern. Finally, the history of the compared systems suggests that the Oracle DW has received many many hours of manual tuning over the years while the BW/BWA system has hardly received any tuning. So, even here, the TCO aspect is apparent.

This is just an instance that provides the rare opportunity of directly comparing two widely differing approaches. In my opinion, this example also questions the value and the significance of assessing data warehouses by (only) its underlying RDBMS. See Gartner’s magic quadrant on DW DBMS as an instance of such an analysis. Here, for this customer, the RDBMS does not make the difference: his BW also runs on top of an Oracle RDBMS.

Query Performance Comparison
Click on chart to enlarge.

To report this post you need to login first.

4 Comments

You must be Logged on to comment or reply to a post.

  1. Stefan Koehler
    Hello Thomas,
    thanks for sharing this experience with us.

    If i understand you correct, the graphic shows the same queries – the left one through the SQL interface of SAP BW and the right one through SAP BIA.

    In this case the performance comparison doesn’t really fit, because an I/O bound system (= directly accessed RDBMS) is compared with a memory bound system (=BIA).

    We self are using the SAP BW BIA and it increases the performance drastically for some queries.

    In my opinion if we wanna compare those two “different” techniques we should do this with the SAP BW BIA and HP/Oracle Exadata, so we have a “comparable standard”.

    Of course Oracle Exdata is a way more expensive, but we save the hardware for the database server itself and for SAP BW BIA.

    It would be really interesting to see a measurement of these two giants (SAP BW BIA and HP/Oracle Exadata) .. do you have experience with it?

    Regards
    Stefan

    (0) 
    1. Witalij Rudnicki
      Hi Stefan,

      HP/Oracle Exadata is still database machine. I believe to have the comparison BWA should be compared to one of the other niche DW systems, like probably Sybase IQ or MS/DATAllegro. But frankly there are so many in-memory appliances, and MPP databases, and columnar databases, that I am lost who is what 🙂

      Thomas, thanks for sharing! It is valuable info.

      Regards,
      -Vitaliy

      (0) 
      1. Stefan Koehler
        Hello Vitaliy,
        yes you are right that the HP/Oracle Exadata or the whole infrastracture HP/Oracle Database machine is still a database. But it is working different than a classic database, because of the storage servers are performing the SQL and returning only the needed rows to the database server itself. So the I/O bound bottleneck of normal databases is nearly “eliminated”. (additional to the InfiniBand performance)

        http://www.oracle.com/technology/products/bi/db/exadata/pdf/exadata-datasheet.pdf

        Maybe Thomas has more information for us about the measurement.

        Regards
        Stefan

        (0) 
    2. Thomas Zurek Post author
      Hi Stefan,

      no, the graph does not compare SAP BW on Oracle vs SAP BW on BIA! It compares a hand-crafted DW on an Oracle RDBMS (left) to SAP BW using BIA (right). You can probably put Oracle’s Exadata on the left-hand side and will still see a qualitatively similar result.

      Regards

      Thomas

      (0) 

Leave a Reply