Experiences with MS SQL Server’s Implementation of the Column Store and SAP BW
In this post I want to share my experience with the implementation of the column store on the Microsoft SQL Server and Microsoft’s port onto SAP BW. This post will not describe the technical information like the setup or implementation. There information are already well described in blog posts of Martin Merdes as well in SNotes.
This blog post is structured as the following:
- Overview of the implementation of the column store from Microsoft within SAP BW
- Testing Mehods and results
1. Overview implementation column store
Since there are already good descriptions available, I will give only a very short overview. For a detailed description I recommend the following links:
- Optimizing BW Query Performance 2013
- SQL Server 2014 Columnstore: Released for SAP BW 2015
- Concepts of SQL Server 2014 Columnstore 2015
- Columnstore Optimized Flat Cube in SAP BW 2015
- SQL Server 2016 improvements for SAP (BW) 2016
- Simplified and faster SAP BW Process Chains 2016
- Recent SAP BW improvements for SQL Server 2017
- BW Queries by factors faster using FEMS-pushdown 2017
- Performance evolution of SAP BW on SQL Server 2017
- Customer experience with SAP BW FEMS-Pushdown 2017
- Improve SAP BW Performance by Applying the Flat Cube 2018
- Columnstore became default in SAP BW 2018
The column store within MS SQL Server is nothing new. The implementation is delivered with the MS SQL Server itself. Martin Merdes and his team port the implementation to SAP BW. A column store implementation exists besides Microsoft’s SQL Server also for DB2 (SNote 2301464) or Oracle (SNote 2150530/SNote 2351252).
In context of SAP BW the column store is used for cubes. There are also ways to use the column store within an ERP system.
The practical implementation can be read in the detailed posts of Martin Merdes or in the SAP SNotes.
Column Store Index
In this implementation the index replaces the index of cubes. The B-Tree index is replaced by a column store index. The column store index is smaller than the B-Tree index, so that reduction in of the required storage space is expected.
There are different versions of the column store index depending on the version of MS SQL Server. In the beginning with MS SQL Server 2012 the column store index was not update-able. The column store index could only be applied to the e-fact table. It is recommended to use the latest version of the MS SQL Server.
The flatcube is very similar to the HANA optimized cube. A flatcube has besides a dimension for the requests no other dimension. There is also only one fact table. The SIDs are stored directly in the fact table (comparable to a line item).
Onced converted into a flatcube, the cube cannot be loaded into a BWA. It is always possible to convert the cube back to a standard cube.
Comparison Dimensions Standard
A look into the ABAP dictionary shows the different amount of objects. This makes it clear, that the flatcube has a lower number of objects.
Objects in the ABAP dictionary (flatcube top – standard cube bottom)
The FEMS-Pushdown is a mechanism provided by a new statement generator by Microsoft. The query statements are split up thus a parallel processing within in the database is possible. In order this to work a query has to have at least two FEMS. FEMS stands for Form EleMent Selektion. A FEMS can be a key figure with a specific selection – for example a sales number from a specific customer and region.
With the standard statement generator query filters are applied quite late during processing. Thus it can happen that a large amount of data is transferred to the application server, before filters are applied. On the application server the data gets filtered and aggregated.
With the FEMS-Pushdown the calculation and filtering are pushed into the database to a bigger degree. Therefore the overall query performance can be improved. The FEMS-Pushdown can only be used in combination with a flatcube.
The following table shows an extreme example.
|event id||event text||duration in seconds||counter||% of overall duration||duration in seconds||counter||% of overall duration|
|19900||3.x Analyzer: Server||2,766||46.262||2,50%||0,013||37||1,40%|
|3110||OLAP: Data Selection||58,352||0||52,75%||0,009||0||0,97%|
|3100||OLAP: Read Data||0,005||10.646||0,00%||0,005||4||0,54%|
|3200||OLAP: Data Transfer||3,907||115.901||3,53%||0,006||70||0,64%|
2. Testing and results
- In the beginning the prerequisites and the different variants were not totally clear.
- I recommend to read carefully the SNotes as well the blogs from Martin Merdes (links at the beginning of this blog post).
- The setup was easy. The SNotes could be implemented without issues.
- I recommend to set the central SNote of the column store as a favorite. New SNotes with improvements and bug fixes are regularly released.
- The setup and the conversion of a column store index or flatcube is easy to understand. But it still requires some knowledge (where to click / reports).
- If the column store is going to be used in a productive system, I recommend a brief training for your colleagues, especially those working in Application Management Services.
2. Loading Performance
In order to evaluate the loading performance several cubes from different application were loaded with a total of over 520 million rows. The graph shows that flatcubes have a big advantage regarding to the loading performance. There are no dimension tables for flatcubes which costs time to build.
overall average loading time per 1 million rows
In the graph below there is an excerpt of four applications. In general there are some variances. Application 3 is a relatively small application with slim cubes. The load times for flatcubes within the application 3 can be a disadvantage. The cubes of the other applications contains between 100 and 200 fields.
loading time in per application per 1 million rows
3. Storage Comparison
Regarding to the storage, massive reduction can be achieved through applying the column store index and even more by applying flatcubes. The average reduction for the column store indexes is 60% and for the flatcubes 87%. The storage reduction in application 6 has been compared with a “normal” cube and a SPO.
4. Reporting Performance
The standard cube is excluded in the comparison of the reporting performance. The standard cube was slower than the other three variants – with the exception if two queries. The three variants are the following: flatcube, fems pushdown and BWA. The time was measured within the transaction RSRT. The cache was not used during execution.
reporting performance I
reporting performance II
The numbers shown in the graph are relative factors based on the execution time of the flatcubes. Therefore any number higher than 1 indicates a worse performance and any number lower than 1 indicates a higher performance.
Conclusion on reporting performance:
- In most cases the flatcube is faster than the BWA.
- The fems pushdown can accelerate a query enormously. On the other side the performance of the fems pushdown can be worse than even a standard cube.
- In some cases the BWA is unbeatable. The BWA is able to execute special functions to calculate some key figures in-memory. The other variants rely on the rather slow application server. An example of a function is the elimination of internal business volume (in German: Binnenumsatzeleminierung).
In general there are two recommendations regarding the reporting performance:
- If the application is using none to only a few master data objects, such as attributes or hierarchies, the fems pushdown is in 9 of 10 times the fastest and can be applied in most cases without further performance testing.
- With heavy usage of master data and and a complex data model, the usage of the fems pushdown has to be evaluated query by query. (The usage of the fems pushdown can be activated per query.)
The column store implementation from Microsoft is a feasible solution for reducing storage requirements and accelerating reporting performance. Especially if there is no BWA or upgrade to SAP HANA not yet planned.
The column store index can be applied at any time. There will be a reduction in storage need and at least a slight bump in reporting performance.
The flatcube can be used for every cube, that is not loaded into a BWA. The reduction in storage requirements is the highest with the flatcube. In general the reporting performance is better than the standard cube or the cube with the column store index. The big advantage is also the reduction of loading time.
The fems pushdown has to be assessed for every query. There are cases in which the new statement generator produces a sub-optimal statement. It is therefore recommended to always use the lastest SQL Server version in order to use the latest improvements.
In our case the column store is used to reduce the amount of data in the BWA, and acts as a fallback solution and substitute in cause there is a bigger problem with the BWA.
We could solve a problem within one application. The loading time with the standard cube was approximately 10 hours. With the flatcube the loading time could be reduced to 1 hour.
At last a big thank you to Martin Merdes who actively supported me.
thank you for the details ?
Thanks Corvin . ive been very impressed with the SQL Flat Cube /column store performance for both update and reporting.
Its data cacheing (for updates from DSO To Cubes) is also very impressive.