In our blog “BIA Changes Everything!” the SAP NetWeaver RIG announced to publish several topics around the SAP NetWeaver BI Accelerator.
As this is the eighth in a series of BIA blogs being developed by the SAP NetWeaver BI RIG, I hope you continue to find value and provide comments. Prior to reading this blog, please also consider reading the seventh blog, “Disaster Tolerance and Backup And Recovery for BI Accelerator”, if you haven’t read it already,
The specified item was not found..
In this blog we would like to draw your attention to some new features, improvements and major parameter changes that have been introduced for the BI Accelerator (BIA) with the latest revisions and SAP notes for BIA release 7.00. They have been communicated via SAP notes.
The default values mentioned in the SAP notes work well and in general do not have to be adjusted. However, in specific situations they allow for fine tuning of BIA.
- FEMS Compression (SAP notes 1074953 and 1118425; SAP BI 7.0 SP16; Revision 40)
- Package Wise Read (SAP notes 1002839, 1157582 and 1018798; SAP BI 7.0 SP17; Revision 44)
- Parameter max_cells_one_index (SAP note 1002839 and Revision 48)
- Enabling Parallel Indexing (SAP note 1158597 and 1023843; Revision 44)
- Sparse Compression (sparse search was introduced with revision 44; see also note 1152958 for revision 45)
- Reorg parameters, especially also for heterogeneous landscapes (SAP note 1163149; Revision 44)
1. FEMS compression
The database and the BI Accelerator can return data with different levels of detail. An important difference is that BIA can return data that takes into account the structure elements. This means that BIA can return information about the structure element (FEMS = Formula Element Selections) to which a certain value belongs. The database explodes the structure elements and returns single records that must be assigned to the correct structure elements by the OLAP.
With notes 1074953 and 1118425 the BI Accelerator usually returns fewer data records than the database. The amount of data read due to the convex hull is usually significantly larger than the union of the FEMS. The result set of the database is only very occasionally smaller than that of BIA (when a very large number of records qualify for many different structure elements).
Sorting in structure elements (which uses a lot of resources) never takes place in the BIA scenario. With the improvements in note 1074953 (BW-BEX-OT-BIA: FEMS, BIA optimization) and note 1118425 (BC-TRX-BIA: BIA 7.00: Activate performance feature FEMS compression) the compression of FEMS as optimization in the BIA part (merge of duplicate structure elements in one structure element with information to which structure element it belongs) and the OLAP part (being able to handle this information) has been implemented. It reduces the result set which has to be transferred from the BIA to the BI system.
FEMS compression has no impact on the size of BIA indexes.
The feature FEMS compression is set to “off” as a default value. You can switch it on by adding the following line in section [OLAP] in the configuration file “TREXIndexServer.ini” (see also SAP note 1118425):
fems_compression = on
There used to be a bug with revision 42 as well as a problem with non-cumulative key figures which is solved with note 1172709 (BI 7.0 SP19). SAP note 1239230 (BI 7.0 SP20) is also related to FEMS compression: If there is a BIA index for a non-cumulative InfoCube and RSDRI is used for access, this note has to be implemented.
Finally we deliver an improved memory handling for the TREX index server with BIA revision 48 which is mostly helpful if FEMS compression is used.
2. Package Wise Read
The data volume that is read from the BIA server is too large. The problem is not the data volume on the BIA server, but the volume of data that is transferred from the BIA server to the BI system after the query filter conditions have been applied and the aggregation has been carried out (see also SAP note 1018798). There are two important structural differences between the BIA server and the database:
1) The data is transferred using a Remote Function Call (RFC). This interface is not designed for transferring mass data.
2) The data is transferred in one package. It is not read in small packages (status of development before BI SPS15 (BI SP17) and BIA revision 44), as it is for the database where this happens cursor-based.
With BI SPS15 (BI SP17) and BIA revision 44 SAP delivers a possibility to read data package-wise via the RFC from the BIA server.
With this also large data volumes can be read from the BIA (with the performance of the RFCs). So this capability will make sure that if the query runs without memory problems against the database, it will at least also run against the BIA. When computing a query, the BI requests query result sets from the BI Accelerator.
Without package wise read huge query result sets transferred via RFC could lead to memory problems and performance decreases.
The size of the query result sets sent back from the BIA to the BI can be customized now to suitable packages by the BIA feature package wise read. BI reads the BIA results in packages of configurable size (= chunk_size) as long as the BIA indicates to the BI that there is still data to be retrieved. The chunk_size basically limits the number of records multiplied by the number of key figures to be transferred from the BIA server to the BI in one package.
The functionality will not eliminate all cases, where queries dump due to memory problems in the BI application server. In this case change your query definition accordingly e.g. by a split of the query into several queries with smaller result sets using more filter criteria. This also leads to more readable results for the user. We recommend that you use the BI as a real OLAP tool, which means to filter and expand the query as required. If you cannot reformulate and change the query to reduce the data volume that is to be read, you can deactivate the BIA server for this individual query. Call transaction SE16, table RSRREPDIR and specify the technical name of the query as COMPID to select the relevant records. Change field NOHPA from ” ” to “X”.
Note that the performance of such a query should generally not deteriorate significantly if you switch to the database. This is because the main work is the processing in the OLAP processor and notthe data selection.
To enable package wise read and in order to customize the package size set the parameter “chunk_size” in the section [olap] of the TREX Index Server configuration file to the recommended value of “2000000” see SAP note 1157582. The value “0” means that no packaging will be done which corresponds to the default setting. If the values of parameters “max_result_size_one_index” and “max_result_size_merge” according to SAP note 10002839 are raised too large the query might be successful in the BIA server when using the package wise read but may cause memory issues on the BI server.
SAP note 1159305 is related to the package wise read functionality: Wrong Data can occur for non-cumulative data when package-wise data reading of BIA data is switched on which is solved with this note.
3. Parameter max_cells_one_index
In the BIA standard configuration the maximum size of the overall result set of a BIA query is limited by the following parameters (until revision 47):
“max_result_size_one_index” = 1300000
“max_column_number_one_index” = 3
The value of the parameter “max_result_size_one_index” determines the maximum number of lines in the result set from one (potentially split) index. The value of the parameter “max_column_number_one_index” determines the minimum number of columns the result set must have for the parameter “max_result_size_one_index” to be evaluated. If both conditions match, the error 6952 is returned (“AttributeEngine: not enough memory.”).
Those two parameters restrict the size of the final result of the query in the TREX index server.
Given the default settings of max_result_size_one_index = 1300000 and max_column_number_one_index = 3:
Should the result set have three columns or more (number n), then the query will return at most 1300000 records, corresponding to n times 1300000 cells. With n=100 columns, this might return 130 million cells. The value for the parameter “max_result_size_merge” determines the maximum number of lines to be merged (the aggregation results of all split indexes are collected and merged).
The value for the parameter “max_result_size_merge” determines the maximum number of lines to be merged (the aggregation results of all split indexes are collected and merged).
The value of the parameter “max_column_number_merge” determines the maximum number of key figures to be merged. If both conditions match, the error 6952 is returned (“AttributeEngine: not enough memory.”). Those two parameters restrict the size of temporary data structures, which would be generated during execution of join operations.
The default values are:
max_result_size_merge = 2000000
max_column_number_merge = 3
With BIA revision 48 the new parameter max_cells_one_index (TREXIndexserver.ini) replaces parameters max_result_size_one_index and max_columns_one_index to simplify the possible limitation of a query result size.
The new parameter max_cells_one_index (TREXIndexserver.ini) has a default value of 40.000.000 (cells is rows multiplied by columns). For details see SAP note 1002839.
4. Enabling Parallel Indexing
You can optimize the filling of BIA indexes on the BI and BIA side. For the optimization of the indexing procedure on BI side see also SAP note 1161395. If you are not able to use the full resource capacity on BI side (which could e.g. be the case if you are restricted to a small number of batch processes you are allowed to use or there are cubes to be indexed where each contains only one single big request) then you do not have to think about how to optimize the BIA indexing procedure on BIA side as then the BI is the restricting factor.
If the parameter parts_per_host was set to value “1” (= old parameter; new customizing via TREXAdmin tool see note 1163149) before revision 44 the indexing of an InfoCube was very slow as only one core could be used (for the prepare optimize and check sparse task).
In contrast multi-user and MultiProvider queries were potentially fast. If more parts_per_host were used, the indexing procedure was faster, as one core can serve one part. In contrast the query throughput of multi-user queries can be worse. Wait situations can occur as a single query uses all CPU cores when all index parts are processed in parallel.
Therefore the parameter “tmpx_threads” has been introduced in order to have fast multi-user queries with parameter parts_per_host set to value “1” as well as performance improvements for the indexing procedure through parallel indexing on BIA side.
The number of threads for indexing can be configured, which enables the TREX index server to parallelize the indexing of attributes of a BIA index see SAP note 1158597 (+ check parameter see SAP note 1023843).
Value 1 means single threaded. If you increase this parameter, more CPU cores will be used in parallel for filling each BIA index. If you do an initial filling of your BIA and you fill the BIA indexes sequentially, it is recommended to raise the parameter tmpx_threads to the value 4 for dual-core CPUs and 8 for quad core CPUs. We usually recommend setting the values as described above.
Nevertheless it could be necessary to reduce the number of “tmpx_threads”, e.g. to half of the CPU cores, should the executions mainly of the “prepare optimize” and also “check sparse” task be executed in parallel to a very high degree. But this really strongly depends on the organization of your BIA index loading procedures and has to be tested, evaluated and measured in the individual customer case.
5. Sparse Compression
Radical compression can often be achieved for tables containing sparse data. Sparse data contains very many null or repeated values, so that in a multidimensional cube representation the interesting data points are either scattered thinly over the space or clustered in relatively small subspaces. The BIA uses a number of additional compression techniques that greatly reduce the memory and disk footprint of sparse data. As data volumes are increasing significantly and main memory is a big cost factor, further compression of data is needed. With BIA revision 44 sparse search has been introduced (see also note 1152958 for revision 45).
6. Reorg parameters, especially for heterogeneous landscapes
Locality and parallelism are conflicting objectives. Locality avoids communication (join and search) and parallelism speeds up computation (aggregation). The limiting factors are memory and CPU per blade. In order to achieve the aim of high query performance and high utilization of the available resources with no hotspots and low amount of communication the reorg algorithms have been defined.
The partition algorithm aims at the following:
- Divide indexes into small and large
- Split large indexes into as many parts as there are cores in the landscape
- Distribute the large parts evenly to the landscape
- Distribute the small parts by the number of small indexes on each host
- For OLAP and Join indexes calculate Join clusters
- Put all indexes in a cluster on the same host
New functionalities of the partition algorithm are introduced with BIA revisions 44 and 47:
- Initial Reorg (revision 47)
- Reorg called after create index
- Optimization of the distribution in heterogeneous BIA landscapes (SAP note 1163149 and revision 44)
- You can configure these ‘Reorg’ parameters to optimize the load distribution of heterogeneous BIA landscape and to use the available hardware resources in an efficient way. As e.g. the Clovertown CPUs are much more powerful than the Irvindale CPUs, the distribution of index parts needs to be adjusted.
- Parameters: Multiplier: can be set; it determines the values of other parameters; Index parts: shows the number of index parts the split indexes consist of
- Example: 2 dual core CPUs per blade = 4 weighted cores and Index parts = 4; therefore the Multiplier = 1; 2 dual core CPUs per blade = 4 weighted cores and Index parts = 1; therefore the Multiplier = 0.25