Enabling cold store data access using view for external access (The ‘8’ View) for aDSOs in BW reports with mixed modelling scenario
As promised in my earlier blog post, in this article I will be explaining the possibilities of using the view for external access or the external SAP HANA SQL view for aDSOs or the generated ‘8’ view of aDSOs. The main idea is to gain optimum performance avoiding unnecessary access to cold storage when only Hot data is requested. For details about the ‘8’ view please have a look into the above mentioned blog post link.
For this illustration, I will use the same aDSO used for the earlier blog post : SALESADSO, which has some sample sales data for calendar year 2019 & 2020. Executing suitable DTO rule, I have moved all the data residing in partitions covering calendar year 2019 to external cold storage (IQ). So, the view for external access for the aDSO : /BIC/ASALESADSO8, has the COLD_STORE_FLAG column populated as : ‘X’ for all data belonging to year 2019 & ‘ ‘ for all data belonging to year 2020.
Now let’s think about consuming this view into a BW data model using mixed modelling architecture in such a way that unless a request has been made to cold store data, it should not access the cold store causing unnecessary performance penalty.
Meaning, we will build a calculation view based on the ‘8’ view & then consume the calculation view into a composite provider & finally a report on top of the composite provider. The purpose is to explore how can we determine the access to cold store data exposing the flag as a consumer of single value input variable in the application layer of reporting.
Now the challenge is that if we simply expose the field COLD_STORE_FLAG as a variable in the report through the HCPR, user can select either ‘ ‘ (blank or empty string, no whitespace) or ‘X’ if we build a single value input variable in frontend, which is the goal here, we could potentially have a multi input variable which can request both the data. But I will try to explain how to build a solution using single value input variable around this flag. The challenge with this approach is if a user, for example, requests data from December 2019 to February 2020 in report, & selects based on the flag values available with just the field exposed as it is, user can see either 2019 year’s data (if user selects COLD_STORE_FLAG = ‘X’) or 2020 year’s data (if selected ‘ ‘). But with selection of ‘X’ we need to secure that the user is able to access both 2019 & 2020 years’ data. This situation can be handled in multiple ways :
- By using multiple input variable where user needs to select both the values for COLD_STORE_FLAG in such situations.
- By handling it using an input parameter & necessary filter criteria in calculation view.
- By using a step 2 exit variable where for user input ‘X’ against flag value, the step 2 variable can pass ‘*’ (all) as a filter to COLD_STORE_FLAG column.
I will explain the second approach in this blog post, which is handling of the flag completely in calculation view level. I would explaiin two ways of doing it :
Approach#1 : using single input parameter :
First step is to consume the view in a calculation view in below way:
I have made 2 nodes out of the view called HDB & HDB_IQ. In HDB I have placed a filter on COLD_STORE_FLAG = ”. So, it contains only data from the hot data store (Active Data Table of aDSO), while the other node has no filter, containing all the data combining hot & cold storage.
In the node Union_1, I have created a new target column : CS_FLAG, with custom constant mapping of the node values as ‘ ‘ (HDB) and ‘X’ (HDB_IQ). This way the pruning is secured.
I have created an input parameter called IP_CS_FLAG (Access Cold Store) as optional single input of type VARCHAR & length 1. This is to pushdown the filter with flag value passed from application layer on access of the data tiering storage by using a filter expression.
If no value is passed through it, the takes the unassigned value, thus only requesting the hot store data avoiding the extra traffic to access the external cold storage.
Coming to the fact that the COLD_STORE_FLAG can have 2 possible values with enabling Cold Storage & movement of some data into it ie. ‘X’ for cold store data & ‘ ‘ for hot store data.
Now in our reports, we want to have a universal variable which enables users to select if
- They want to access data from cold storage(‘X’) – this may lead to some performance penalty
- Or they just want to access the recent data residing in Hot storage of HANA nodes(‘ ‘).
For securing no data loss while selecting the variable as ‘X’, we need to access both HANA & IQ (Cold store) data when user selects the variable value as ‘X’, to handle cases like accessing data from a period which overlaps between hot & cold storage.
The above scenario is handled using the filter in node ‘Aggregation_2’ using a filter on the created target column : CS_FLAG & restricting it on the input parameter value.
The filter expression basically reads the input parameter value passed, if it comes with a value ‘X’ (user requests data from both Hot & Cold store), we set the CS_FLAG to go through the HDB_IQ node as per the definition of the field pruning & both 2019 & 2020 years’ data are requested, if the input parameter is not fed or filled with blank value ie. ‘ ‘, then according to the constant mapping it should only look through the HDB node with ‘ ‘ values for the flag, ensuring only Hot store data is selected & no cold store table is accessed causing unnecessary performance penalty.
Now, let’s just make a select query on the calculation view without passing any input parameter value :
We can see it only picks the data relevant for year 2020 where the COLD_STORE_FLAG is blank. A planviz shows that the executed plan hits only the Active data table residing in hot store.
Now, let’s execute a select query on the same again but this time with the input parameter value filled in as ‘X’.
And this time it shows all the data irrespective of the flag value both from hot & cold store for year 2019 & 2020. The planviz shows it hits both cold store table & active data table this time.
Approach#2 : Using double input parameters
Create a calculation view with two input parameters. One input parameter for taking the user input as to determine if access the cold store is needed or not & the other one is to derive the actual filter value to pass for COLD_STORE_FLAG column.
In this case, we have IP_USER_INPUT input parameter to intake the user input, with default value blank & the other one is IP_CS_FLAG, which is derived using a stored procedure. The stored procedure has the input parameter which is fed by the first input parameter & then if user selects ‘X’, it passes ‘*'(both Hot & Cold Store should be accessed), else blank(only Hot data store is accessed).
Now let’s create a procedure which can take the IP_USER_INPUT as input parameter & populate & return the value of COLD_STORE_FLAG as a feeder to IP_CS_FLAG.
The logic of which is as below :
CREATE PROCEDURE "ZCSFLAG" ( IN IP_USER_INPUT NVARCHAR(1), OUT IP_CS_FLAG NVARCHAR(1) ) LANGUAGE SQLSCRIPT READS SQL DATA AS BEGIN IF :IP_USER_INPUT = '' THEN IP_CS_FLAG := ''; ELSE IP_CS_FLAG := '*'; END IF; END;
Now use this procedure as the derivation factor for IP_CS_FLAG.
Now we need to connect the input parameter IP_USER_INPUT to the same of this procedure. This can be done in semantics of the calculation view under tab : Parameters/Variables from Input Parameter Manage Mapping option as shown below :
And then in the lowest Aggregation node we filter the COLD_STORE_FLAG column by using a match function with the IP_CS_FLAG.
And it picks only those 9 records pertaining to year 2020 from hot data store.
Approach#3 : which does not support proper pruning of maneuvering query to right data tiering storage
However, I tried another exercise with one input parameter & the IF-ELSE handling in the MATCH funtion using filter instead of doing the same in a stored procedure. The results were same, but the difference was in prunning.
Let’s assume, I have just one input parameter (VARCHAR(1)) in the calculation view & the default value is blank.
Now in lowest Aggregation_1 node, the filter on COLD_STORE_FLAG is like below :
if('$$IP_CS_FLAG$$' = 'X',match("COLD_STORE_FLAG",'*'),match("COLD_STORE_FLAG",'$$IP_CS_FLAG$$'))
The filter expression basically reads the input parameter value passed, if it comes with a value ‘X’ (user requests data from both Hot & Cold store), we set the COLD_STORE_FLAG to show all values using a match function with ‘*’ , other values than ‘X’, which is ‘ ‘ should only match with ‘ ‘ values for the flag, ensuring only Hot store data is filtered & selected for output.
But the observation from planviz is that it accesses both Hot & Cold store tables even when a blank value is passed.
In terms of performance it is not good if the query hits the external cold store table even if only Hot data is requested & fetched as output. So, the last option is not something we should aim for.
Our job now is to consume this calculation view along in a composite provider & then build a report with the variable based on the input parameter. Let’s create an masterdata info object for holding the input parameter possible values for browse help. I created one called CS_FLAG with type CHAR length 1, the master data table of which looks like below :
So, on browse help in report, it will only show the values as unassigned & ‘X’ to choose from.
As next step, the HCPR : CP_SALES is created & this info object is assigned to the input parameter in HCPR output tab (IP_CS_FLAG in Approach#1 & IP_USER_INPUT in approach#2)
The next step is to create an input ready single value variable on this info object CS_FLAG to restrict the input parameter in the report using the same.
Now simply create a report with meaningful rows & columns dragged into. I have created a sample report : ZQ_SALES_TEST as below just to show the behaviour of the report data controlled by the flag with a single variable in selection prompt screen which was built on CS_FLAG.
Let’s now have a look at the selection prompt screen of the report :
So, as mentioned earlier, I have just included the data tiering controller flag variable to showcase the output variance. As expected from the masterdata entries, the value help shows only two possible values to choose from.
Now let’s first execute the report without having any value passed to this variable, since it is optional.
We can see only 2020 data is shown back, which is in the hot storage.
Now let’s do it again with the variable value passed as ‘X’.
And this time we get data from both hot & cold storage ie. both 2019 & 2020 year’s data.
Hope this blog will help gaining some insights about the generated ‘8’ view for aDSOs being consumed in mixed modelling scenario from BW/4 reporting perspective & securing correct control of data across different data temperature tiering systems with optimum performance in mind.
You are welcome to share any comment/feedback/experience regarding this topic for enriching our mutual knowledge bank 😉