How to utilize your SQL Skills in SAP BW – Part 2 – Understanding Query Execution
Continuing on the topic of SQL Skills from me previous blog…. Another very useful tool to analyze query run time performance… the run schedules for a query.
Whenever users complain that queries are running slow – it feels despairing to not know what exactly is causing the issue. In many cases all the places we look for keep returning blanks and on one hand you have a DBA who is interested in the query that we are hitting and on the other hand the application level people (including me) who have no clue as to what exactly the DBA wants.
Continuing on my storyline….
Whenever you talk to any person in SAP BI, the amount of SQL discussed was very low. Also the same was noted by my colleagues in the data warehousing fraternity where most of the talk would be on joins, costs, indices etc and we would talk about free characteristics and query builder.
This led me to build a sort of preferred abstinence from SQL with a view that when you have a front end, SQL can be given a pass and this is what I observed in many too.
But then wait … SQL is important … because ultimately that is the language your database speaks and your database holds the data. So why am I thinking that SQL is important for SAP Business Warehouse?
A. Query performance tuning:
I am not going to talk about query performance tuning since that is quite extensively discussed and need no further deliberation.
A query written in Query builder is ultimately an SQL query which is fired against the database. I think it is common information that the SQL Query can be viewed by using the trusty RSRT.
Execute + Debug gives you this Window.
What is Display Run Schedule ?
When you run queries against a database, the database will try to optimize the access times for the same using its own Optimizers. A cost is established for each line of the query and the total cost for the same is computed. The run schedule also tells you which joins are costly and which access / query component is taking most of the time.
The run schedule looks something like this:
I have used a query on a DSO – you will find more interesting stuff when you query on a cube where you can see the joins on cubes and hash joins and temporary tables for multiproviders.
I am running this query on my Development system with no data on it and hence the low cost and number of rows.
The Rows highlighted in Orange are the tables being accessed / Indexes being accessed.
Clicking on them will open a popup for detailed statistics on the same. These are the same statistics that you run when loading data and or collect using tools like BRCONNECT for Oracle.
It gives you the last date of analysis and the currency of statistics. If you have statistics that are old then you can use the Analyze button to regenerate the same on your own – though it is recommended to be done by BASIS / DBA roles for want of authorizations and scheduling.
This is where run costs come in handy especially to identify trouble spots in your query much before the user can do so and thus making sure that if at all there is an issue then it is not with your database.
The most critical part is to see if your aggregates and indices are being hit. The “~” symbol usually represents an Index and the aggregates are represented by their F Tables and if an aggregate is being hit then you will not find any references to the F table of the cube.
Also the currency of the statistics are very crucial for calculating the costs and ensuring that alls well in the world of the database.
Statistics is too huge a topic to write about alongside here– will cover it in later blogs…
Pardon me if I have talked about Oracle databases only since so far I have come across Oracle Instances only and would love to see what Max DB , SQL Sever ,DB2 etc have to offer
The obvious next thing would be to talk about the SQL Query that is generated when executing a query and finding more about what exactly happens — will address the same in a separate blog whilst keeping things concise…
I have highlighted what I usually notice and look for when presented with bad query performance – since by design – all the obvious choices of aggregates etc are exhausted and you need some additional juice to find out what is happening…
In case there are any additional goodies that I might have missed please add to the comments section…