Skip to Content

Ever try to design a report using a large dataset?  Doing this can be a performance nightmare as you may not have applied the filters you wanted or the query to the database may take a long time to complete.  So how do we improve this?  One feature that exists in Web Intelligence to help with this is called Data Sampling.  This feature already existed in the product for many relational databases through the unx format and in SP05 support for this feature was added for relational HANA connectivity.

I had never used this functionality, so the first question for me was, where do I go to turn it on.  The answer is, within the query panel by pressing on the Query Properties button as seen below:

QueryPropertiesButton.png

Pressing this button brings up the Query Properties screen where there is a section called “Sample” which has two properties, “Sample Result Set” and “Fixed” as seen here:

QueryProperties.png

You may notice that the “Fixed” property is greyed out, this is because currently this functionality is not available for HANA.  What the fixed functionality provides is so that each refresh will return the same data each time.

My next question was, what does this sample result set actually do and how does it work.  I referred to the Web Intelligence User Guide which in section 2.1.3.7.3 tells us that it applies a database level restriction on the query.  So what does this really mean.  The first thought was, this must mean it is part of the SQL that is being generated, so I clicked on the view SQL button within the Query Panel, as seen here:

ViewSQLButton.png

This brings up the SQL Statement that will be run.  In my case, this is what I saw:

ViewSQL.JPG

What we see is that part of the SELECT statement, we have added TOP 500, as that was my Sample Result Size I had set earlier.  Additionally, at the end of the SQL statement, we see ORDER BY RAND().  If you set this property when you first see the query panel when creating a new document, you never need to wait for a large query to return, and you can come back anytime to uncheck or recheck this property as needed.

How does this differ from the “Max rows retrieved” setting?  This setting queries the database for all values, but then truncates the result that is actually displayed and is when you will see the partial results warning next to both the refresh status at the bottom of the report, as well as in the data tab.  To access the data tab, you click the button at the top right of the viewer as seen here:
DataView.png

On this screen we can see the following warnings related to these two settings which let you know if there are partial results, sampled results, or both.  The both scenario is when the sampled result size is greater than the max rows size.  Below are images for what you see for each of these scenarios.

Partial Results:

MaxRows.JPG

Sample Results:

DataSample.JPG

Both:

SampleMaxRows.JPG

I personally see this as a simple and great way to speed up the initial designing of reports.  Of course we would want to remove this setting before doing any final testing to make sure our system would be sized to handle the actual data being returned as well as verifying the calculations with the larger data are what you expect.

I hope that this blog was helpful and that you will try this functionality out when next designing a document.

Also, please bookmark our Webi Bulletin as it is a great page that is updated regularly with information on the Web Intelligence product.

To report this post you need to login first.

Be the first to leave a comment

You must be Logged on to comment or reply to a post.

Leave a Reply