In my previous post, I wrote about how we adapted our portal usage analytics tool Click Stream to run on a HANA database. We had the feeling that a switch from a traditional database to SAP HANA would speed up the application, but with the little volume of data in our development environment, everything was fast anyways. So we decided to take real production data from one of our larger customers and load them into our HANA box in order to compare the performance.
A global consumer electronics and health care company was so kind to provide us with more than one year of usage data from a portal with approximately 70.000 unique users per month from 75 countries. The Click Stream database has 55 tables and the longest table with approximately 100 Million records in this case was the relation of logins to the groups of the logged in user.
We received the data as an Oracle dump and first loaded them into one of our Oracle databases. We found out, that the import options into HANA are still a bit limited. ODBC worked just fine with some tools, but not with others. The only ETL tool that is officially supported by SAP is data services, but we did not have data services installed back then, so we decided to use ordinary CSV files instead. While the Oracle dump was transferred from the customer’s network into our network, we prepared all the scripts for the CSV import.
On Oracle side we used the built in functionalities of the free SQL Developer tool to export data in a UNICODE format. It took approximately 60 minutes to create all the flat files. After that it took approximately 10 more minutes to copy the flat files into the file system of our HANA server. We created all the database objects e.g. tables and constraints using a script, which is part of the Click Stream package and typically executed automatically once you connect the application to an empty database. The actual import of the data was done with the command IMPORT FROM FILE. To run this command, we connected to HANA via SSH. The total import took approximately 30 minutes. Before the data were ready to use we had to create sequences, which was done in approximately one minute. So all in all the data transfer from Oracle to HANA took less than two hours.
We don’t have full details about the customers Oracle hardware, but it was a full blown production system with 45GB of RAM and a couple of multi core CPUs. For our own development system we requested proposals from six vendors. In the end we picked Dell due to the responsiveness of their sales team as well as the low price of the server and installation service. The HANA server is part of the Dell R910 series and comes with two 10core CPUs and 128GB of RAM.
As a basis for our performance comparison, we used statistical data from the customer’s productive Oracle database. They had launched approximately 10.000 analytics in Click Stream and on an average each of the analytics took about one minute to calculate. For each of these reports, an audit table within the database told us the exact milliseconds of runtime as well as all the settings and selected time frame. So we were able to rerun the exact same reports on the exact same dataset on HANA. The result was a performance increase of factor 45 on average.
What does this mean to a user? If a report runs more than 5 minutes on Oracle, it is calculated in 7 seconds on HANA. This makes it easy to interactively explore the portal usage. If you see room for improvements in one report, you can directly open up other analytics to check how to achieve the improvements. If e.g. you find out that a user needs more than 3 clicks on an average to find the relevant content in the portal, you can optimize the information architecture easily by just looking at the favorite exit pages. If one of those pages is difficult to find, just promote it as a link on the landing page or on the corresponding topic overview page of your portal.
Responsive real time analytics allow you to get deep and interactive insights into what is happening in your portal! And if you’re curious about gathering stats on your SAP Portal usage, check out Click Stream by Sweetlets.