A couple of days ago I was Lucky enough to receive an invitation to the SAP HANA Developer Access Beta program. For those of you not aware what that is, there is an excellent FAQ on SDN which you can find here:
The following paragraphs will describe 7 steps towards your first HANA reporting by using SAP HANA Sandbox.
Step 1 – Cloudshare
In your invitation mail there is a link to cloudshare.com. Create a new account, log on to the desktop and follow the following YouTube video on how to set up your own HANA system:
A great tip is given on how to use remote desktop instead of using the web version which comes with Cloudshare. This is highly recommended as navigation is much easier in the remote desktop.
Logon details for your RDP session are given on the Cloudshare page each time you activate the environment:
Step 2 – Getting big data
So you managed to set up a system and created a RDP session. Good job! Now to find some “big data”. In this article I’m using data which I downloaded from the guys over at Infochimps. I’ve used the following data sources which in total will give me about 20 million records.
- AMEX Exchange Daily 1970-2010 Open, Close, High, Low and Volume
- NASDAQ Exchange Daily 1970-2010 Open, Close, High, Low and Volume
- NYSE Exchange Daily 1970-2010 Open, Close, High, Low and Volume
The files contain stock exchange data. Great recipe for finding something interesting.
You will be getting a bunch of separate csv files. Use the daily prices ones. For simplicity sake I have merged all files into three separate csv files. Good old DOS can help you with that by using the following command:
copy *.csv importfile.csv
Make sure to execute the command in the same directory your files are placed. Replace importfile.csv with something recognisable (like AMEX.csv, NASDAQ.csv, NYSE.csv).
Step 3 – Create your table in HANA
And now the fun begins! You need to create a table which holds all your data records. Remember those files I downloaded from Infochimps? The files have the following structure:
That means I need to replicate that structure into my HANA table. You can create your table in the HANA studio using the modeler or by using SQL.
Please note that you should create the table in your own schema and use Column Store (to witness awesome speed later on).
I prefer SQL because it’s faster. The following command will create your table:
create column table “S0001432066”.”NASDAQ”(
“EXCHANGE” VARCHAR (10) not null,
“STOCK” VARCHAR (10) not null,
“DATE” DATE not null,
“PRICEOPEN” DECIMAL (15,2),
“PRICEHIGH” DECIMAL (15,2),
“PRICELOW” DECIMAL (15,2),
“PRICECLOSED” DECIMAL (15,2),
“STOCKVOLUME” DECIMAL (15),
“PRICECLOSEDADJ” DECIMAL (15,2),
primary key (“EXCHANGE”,”STOCK”,”DATE”))
Step 4 – FTP and import your files into the HANA system
The guys over at SAP will make you do a little treasure hunt in order to find the user id and password for the FTP server. Go into you HANA system and execute the following SQL statement:
Select * from SYSTEM.FTP_SERVER
Et voila, a username and password (masqued for obvious reasons):
Take note of what is mentioned on where to store the files. More specifically you should create a folder on the server equal to your SCN number (in my case S0001432066).
Fire off your favourite FTP client (mine is FileZilla):
Create a directory and store your files:
Take note that next to my files containing the data there is a so called “ctl” file. These files are required in order to be able to load data in your created (NASAQ) table. The files have the following content:
into table S0001432066.”NASDAQ”
record delimited by ‘
fields delimited by ‘,’
optionally enclosed by ‘”‘
error log ‘Text_Tables.err
NASADAQ is the name of my created table, AMEX.csv the file I will load.
If required, additional information can be found in this post:
Time to import your 20 million something records into HANA! Execute the following SQL statement:
IMPORT FROM ‘/dropbox/S0001432066/AMEX.ctl’
Note the name of the folder I created in step 4 (folder S0001432066), /dropbox/ is a prefix. After a while you will get the following result back:
Statement ‘IMPORT FROM ‘/dropbox/S0001432066/AMEX.ctl” successfully executed in 7:22.046 minutes – Rows Affected: 0
Hang on I hear you thinking. 0 Rows? No it’s not actually. You can check by firing off the following SQL statement:
select count(*) from NASDAQ
That looks promising! Let’s check some more:
We have data! Look at the log file:
Fetched 30 row(s) in 15 ms
Step 5 – Create an Analytic view for reporting
First step create a package which will hold your view:
Create a so-called Analytic View:
Give it a name:
Select your table:
Drag your objects into the attributes and measures sections:
Validate, save and activate your view. Well done! We can use this view in Explorer and Excel. Note that you can preview your data and even auto document if by using these buttons:
Important! In case preview fails it is likely you have to “grant” your schema by executing this statement:
grant select on schema s0001432066 to _SYS_REPO with grant option
Replace s0001432066 with your own namespace ofcourse.
As an extra step your could create an extra table which holds the stock names. If you follow the same procedure as for creating the table with the stock records, you can join your two tables and have texts together with the stock names. Sequence for this would be:
- Create and load your text table
- Create an attribute view
- Link your analytic view together with your attribute view
Result would be:
Step 6 – Using Explorer to report on your data
On your desktop a shortcut to Explorer can be found:
Fire it off and be sure to enter the correct server which can be found in your invitation mail:
Go to manage spaces :
Select your view and press “New”:
Give it a name:
Put in your objects:
And press ok! Don’t forget to index your data.
Refresh your page and you have a new information space:
Press it and start analysing! This one is for you Steve. Apple stock prices in the year he had to leave Apple:
Be sure to select enough data to get a nice trend.
Step 7 – Using Excel to report on your data
There is also a possibility to use Excel pivot tables for your reporting needs. Fire off Excel and connect to your view:
Select the MDX connection:
Enter server credentials (check your invitation mail if not known):
Select your view:
You now have your data in a pivot table format. Set some filters and analyse your data at great speed!
Note that data retrieval is at great speed, but Excel makes it a bit sluggish in a RDP session.
Many thanks for bearing with me till the end and good luck with this great opportunity to test drive HANA!