SAP HANA Developer Access Beta program – get the best out of your test-drive!
A couple of days ago I was Lucky enough to receive an invitation to the SAP HANA Developer Access Beta program. For those of you not aware what that is, there is an excellent FAQ on SDN which you can find here:
http://wiki.sdn.sap.com/wiki/pages/viewpage.action?bc=true&pageId=257820641
The following paragraphs will describe 7 steps towards your first HANA reporting by using SAP HANA Sandbox.
Step 1 – Cloudshare
In your invitation mail there is a link to cloudshare.com. Create a new account, log on to the desktop and follow the following YouTube video on how to set up your own HANA system:
http://www.youtube.com/watch?v=9NKSYgt5eX4
A great tip is given on how to use remote desktop instead of using the web version which comes with Cloudshare. This is highly recommended as navigation is much easier in the remote desktop.
Logon details for your RDP session are given on the Cloudshare page each time you activate the environment:
Step 2 – Getting big data
So you managed to set up a system and created a RDP session. Good job! Now to find some “big data”. In this article I’m using data which I downloaded from the guys over at Infochimps. I’ve used the following data sources which in total will give me about 20 million records.
- AMEX Exchange Daily 1970-2010 Open, Close, High, Low and Volume
- NASDAQ Exchange Daily 1970-2010 Open, Close, High, Low and Volume
- NYSE Exchange Daily 1970-2010 Open, Close, High, Low and Volume
The files contain stock exchange data. Great recipe for finding something interesting.
You will be getting a bunch of separate csv files. Use the daily prices ones. For simplicity sake I have merged all files into three separate csv files. Good old DOS can help you with that by using the following command:
copy *.csv importfile.csv
Make sure to execute the command in the same directory your files are placed. Replace importfile.csv with something recognisable (like AMEX.csv, NASDAQ.csv, NYSE.csv).
Step 3 – Create your table in HANA
And now the fun begins! You need to create a table which holds all your data records. Remember those files I downloaded from Infochimps? The files have the following structure:
That means I need to replicate that structure into my HANA table. You can create your table in the HANA studio using the modeler or by using SQL.
Modeler:
Please note that you should create the table in your own schema and use Column Store (to witness awesome speed later on).
SQL:
I prefer SQL because it’s faster. The following command will create your table:
create column table “S0001432066”.”NASDAQ”(
“EXCHANGE” VARCHAR (10) not null,
“STOCK” VARCHAR (10) not null,
“DATE” DATE not null,
“PRICEOPEN” DECIMAL (15,2),
“PRICEHIGH” DECIMAL (15,2),
“PRICELOW” DECIMAL (15,2),
“PRICECLOSED” DECIMAL (15,2),
“STOCKVOLUME” DECIMAL (15),
“PRICECLOSEDADJ” DECIMAL (15,2),
primary key (“EXCHANGE”,”STOCK”,”DATE”))
Step 4 – FTP and import your files into the HANA system
The guys over at SAP will make you do a little treasure hunt in order to find the user id and password for the FTP server. Go into you HANA system and execute the following SQL statement:
Select * from SYSTEM.FTP_SERVER
Et voila, a username and password (masqued for obvious reasons):
Take note of what is mentioned on where to store the files. More specifically you should create a folder on the server equal to your SCN number (in my case S0001432066).
Fire off your favourite FTP client (mine is FileZilla):
Create a directory and store your files:
Take note that next to my files containing the data there is a so called “ctl” file. These files are required in order to be able to load data in your created (NASAQ) table. The files have the following content:
Import data
into table S0001432066.”NASDAQ”
from ‘AMEX.csv’
record delimited by ‘
‘
fields delimited by ‘,’
optionally enclosed by ‘”‘
error log ‘Text_Tables.err
NASADAQ is the name of my created table, AMEX.csv the file I will load.
If required, additional information can be found in this post:
How to load CSV files into HANA
Time to import your 20 million something records into HANA! Execute the following SQL statement:
IMPORT FROM ‘/dropbox/S0001432066/AMEX.ctl’
Note the name of the folder I created in step 4 (folder S0001432066), /dropbox/ is a prefix. After a while you will get the following result back:
Statement ‘IMPORT FROM ‘/dropbox/S0001432066/AMEX.ctl” successfully executed in 7:22.046 minutes – Rows Affected: 0
Hang on I hear you thinking. 0 Rows? No it’s not actually. You can check by firing off the following SQL statement:
select count(*) from NASDAQ
That looks promising! Let’s check some more:
We have data! Look at the log file:
Fetched 30 row(s) in 15 ms
Wow!
Step 5 – Create an Analytic view for reporting
First step create a package which will hold your view:
Create a so-called Analytic View:
Give it a name:
Select your table:
Drag your objects into the attributes and measures sections:
Validate, save and activate your view. Well done! We can use this view in Explorer and Excel. Note that you can preview your data and even auto document if by using these buttons:
Important! In case preview fails it is likely you have to “grant” your schema by executing this statement:
grant select on schema s0001432066 to _SYS_REPO with grant option
Replace s0001432066 with your own namespace ofcourse.
As an extra step your could create an extra table which holds the stock names. If you follow the same procedure as for creating the table with the stock records, you can join your two tables and have texts together with the stock names. Sequence for this would be:
- Create and load your text table
- Create an attribute view
- Link your analytic view together with your attribute view
Result would be:
Preview:
Step 6 – Using Explorer to report on your data
On your desktop a shortcut to Explorer can be found:
Fire it off and be sure to enter the correct server which can be found in your invitation mail:
Go to manage spaces :
Select your view and press “New”:
Give it a name:
Put in your objects:
And press ok! Don’t forget to index your data.
Refresh your page and you have a new information space:
Press it and start analysing! This one is for you Steve. Apple stock prices in the year he had to leave Apple:
Be sure to select enough data to get a nice trend.
Step 7 – Using Excel to report on your data
There is also a possibility to use Excel pivot tables for your reporting needs. Fire off Excel and connect to your view:
Choose Other/advanced:
Select the MDX connection:
Enter server credentials (check your invitation mail if not known):
Select your view:
You now have your data in a pivot table format. Set some filters and analyse your data at great speed!
Note that data retrieval is at great speed, but Excel makes it a bit sluggish in a RDP session.
Many thanks for bearing with me till the end and good luck with this great opportunity to test drive HANA!
thanks a lot for this awesome getting started blog. We'll make sure every new user of the Developer Center sees this 🙂
And you just earned yourself another free month on the Developer Center - your access is extended until Feb 28. I hope you'll share more of your findings with the community!
cheers
--Juergen
Congrats , I don have HANA Developer Access but seeing your blog i understood how it would look like if i have access.
Please do contribute more of these blogs, So that we can learn and experiment the same when we get a chance for HANA Hands-On.
Regards,
Krishna Tangudu
If you want to try the steps yourself, go to http://sapdevcenter.com and request access to the sandbox systems. Without an invitation code (aka access code) it might take a few days, but basically every SCN user can get access.
--Juergen
Thanks for your blog and it is lot helpful. Please keep continue your sequence.
Best regards,
Ramesh Choragudi
This blog is awesome and it helped me. Please keep continue your sequence on this.
Best regards,
Ramesh Choragudi
Hi Ronald,
No doubt it is a very great blog and I got lots of learning by reading your blog. But the clod share currently has SPS05 and the environment is completely different and specially from step 5 from creating analytic view the wizard has changed. It will be really great if you can update the current blog to work for SPS05.
Thanks,
Kaustubh
Your blog is amazing for jump start on HANA.
Appreciate your time and efforts.
-Naveen
Hi,
I am on hana server 7: hanasvr-07 00
I have been following the NASDAQ guide from the wiki. I retrieved the data from infochimps and uploaded it to the folders:
/S0004092614 and /dropbox/S0004092614
I tried following the various guides for importing the data; i continuously keep getting the error. I followed the guide posted here:
http://scn.sap.com/people/r.konijnenburg/blog/2011/12/26/sap-hana-developer-access-beta-program--get-the-best-out-of-your-test-drive
I then followed the guide for the table for USERS in this guide:
http://wiki.sdn.sap.com/wiki/display/inmemory/Importing+CSV+files+into+SAP+HANA
It still gives an error:
SAP DBTech JDBC: [2]: general error: Cannot open Control file, /dropbox/S00040926142/USERS.CTL
No error log is generated either .Am i missing something?
Hi Zeeshan,
I have stumbled upon the same problem faced by you. Could you let me know how you resolved your problem of SAP DBTech JDBC: [2]: general error: Cannot open Control file?
Hi Ravi,
I tried different things with the file; I do not recall the entire list of things which I did; i believe i touched upon the folder settings; the permissions; i also checked the file format and removed a couple of parameters from the load file syntax.
Hi Ronald:
Can you please share the link for downloading the data files ? I have got HANA box access.
Thank you,
Ronald, Raghu,
I just got access to the sandbox.
Can anyone post details on implementing Step2 - toget big data from Infochimps?
I am assuming that you can move further only after completing Step2 to create Step3, 4, etc.
Thanks
Certainly Helpful. 🙂
I was amazed with the clear path of explanation and instructions provided.
I'm glad to take it from here and reach the next level. ➕
Thank you much Ronald Konijnenburg
Hi Ronald,
Can we publish Calculated Measures or calculated view in BO Explorer.
If Yes Please give us input as iam not able to see calculated Measure in Bo explorer.
Thanks In Advance.
Phani Rayasam
Hi Ronald,
can you please re-upload the pics to this blog space (the dropbox pics disappeared)?
Thank you.
Uwe Fetzer
Hi experts,
could please explain about how to generate data for calmonth,and cal quarter using calday in SAP Hana.
i have a table contains day wise data. i have to transform into month and quarter.
Regards,
sivaumar
sorry
i am new to this.how can i ask for access code from http://sapdevcenter.com.Can someone guide me with details
some errors
when i go to manage spaces .the error message "Failed to retrieve the data source list,Request timed out"
who knows why?
Problem with ftp.
I followed the guidelines for exchaaging files using ftp. I figured out the ftp user using the sql command as indicated above. But as soon as I try to open an ftp session I get the following message:
The host 'ftp.sapdevcenter.com' is unreachable.
I tried several times. It seems to have fallen asleep. Is there anything you can do about this?
Problem is solved. It was due to the ftp program (which always worked so far). Nonetheless I tried another one and it worked.
hi Ronald,
thank you for the great blog which helped me to better understand HANA. unfortunately my BusinessObjects Explorer stated "Failed to index the information space. Request timed out". 😯
Hello community!
I have got a very big and strange problem at Step 4 - FTP and import your files into the HANA system.
- I am using FILEZILLA on Windows 7 as my FTP-Client.
- I logged in with the Username & Password, everything works fine
- BUT: I can not load my *csv-Files into my folder, which I created, there is always the following error code: 226 Transfer done (but failed to open directory).
=> The strange thing is, that I CAN load my *csv-Files in any other existing folder, but not in my own!??!
The Name of my folder is: S0002866913
Thank you very much for helping.
Problem solved.
There you can see the solution.
Regards
Hi Experts,
While loading the data i am getting below error:
Could not execute 'IMPORT FROM '/dropbox/DINESH/AMEX.ctl'
SAP DBTech JDBC: [139]: current operation cancelled by request and transaction rolled back: CSV file has some bad lines and cannot create bad file. some records are loaded.
My Control file is:
Import data
into table P1527334943."NASDAQ"
from 'AMEX.csv'
record delimited by ' '
fields delimited by ','
optionally enclosed by '"'
error log 'Text_Tables.err
Thanks,
Dinesh
Hi Ronald,
What a powerful article from you.
Really it will be a good start to all who want to taste HANA and will be delicious as you added main masala to it.
Thank you and keep posting.
Regards
Raj
Tx a lot, made me smile!
Hi Ronald,
A very good getting started guide, thanks. One issue - I am seeing an error with file permissions, in that I do not have permission to view the .err file generated during the CSV upload into HANA. I also do not have sufficient authorisations to change the .err file permissions via FTP, so I am blind as to what the errors in the .err file actually are.
In case anyone else has the same issue (or knows how to fix it), the problem is here
Hi,
I uploaded the data into Linux box under S0005971056, and used the import command, it is not giving any error, but it is producing the error file same size as the CSV file, unfortunatley I am not able to open the error file.
I try to change the file permissions to 777, but chmod command is failing all the time.
I have access to hanaserver-03, you can look at amexnew.csv and amexnew.ctl.
Thanks in advance for your help.
Hey,
thanks for this great post. But i cannot see any pictures... Is your dropbox closed or so? Can you please attach the screenshots again to this post.
Thanks
Regards
/edit: Sorry.. it was the proxy... the pictures will show on my personal notebook Thanks
What would be the procedure once you have say NYSE data loaded and AMEX data, and now you want to look at both of those combined? I realize I could just merge the two data files, but what if I wanted HANA to UNION these two tables essentially, would I do that as an Analytic view? Can someone give me some high level steps on what I would want to do there?
Hey Guys,
I am new to SAP and trying to self educate SAP HANA.
I have followed the procedure step by step listed in this document and tried ever possible trick people have shared but still not able to import .csv file into HANA.
My Folder Name is: P1716453917
I download .csv files from infochimps and merged them together via cat *.csv > AMEX.csv (using mac where copy command doesnt work, cat command returns the same result)
Created .clt file, this is what it looks like:
Import data
into table P1716453917."NASDAQ"
from 'AMEX.csv'
record delimited by '\n'
fields delimited by ','
optionally enclosed by '"'
error log 'Text_Tables.err
Uploaded both the files to my user folder via FileZilla
Path: /P1716453917
The step i am having issue with is importing data into HANA. I am using tcommand
IMPORT FROM '/dropbox/P1716453917/AMEX.ctl'
and it throws the following error:
"Could not execute 'IMPORT FROM '/dropbox/P1716453917/AMEX.ctl'' in 73 ms 542 µs . SAP DBTech JDBC: [2]: general error: Error processing a statement at "{\rtf1\ansi\ansicpg1252\cocoartf1187\cocoasubrtf340" "
File permissions for both my files are set to 777.
I have tried modifying the import part to include another /dropbox, but doing so throws "Cannot open Control file" error.
I have already spend few hours on it and getting no where.
I am hoping someone from this thread could triage my issue.
Appreciate any advice or resolution in advance.
Thanks
Hello Everyone,
Can someone send me those files please, I can't download them......what are infochimps??
Best Regards
Hi Everyone,
As HANA studio SP5 shows query execution and processing time once you execute the query, has anyone experienced comparatively longer server query execution time for column table than row table ?
Hi,
Youtube link given at very beginning is not available. KIndly check.
reg,avinash M