In my previous HANA and R blogs, I have been forced to create .csv files from HANA and read them on R…an easy but also boring procedure…specially if your R report is supposed to be run on a regular basis…having to create an .csv file every time you need to run your report it’s not a nice thing…
After spending some time reading and researching R…I finally came to a library that can read data from any relational database and being HANA, ODBC capable, the work is just a piece of cake -;)
For this examples, we must install two libraries: RODBC and Plotrix and create the DSN connection as shown here…
HANA already provides us a driver, so we’re cool
Here we’re going to “Add…” a new “System DSN”
Assign a name for the “Data Source Name”, “Description” is optional and “Server:Port” should be of course filled.
Now…we’re ready to go to our HANA studio an create a table and a stored procedure…
GetTicketsByYearMonth |
---|
CREATE PROCEDURE GetTicketsByYearMonth(IN var_year NVARCHAR(4),IN var_month NVARCHAR(2)) LANGUAGE SQLSCRIPT AS BEGIN select count(bookid), carridfrom sflight.snvoice where year(fldate) = VAR_YEAR and month(fldate) = VAR_MONTH group by carrid into TICKETS_BY_YEAR_MONTH; END; |
After we run our Stored Procedure…we have all the information in the table…Ok…only two fields…today was a hard day…I’m tired -:P
Finally…we can code some R! First, we’re going to create a Fan Plot (The Plotix library is needed for that one) and then a Bar Plot…I used the same code for both, so just replace the comment on one by the other one and run it again…I know…I’m being lazy again…but at least I’m not reinveting the wheel -;) Two codes with only 1 different line? No thanks…
Fan_Plot_Graph.R |
---|
library(“plotrix”) library(“RODBC”) ch<-odbcConnect(“HANA”,uid=”P075400″,pwd=”***”) res<-sqlFetch(ch,”P075400.TICKETS_BY_YEAR_MONTH”) fan.plot(res$TICKETS,labels=res$CARRIER, main=”Tickets for December 2011″) #barplot(res$TICKETS,names.arg=res$CARRIER) odbcClose(ch) |
The code is very simple…we call the libraries we need, we stablish a communication to our DSN, we fetch the data from the table, we create the graphics and finally we close the connection.
And here come the graphics…
I will keep investigating on this way to connect HANA and R…more blogs should be on the way -;)
Nice demo of how to put the high speed HANA db engine to some possibly advanced analysis that goes beyond counting and averaging item line counts… *g*
Thanks for that and keep on pushing out your blog posts!
regards,
Lars
For sure I will keeping this blogs! -:D
Greetings,
Blag.
Hi Alvaro,
I try to use R as you describe it within the 30 – day free trial HANA in cloudshare.
It is not clear where/ how to make the first step for the data source
(“HANA already provides us a driver, so we’re cool”. Seems I’m not coll enough)
Please give me a hint, is it done through BOBJ Data Services?
Gracias,
Sorin
Sorin: I’m just creating an ODBC connection.
Control Panel –> Administrative Tools –> Data Sources (ODBC)
When you get there, on the first tab User DSN, press Add and you’re going to have the first screen of this blog 🙂
Greetings,
Blag.
Thanks Alvaro, it works (with another table):
> head(res)
CARRID CONNID PAYMENTSUM
1 AA 17 194.332,71
2 AA 17 187.802,50
3 AA 17 193.300,75
4 AA 17 192.712,75
5 AA 17 192.797,31
6 AA 17 190.953,28
Can you give me a hint how I could save (somehow changed) data from R into the table / into HANA,
Gracias again
Sorin
Sure 🙂
Just follow this link http://scn.sap.com/community/hana-in-memory/blog/2012/01/29/r-meets-hana
And pay special attention to the last comment…
Greetings,
Blag.
Hi Blag,
How do I access a calculation view using R?
I tried with :
tab_claims<-sqlFetch(ch,”_SYS_BIC.claims/CLAIMS_LOCATION”)
barplot(tab_claims$Handling_Cost,names.arg=tab_claims$Location, main=”Claims by Location”)
But i am getting error:
“Error in tab_claims$Handling_Cost :
$ operator is invalid for atomic vectors”
claims : package
CLAIMS_LOCATION : Calculation view
Location : attribute
Handling_Cost : Measure
Thanks & Regards,
Rahul Rajagopalan Nair
Rahul:
My new hero posted an awesome comment in one of my blogs…
http://scn.sap.com/community/hana-in-memory/blog/2012/01/29/r-meets-hana#comment-334850
All you need is read it…and start coding 😉
Greetings,
Blag.
Thanks a lot Blag. I was prepared to wait for 24 hrs to get a response. But you are simply superb. “Blag the Flash” 🙂
Thanks & Regards,
Rahul Rajagopalan Nair
Alvaro, can you please elaborate more in detail the first couple of steps? What are the images with “Create new data source” and “Add new system DSN”? When I try to open the ODBC connection, it tells me
[RODBC] ERROR: state IM002, code 0, message [unixODBC][Driver Manager]Data source name not found, and no default driver specified
so obviously the driver is not specified. I’m on AWS, Studio runs on Linux. First I could not install RODBC because sql.h was missing, I got it from unixODBC-devel, maybe it was not the right one?
Cheers,
— Micha
Hello Micha:
My screen are for Windows and according to your error, you’re working on Linux.
For Linux you can setup the ODBC like it’s explained in this blog:
http://scn.sap.com/community/developer-center/hana/blog/2013/02/07/php-rocks-on-sap-hana-too-linux-version
Greetings,
Blag.
Great Post Alvaro!
But it’s not so clear for me, SAP PAL and R are different tools, OK, but what’s the advantage to use one instead other? They both deliver the same?
Thanks,
Rodrigo.
Rodrigo:
Yep…you got things a little bit mix up 🙂
PAL –> Predictive Analytic Library is a set of tools that are used in SAP HANA to add Predictive capabilities.
R –> Open Source Statistical Programming Language based on the S programming language. R is completely standalone and it can be used with SAP HANA either via ODBC or RServe which is a library that allows to connect R and SAP HANA.
PAL is great if you don’t know R…R is great if you know how to use it…but then again…being a programming language the learning curve of R is way more greater than learning PAL which is already optimized to achieve some calculations on top of SAP HANA 🙂
Greetings,
Blag.
Developer Experience.
Great Alvaro, Now it’s clear for me! 🙂
Thanks,
Rodrigo.
Hi Everyone,
I am a newbie, and I want to know if we can connect R via RODBC from my machine to HANA DB, that is on another server. Also, does R Server a mandatory tool to run RStudio, because when installing R studio on my machine, I had to install (before) R, and for my understanding it is the R server.
Thanks,
Imane
Hello Imane:
Yes, you can connect R to an HANA DB using RODBC…as it was demonstrated in this blog…as HANA is going to always be on a external server…
R is the programming language…RStudio is an IDE that allows us to work better with R…
You have your SAP HANA Server let’s say on AWS and from your local machine you can create an ODBC connection and call it from R using RODBC 🙂
Greetings,
Blag.
Development Culture.
Thank you very much for your answer.
While searching on how to connect R to HANA server, I found that we can do it with RHANA (developed by SAP HANA team). A friend of mine told me that it is no longer supported by SAP HANA, but I didn’t find any documentation that prove it, neither where to download it. I am interested by it because in the SAP HANA documentation about the different R-Hana integration, they said that RHANA is build for transferring large amount of data in the contrast of JDBC/ODBC.
Do you have any idea about it?
Greetings,
Imane,
http://scn.sap.com/thread/3172072 :Answer to my question is here, for the guys that are interested in it.
Hi,
Do we need a certain version of compatibility between R and SAP HANA?
Best regards,
Imane
I have tested with R 3.2.3 and Rserve 1.8.4…however…SAP recommends (as has been officially tested) to use R 2.15…
Greetings,
Blag.
Development Culture.
Expert Analytics User Guide,
SAP Predictive Analytics 2.4
2015-11-30
p. 100
The list of supported R versions has been extended to include version 3.1.2.
–> probably correct also for HANA / R
Thanks Sorin! I didn’t have an idea 🙂 Good to see that 3.1.2 is supported…and sure…I assume it must be the same for HANA/R 😉
Greetings,
Blag.
Development Culture.
SAP Predictive Analytics and SAP HANA are completely different products.
So, there’s no link between the supported versions of RSERVE of either of them.
Instead of guessing, why not simply check SAP note
and find that currently even R 3.2 and RSERVE 1.7-3 are supported for SAP HANA and R integration?
Thanks Lars 🙂 Nothing good comes from guessing 🙁
Greetings,
Blag.
Development Culture.
Hi Alvaro,
Is there a possibility to list tables in a database ? I have the name of the data source but I don’t know what are the tables that it contains to fetch one of them.
regards,
IMane
I think the answer is :
(Using RODBC)
sqlTables(channel)
Imane
Hi Alvaro,
Can we use same host where R & R serve installed to all Non-prod systems. Such as for Dev, QA & Prod as well. if yes, is there any system specification for installation.
Or we need separate R Installation for each system. Please suggest.
regards,
Prashanth