R meets HANA
If you read my last blog called HANA meets R you will remember that we read data from HANA into R directly, without having to download an .csv file, but using ODBC. This time, we’re going to read data from HANA as well, but after do some nice tricks on R, we’re going to post back the information into HANA.
Keep in mind, that is not an standard SAP solution. This only relies on a custom R package that can work with ODBC enabled tables, and like any custom packages, there are many limitations…anyway…this should be fixed when SAP released the official R into HANA integration.
In my previous blog Prediction model with HANA and R we create a stored procedure in HANA to populate a table called TICKETS_BY_YEAR, then on R we calculate the prediction for the next year and generate a nice graphic showing both the real data and the prediction. So…of course I’m not going to repeat all that.
This is the R code that we need to use…
R_Meets_HANA.R |
---|
library(“RODBC”) ch<-odbcConnect(“HANA”,uid=”P075400″,pwd=”***”) Flight_Tickets<-sqlFetch(ch,”P075400.TICKETS_BY_YEAR”) period=Flight_Tickets$PERIOD tickets=Flight_Tickets$TICKETS var_year=substr(period[1],1,4) var_year=as.integer(var_year) var_year=var_year+1 var_year=as.character(var_year) new_period=gsub(“^\d{4}”,var_year,period) next_year=data.frame(year=new_period,stringsAsFactors=FALSE) prt.lm=lm(tickets ~ period) pred=predict(prt.lm,next_year,interval=”none”) period=next_year tickets=pred PREDICTION_TICKETS<-data.frame(period,tickets) sqlDrop(ch,”PREDICTION_TICKETS”,errors=FALSE) sqlSave(ch,PREDICTION_TICKETS,rownames=”id”) odbcClose(ch) |
After we execute this code, we can check on HANA that our new table called PREDICTION_TICKETS was created…
And the data was populated as expected…
You may wonder…which are the limitations? Everything seems to work like a charm? Easy…not a lot, but important limitations…
* We don’t have a way to validate if the table exists or not.
* We must delete the table before doing the insert, otherwise is not going to work.
* Even when the date field was called PERIOD, R named it “year” and pass it into HANA.
* We can’t specify the type of the fields, nor the lenght
* We are forced to have an additional column with a numeric index, that we can nicely call “Id”…
As I said early…this is just a custom package that allows us to play…this shouldn’t be used as a final solution, but as a playground. Enjoy!
Hi Alvaro,
It would be very helpful if you can fix the images.
Thanks
Venkat
Done! Pictures were fixed! 😉
Greetings,
Blag.
Hi Alvaro,
Thanks! Just a comment. Almost all limations are correct (snif...) but one that can be corrected
... We are forced to have an additional column with a numeric index...
With this changes could be corrected:
Flight_Tickets<-sqlFetch(ch,"P075400.TICKETS_BY_YEAR",rownames ="id")
...
sqlSave(ch,PREDICTION_TICKETS,rownames="id",addPK=TRUE)
I have done it with others examples and works, I not still tested with this one.
Best
David
Hi Alvaro,
thanks a lot, did you find a way to consume HANA Views through sqlFetch? I tried _SYS_BIC.package/VIEW but get always table not found on channel message.
Best regards
Alex
Alexander:
No...I got the same error...my assumption...is that it's not possible... Attribute/Analytical/Calculation Views are not exactly tables...as are mainly a combination of tables and with an important difference from regular views...they don't contain any data until they are called and executed...RODBC wasn't built think on In-Memory...so they will only access tables on a regular schema...still...will keep trying until I can find a way to make it work...there's always a way 😉 If I make it work, I will publish it as a blog with all the explanations 🙂
Greetings,
Blag.
Hi there and a Happy New Year to all!
RODBC is totally capable to access the HANA model data (analytic/calculation views).
The problem here is that the sqlFetch method is basically performs a
SELECT * FROM <table/view>.
To to that the ODBC driver first needs to figure out what data to expect and therefore tries to look up the table meta data in the HANA dictionary (e.g. in system views like sys.table_columns_odbc).
But if we check this view, we find that the entries there are not listed with
TABLE_NAME = package/<view_name>.
Instead we find our column description associated with
TABLE_NAME = package/<view_name>/proc/tabletype/VAR_OUT
That's the case for calculation views and completely correct, as the table types define the output structures for procedures and calc. views (which are internally represented as stored procedures).
That's the reason for the "table not found error".
Now, how can you access the table then?
Easy, just by using sqlQuery instead!
Very important for this is to remember to always include at least one aggregation function in your SQL query when you access calcviews.
This is a big difference to plain SQL tables and views and shouldn't be overlooked at any time.
Calcviews really are calculation scenarios - not just joins and unions.
They are specifically designed to deliver aggregated data and HANA does execute the calculation scenario automatically when you query such a calcview.
(BTW: an easy way to get a generic SQL statement that includes aggregation on all keyfigures as well as all attributes is to start a data preview and click on the little [i] View Log... button. In the dialog box that opens then the SQL statement is copy-able on display 😉 )
'nuff said, let's see action:
Open the connection:
library("RODBC")
ch<-odbcConnect("HAN",uid="LARS",pwd="****")
The calc view I want to access is named "efashion/CE_EFASHION".
I used the data preview method and modified the SQL statement to fit my requirements:
EFashion_sales<-sqlQuery(ch,
'SELECT "YEAR",
sum("QUANTITY_SOLD_1") AS "QUANTITY_SOLD_1",
sum("AMOUNT_SOLD") AS "AMOUNT_SOLD"
FROM "_SYS_BIC"."efashion/CE_EFASHION"
GROUP BY "YEAR"' )
Now the query had been executed and the data is available via EFashion.
Easy cheesy - but probably not too obvious.
What's important here to remember is that Calcviews don't make sense if you don't actually query them with aggregation functions over the key figures.
Cheers,
Lars
Lars!
Happy New Year to you too 🙂
Thanks for this awesome response! You totally make me look like a newbie 😛
Will make sure to investigare more next time 😉
Greetings,
Blag.
Sorry Blag, didn't mean to put you in a bad light.
These HANA specifics really aren't too obvious most of the time and a lot of concepts have been hidden from the end users.
Custom data access from 3rd party programs is only supported since a few months, so there is just no way anybody can claim to have huge amount of experience with that.
Looks like we have to find out all the details together in the upcoming months 😉 .
- Lars
Lars...no need to be sorry...sometimes being put in a bad light is exactly the thing one needs to realize that there's always more to learn...so I should say "Thanks" 🙂
And you're right...SAP HANA is still evolving and 3rd party support is evolving as well...
Like you said...we will for sure work together to keep showing the world how awesome SAP HANA can be 😉
Greetings,
Blag.
Hi Lars and Alvaro,
all the best for the new year, especially a lot of new experiences with HANA in 2013.
@Lars: this was exactly, what I have been looking for - thanks for the perfect explanation!
Best regards
Alex
Alex:
The best for you in this new year as well 🙂 So glad that Lars jumped in...SAP HANA is huge...and the more people we have trying to make the most out of it...the better for everyone 😉
Greetings,
Blag.
Thanks a lot Lars.
A note for fellow developers, to access calc views it is sqlQuery and not sqlFetch. Just in case you did not pay attention to Lars's reply.