R Integration with HANA – Test of “Outside In Approach” – Part 1
R is an open source programming language which is popular among data scientists, statisticians and data miners for developing software for statistical computing and data visualization graphics. It is widely used for advance data analysis.
SAP HANA also provides excellent support for advance analytics. R language can also be used for SQL procedures in HANA and HANA supports embedding of R code. However R environment supports many and much more advanced set of statistical functions and growing set of new libraries of functions. When R environments more advanced capabilities are needed in HANA SQL procedures, then HANA can be integrated with R environment as shown in image below.
In this approach, the advanced analytical and statistical capabilities of R environment are leveraged by transferring the data from HANA to R and aggregated results are sent back to HANA.
However, how about a scenario where HANA is treated as just another database and connected using ODBC or JDBC ?. I thought of testing a use case in which R session can fetch data from HANA Tables into its environment. I term it as Outside in approach as this way of reading data can be viewed from R developers perspective.I feel if this works it opens lots of possibilities in the scenarios where ERP data needs to be viewed and merged with data from other outside domains. However I am thinking of using example data from a previous exercise in which i was able to save twitter messages into a HANA table. Although lot of linguistic analysis can be done in HANA, much more extensive analysis using regular expressions is possible in R environment.
RStudio console snapshot images of a successful connection to HANA system are provided below.
tweets <- sqlFetch(connection,”MyHanaSchema.Table”) reads the data from the HANA Table and
Here is the snapshot image of the tweets data in R .
Note : Duplicate twitter message also exists in HANA Table.
There are few parameters in sqlFetch() function that allows control on number of records that can be fetched, next i am going to test that and write my own select statement and run it with function sqlQuery() to execute that SQL or even try JDBC connection with HANA ?