Skip to Content

As you might have discovered by now…I love R…it’s just an amazing programming language…

By now…I have integrate R and SAP HANA via ODBC and via the SAP HANA-R integration…but I have completely left out the SAP HANA OData capabilities.

For this blog, we’re going to create a simple Attribute View, expose it via SAP HANA and then consume it on R to display a nice and fancy graphic 😉

First, let’s create an Attribute View and call it FLIGHTS. This Attribute View is going to be composed of the tables SPFLI, SCARR and SFLIGHT and will output the fields PRICE, CURRENCY, CITYFROM, CITYTO, DISTANCE, CARRID and CARRNAME. If you wonder why so many fields? Just so I can use it in another examples 😉

OData_AttrView_001.jpgWith the Attribute View ready, we can create a project in the repository and the necessary files to expose it as an OData service.

First, we create the .xsapp file…which should be empty 😛

Then, we create the .xsaccess file with the following code…

.xsaccess

{

          “exposed” : true,

          “authentication” : [ { “method” : “Basic” } ]

}

Finally, we create a file called flights.xsodata

flights.xsodata

service {

          “BlagStuff/FLIGHTS.attributeview” as “FLIGHTS” keys generate local “Id”;

}

When everything is ready…we can call our service to test it…we can call it as either JSON or XML. For this example, we’re going to call it as XML.

OData_AttrView_002.jpg

Now that we know its working…we can go and code with R 😀 For this…we’re going to need 3 packages (That you can install via RStudio or R itself), ggplot2, RCurl and XML.

HANA_OData_and_R.R

library(“ggplot2”)

library(“RCurl”)

library(“XML”)

web_page = getURL(“XXX:8000/BlagStuff/flights.xsodata/FLIGHTS?$format=xml”, userpwd = “SYSTEM:******”)

doc <- xmlTreeParse(web_page, getDTD = F,useInternalNodes=T)

r <- xmlRoot(doc)

carrid<-list()

carrid_list<-list()

carrid_big_list<-list()

price<-list()

price_list<-list()

price_big_list<-list()

currency<-list()

currency_list<-list()

currency_big_list<-list()

for(i in 5:xmlSize(r)){

  carrid[1]<-xmlValue(r[[i]][[5]][[1]][[2]])

  carrid_list[i]<-carrid[1]

  price[1]<-xmlValue(r[[i]][[5]][[1]][[8]])

  price_list[i]<-price[1]

  currency[1]<-xmlValue(r[[i]][[5]][[1]][[7]])

  currency_list[i]<-currency[1] 

}

carrid_big_list<-unlist(carrid_list)

price_big_list<-unlist(price_list)

currency_big_list<-unlist(currency_list)

flights_table<-data.frame(CARRID=as.character(carrid_big_list),PRICE=as.numeric(price_big_list),

                          CURRENCY=as.character(currency_big_list))

flights_agg<-aggregate(PRICE~.,data=flights_table, FUN=sum)

flights_agg<-flights_agg[order(flights_agg$CARRID),]

flights_table<-data.frame(CARRID=as.character(flights_agg$CARRID),PRICE=as.character(flights_agg$PRICE),

                          CURRENCY=as.character(flights_agg$CURRENCY))

ggplot(flights_table, aes(x=CARRID, y=PRICE, fill=CURRENCY)) + geom_histogram(binwidth=.5, position=”dodge”, stat=”identity”)

Basically, we’re are reading the OData service that comes in XML format and parsing it into a tree so we can extract it’s components. One thing that might call your attention is that we’re using xmlValue(r[[i]][[5]][[1]][[2]]) where i starts from 5.

Well…there’s an easy explanation 🙂 if we access our XML tree…the first value it’s going to be “feed”, the second “id” and so on…the fifth is going to be “entry” which is what we need. Then for the next [[5]]…inside “entry”, the first value it’s going to be “id”, the second “title” and so on…the fifth is going to be “content” which is what we need. Then for the next [[1]]…inside “content”, the first value it’s going to be “properties” which is what we need. And for the last [[2]]…inside “properties” the first value it’s going to be “id” and the second it’s going to “carrid” which is what we need. BTW, xmlValue will get the value of the XML tag 😛

In other words…we need to analyze the XML schema and determine what we need to extract…after that, we simply need to assign those values to variables and create our data.frame.

Then we create an aggregation to sum the PRICE values (In other words, we’re going to have the PRICE grouped by CARRID and CURRENCY), then we sort the values and finally we create a new data.frame so we can present the PRICE as character instead of numeric…just for better presentation of the graphic…

Finally…we call the plot and we’re done 🙂

OData_AttrView_003.jpg

Happy plotting! 🙂

To report this post you need to login first.

7 Comments

You must be Logged on to comment or reply to a post.

  1. Witalij Rudnicki

    Blag, nice hack 🙂 but then what would be the advantage of going OData/XML way i/o ODBC/SQL? One disadvantage of XML would be the semantic overhead, so I assume it has to transfer lots of extra bytes (in form of tags). The second would be that you need to know the XML structure upfront to code the data import. What if it changes in the future? Surprising there is no JSON import library for R…

    On the plus side is obviously HTTP communication (simpler network configuration). Anything else?

    Thanks, mate.

    (0) 
      1. Alvaro Tejada Galindo Post author

        Vitaliy:

        Thanks for your comments 🙂 Yes…RJSON is really good but being a blog…I wanted to show something a little bit more complex…like XML processing…I want people to learn R 😉

        Regarding OData/XML versus ODBC/SQL…well…simply put…OData/XML is a little bit faster…

        proc.time() – ptm

           user  system elapsed

           4.89    0.17    5.96

        compared to ODBC/SQL

        proc.time() – ptm

           user  system elapsed

           6.60    0.02    6.77

        This might not be a huge performance improvement…but it’s a really small example…with big data…it might really matter 😉 Also…I took the whole process into account and not only the data extraction…

        Greetings,

        Blag.

        (0) 

Leave a Reply