If you follow my blogs (I hope you do 😡 ) then you know I really love the R programming language but I also love SAP HANA 😘 and in the past I have dealt with integration between those two:

HANA meets R

R meets HANA

Sanitizing data in SAP HANA with R

But…those integrations were not done using the SAP way…which means, they are not supported or endorsed by SAP…

Gladly…as of today, there’s an official SAP way to do it! 😎

First, we need to read and follow all the instructions detailed in Get your own SAP HANA DB server on Amazon Web Services by the most awesome Juergen Schmerder. (Believe! It took me less than 10 minutes to get my SAP HANA running on my laptop…really…a piece of cake).

With our SAP HANA instance up and running, we can dedicate ourselves to the funny part…the #R part 😉

Go to your AWS Management Console and under Amazon EC2, launch a new instance…

SUSU_Linux.png

You have to choose SUSE Linux Enterprise with 32 bit. I tried with 64 bit and it wasn’t funny…didn’t work and I lost a lot of time…32 bit for the win!

For the installation, you can follow this link SAP HANA Database Development Guide – Integration with R programming language, but at least in my case, I need to deal with a lot of difficulties, that gladly I’m going to write down in this blog, so you don’t have to deal with them 😆

First, we need a compiler as we’re going to compile #R from it’s source.

Get a compiler…
sudo zypper install gcc gcc-c++ gcc-fortran

Then we need to get and extract the #R source code.

Get source code, extract and compile

wget http://cran.r-project.org/src/base/R-2/R-2.13.0.tar.gz

tar zxf R-2.13.0.tar.gz && cd R-2.13.0

./configure –enable-R-shlib –with-readline=no –with-x=no

make clean

make

make install

This step really takes a long time…so you better go doing something more productive in the meantime…

When #R is finally installed, we need to download and install the Rserve package.

Download Rserve

wget http://www.rforge.net/Rserve/snapshot/Rserve_0.6-5.tar.gz

Now, we have to log into R and do the installation…

Install and test Rserve

R

install.packages(“/PATH_TO_FILE/Rserve.tar.gz”, repos = NULL)

library(“Rserve”) #To test the installation. If there’s no output, then it’s working fine

q()

If you get an error regarding a personal library…just say “y”.

Once Rserve is install, we need to create a config file.

Rserv.conf

vi /etc/Rserv.conf

maxinbuf 10000000

Maxsendbuf 0

remote enable

#Press ESC key

:w

#Press ESC key

:q!

Now, we have to create a user that will run the Rserve so we can connect to it from SAP HANA.

Create new user with password

useradd -m login_name

passwd login_name

For some reason Amazon doesn’t provide the password for the root user…but we might need it eventually…so just do this…after all, if your user and you’re paying for it…

Change root password

sudo passwd root

#Assign a password

Great, we’re are now ready to start our server! (You need to be logged as the new user that we create in a previous step).

Start the Rserve server
R CMD Rserve –RS-port 6311 –no-save –RS-encoding “utf8”

Now…we’re ready to move to move to our SAP HANA server and keep configuring 😏

Configure SAP HANA

Right click on your system node at the navigator tab

Select Administration

Select on the right hand side the Configuration tab

Select the indexserver.ini

Select the calcengine

#Add the following parameters…

cer_timeout – 300

cer_rserve_addresses – Our R Amazon server:6311

cer_rserve_maxsendsize – 0

One more thing, and we’re ready to roll…go to your AWS Management Console, EC2 and then choose Security Groups. Our R server is going to be assign to “Quicklaunch-1”. Just select it and go to Inbound. And add the port “6311”.

Security_Group.png

That’s all folks…we’re officially ready to go!

On SAP HANA, create a table and call it TICKETS_BY_YEAR with the following structure:

HANA_R_Kiss_001.png

Open a SQL Editor and copy the following code…

Fill up TICKETS_BY_YEAR

insert into “SYSTEM”.”TICKETS_BY_YEAR” values(‘20110101’,4195);

insert into “SYSTEM”.”TICKETS_BY_YEAR” values(‘20110201’,4245);

insert into “SYSTEM”.”TICKETS_BY_YEAR” values(‘20110301’,4971);

insert into “SYSTEM”.”TICKETS_BY_YEAR” values(‘20110401’,4469);

insert into “SYSTEM”.”TICKETS_BY_YEAR” values(‘20110501’,4257);

insert into “SYSTEM”.”TICKETS_BY_YEAR” values(‘20110601’,4973);

insert into “SYSTEM”.”TICKETS_BY_YEAR” values(‘20110701’,4470);

insert into “SYSTEM”.”TICKETS_BY_YEAR” values(‘20110801’,4981);

insert into “SYSTEM”.”TICKETS_BY_YEAR” values(‘20110901’,4530);

insert into “SYSTEM”.”TICKETS_BY_YEAR” values(‘20111001’,4167);

insert into “SYSTEM”.”TICKETS_BY_YEAR” values(‘20111101’,4059);

insert into “SYSTEM”.”TICKETS_BY_YEAR” values(‘20111201’,1483);

This table is supposed to hold the tickets sales for a given company, during each month of the year 2011. What we want to do here is to determine or predict how are going to our sales on 2012. We’re going to use #R for that matter.

Create the following script and call it “Predict_Tickets”. This script will have two Stored Procedures, called Prediction_Model and Get_Tickets.

Predict_Tickets

CREATE TYPE T_PREDICTION_TICKETS AS TABLE (

PERIOD VARCHAR(8),

TICKETS INTEGER

);

DROP PROCEDURE Prediction_Model;

DROP PROCEDURE Get_Tickets;

CREATE PROCEDURE Prediction_Model(IN tickets_year TICKETS_BY_YEAR,OUT result T_PREDICTION_TICKETS)

LANGUAGE RLANG AS

BEGIN

period=as.integer(tickets_year$PERIOD)

tickets=as.integer(tickets_year$TICKETS)

var_year=as.integer(substr(period[1],1,4))

var_year=var_year+1

new_period=gsub(“^\\d{4}”,var_year,period)

next_year=data.frame(year=new_period)

prt.lm<-lm(tickets ~ period)

pred=round(predict(prt.lm,next_year,interval=”none”))

result<-data.frame(PERIOD=new_period,TICKETS=pred)

END;

CREATE PROCEDURE Get_Tickets()

LANGUAGE SQLSCRIPT AS

BEGIN

Tickets = SELECT * FROM TICKETS_BY_YEAR;

CALL Prediction_Model(:Tickets,T_PREDICTION_TICKETS);

INSERT INTO “TICKETS_BY_YEAR” SELECT * FROM :T_PREDICTION_TICKETS;

END;

CALL Get_Tickets();

SELECT * FROM “TICKETS_BY_YEAR”;

As you can see, our first Stored Procedure called Prediction_Model, we’re using RLANG as the script language…meaning that we’re going to embed R code that will go from our SAP HANA to our R Serve and back with the modified information.

Prediction_Model is calling the Stored Procedure Get_Tickets, which is doing a select from the table TICKETS_BY_YEAR and then calling Prediction_Model to finally insert the data back into SAP HANA.

At the end of our script, we call Get_Tickets and do a select to TICKETS_BY_YEAR to verify that our script worked.

HANA_R_Kiss_002.png

Success! Our SAP HANA and R integration work like a charm! We never left SAP HANA Studio, but our code went to the R Server and back to bring us the modified information…all in just 829 milliseconds…really fast considering that both servers are in the cloud…

That’s all for now…I will come back with more information on SAP HANA and R as soon as I can…there’s still a lot to discover and test 😎

To report this post you need to login first.

66 Comments

You must be Logged on to comment or reply to a post.

  1. tilak mishra

    Alvaro,

    Can you please tell me, whats the logic did you use for determining  or predicting the sales for  2012.

    Sorry, am asking this question because i couldn’t able to understand the R-script.

    regards,

    Tilak

    (0) 
  2. Alex Liu

    thanks for your knowledge sharing, i also did it again follow your guide…all steps are ok, excluded some linux compiler part..  thanks:)

    (0) 
    1. Alvaro Tejada Galindo Post author

      Bernard:

      Thanks for your comments and for pointing me out to the broken links…I have fixed them…however, for the last one I need to link it to my personal blog, as it seems to be stranded in the sands of SCN 🙁

      Greetings,

      Blag.

      (0) 
  3. Francois Role

    Hi Alvaro,

    My Rserve server is listening on port 6311 on an aws instance but Hana doesn’t seem to be able to connect to it.

    – I added the cer_* parameters under system. There are also a default and tenant column but I do not know how to set parmeters for them.

    – I was not sure how to assign to “Quicklaunch-1”. So I created a new security group called  “Quicklaunch-1” on the server hosting Rserve and added the port 6311. To be sure I added the same rule to the “default” security group.

    I also checked that my R instance was readily accessible using its elastic ip.

    Any hints?

    Looking forward to overcoming this last difficulty before using R and Hana in combination!

    François

    (0) 
    1. Alvaro Tejada Galindo Post author

      Francois:

      Have you started the RServe Server? R CMD Rserve –RS-port 6311 –no-save –RS-encoding “utf8”

      About the default and tenant columns…forget about them…they’re not needed…

      The security group should be fine…as long as it linked to the server…

      When you log into the R Server you should see that the RServe Server is up and running…

      Greetings,

      Blag.

      (0) 
      1. Francois Role

        Alvaro,

        Thanks for your early reply.

        Yes, I did use the command : R CMD Rserve –RS-port 6311 –no-save –RS-encoding “utf8”

        According to netstat Rserve was up and running, listening on port 6311,

        I used this port for defining the security group on the R instance on the one hand and for setting cer_rserve_addresses  – <ip elatisc address of R instance>:6311 on the Hana instance.

        Is it necessary to create a “Quicklaunch-1” security group. How can we be sure that our instance is assigned to this group? By default there seem to be three: a “deafult” and two “emr-xxx”.

        Thanks for any suggestions you may have,

        François

        PS And also a newby question about Hana : I wondered why the test tables were created in the SYSTEM schema. I have not yet looked into how to create a new schema.

        (0) 
        1. Alvaro Tejada Galindo Post author

          Francois:

          There’s no need to create “Quicklaunch-1″…that’s just the name I kept for my instance…you just need to modify the security group assigned to your R instance and make sure that the port 6311 is opened.

          What’s is the script your using to connect to R? (The one on the blog I assume…)

          Well…my assumption (as I didn’t create it) is that we used SYSTEM to have something more generic…the SAP HANA instance is an image so we needed to put the tables somewhere…and after all, we’re managers of our own SAP HANA Servers 😉

          Greetings,

          Blag.

          (0) 
          1. Francois Role

            Thanks Alvaro,

            It works! 🙂

            The problem had indeed to do with the security groups. To be sure I temporarily set the port 6311 for all of them (defaulyt may suffice ; I will try).

            I should learn more about security groups (how to assign an instance to a security ; knowing which security group is in force when several have been define, etc.)

            To be sure that my R server was accessible I used pyRserve, a nice Rserve client.

            Again thanks for your post which was very helpful,

            Francois

            Security groups automatically created by Amazon for my instance :

            default

            default group

            ElasticMapReduce-master

            Master group for Elastic MapReduce

            ElasticMapReduce-slave

            Slave group for Elastic MapReduce

            Are

            (0) 
            1. Alvaro Tejada Galindo Post author

              Francois:

              Awesome to know that you make it work! 😀 Happy SAP HANA-R programming 😉

              Yeah…Security Groups are not very obvious…I’m not totally sure about them myself LOL

              Greetings,

              Blag.

              (0) 
  4. Sorin Pascu

    Thanks Alvaro , it finally worked,

    my vi commands where

    vi /etc/Rserv.conf

    # press i  for insert

    maxinbuf 10000000

    Maxsendbuf 0

    remote enable

    #Press ESC key

    : x

    #Press enter

    Why does the Linux -/ R Server accept the request without authentication (since I don’t put in any user / password)?

    Is this done through  PUTTY pageant and the corresponding key? Probably not since they are local…

    (0) 
    1. Alvaro Tejada Galindo Post author

      Sorin!

      So glad that you make it work 🙂

      Regarding the R Server…as you’re using RServe as server, it will accept anonymus request as you’re not accessing the OS…and when you access the OS you’re passing a Pair Key, so no password is needed 🙂

      Greetings,

      Blag.

      (0) 
  5. Rahul Rajagopalan Nair

    Hi Blag,

    Is there a way to access Analytical views/Calculation views from R. Or is the access limited to tables only.

    Also if i have a table with data:

    Doc,Customer,Amount

    10001,Cust01,10

    10002,Cust01,20

    10003,Cust03,30

    and use the barplot function with cust & amount, I see three bars: Cust01-10, Cust01-20, Cust03-30.

    I was expecting Cust01-30, Cust03-30.

    Data does not get aggregated automatically in R.

    Have you faced similar issues?

    Thanks & Regards,

    Rahul Rajagopalan Nair

    (0) 
    1. Alvaro Tejada Galindo Post author

      Rahul:

      R is not actually accessing any table…we’re using a Store Procedure to pass the information to it…in this two lines…

      Tickets = SELECT * FROM TICKETS_BY_YEAR;

      CALL Prediction_Model(:Tickets,T_PREDICTION_TICKETS);

      Fist we do the select and the we pass it to the R procedure…

      To read from a Calc or Analytical view…you should do it like this…replacing the fist line for this one…

      Tickets = CE_CALC_VIEW(“_SYS_BIC”.”blag/CV_TICKETS”,[“PERIOD”,”TICKETS”];

      Were “blag” is the name of the package and “CV_TICKETS” is the name of the Calculation View…so in the end you will have…

      Tickets = CE_CALC_VIEW(“_SYS_BIC”.”blag/CV_TICKETS”,[“PERIOD”,”TICKETS”];

      CALL Prediction_Model(:Tickets,T_PREDICTION_TICKETS);

      Regarding your second question…barplot is a graphical function…it will do any aggregate by itself…and BTW…it’s not going to generate any graphic is you call it from SAP HANA, as the R server is a headless one…meaning…no graphics output…

      In pure R…you should do it like this…

      doc<-c(10001,10002,10003)

      customer<-c(“Cust01″,”Cust01″,”Cust03”)

      amount<-c(10,20,30)

      tab<-data.frame(doc,customer,amount)

      tab_sum<-aggregate(amount~customer,data=tab,FUN=sum)

      barplot(tab_sum$amount,names.arg=tab_sum$customer)

      Using the “aggregate” function to summarize the data you need.

      Greetings,

      Blag.

      (0) 
      1. Rahul Rajagopalan Nair

        Hi Blag,

        Thanks for the response. Let me repost this in your previous blog wherein you had initially accessed HANA tables in R.

        BTW your solution for using aggregate worked. thanks!.

        Best Regards,

        Rahul R Nair

        (0) 
  6. Rahul Rajagopalan Nair

    Hi Blag,

    I have heard about RHANA package. I tried to search it on CRAN & SMP. But not able to find it. Do you know if it has been released, if yes then where can i find it?

    Thanks & Regards,

    Rahul Rajagopalan Nair

    (0) 
    1. Alvaro Tejada Galindo Post author

      Hi Rahul:

      Well…if I’m not wrong…and I might be 😛 RHANA was developed internally and was never published outside SAP, I think it might have been discontinued…also, it was replaced by using the option detailed in my blog…SAP HANA with an R Server running RServe.

      Greetings,

      Blag.

      (0) 
      1. Rahul Rajagopalan Nair

        As always, thank you for your quick response Blag.

        I came across this document http://datatable.r-forge.r-project.org/randsaphana-dkom.pdf which stated that RHANA was a preferred option over RODBC. Too bad it didnt work. I totally understand that it would have taken a lot of effort to get a HANA specific R package ready and publish it on CRAN. Appreciate the team’s effort. 

        Yes, we are looking forward to test Rserve package. Waiting for the SPS5 upgrade on AWS.

        Thanks & Regards,

        Rahul Rajagopalan Nair

        (0) 
  7. Akshay Oak

    Hi Blag,

    First of all thanks for your post…. its really wonderful. And beauty is it worked…. 🙂

    Now, going step further I have a question. For some forecasting models, we need to use forecast packages in R… its only after installing right forecast package and then loading it during R program execution we can make use of certain R functions….

    A use case that I am trying involves such forecasting technique… with R I could make it work however, when I try to execute the same code I get errors ….

    Here are some insights into whats happening:

    1. If I dont load the forecast package using library(“forecast”)

    — System issues error saying forecast functions I am using are not available.

    2. If I use library(“”) – Just to isolate what problem might be…

    — System issues error saying Invalid package

    3. If I use correct package library(“forecast”)

    ” ….(at pos 88): GenericFailure exception: column store error: search table error:  [34084] Receive error: get result error.;An internal error in R client occured “

    Do you have any idea what might be wrong? or you know any elegant way of installing packages in R via SQL scripts?

    Thanks,

    AO.

    (0) 
    1. Alvaro Tejada Galindo Post author

      Akshay:

      Well…if you have installed the library on your R server…then it should work…and actually when you got an error on the SAP HANA Studio side, you can switch to your R server and see what’s the error message in the log.

      I have no tried the “forecast” library yet…but will do it tomorrow when I get back to Montreal…most surely together will be able to pinpoint any error and fix it 😉

      Greetings,

      Blag.

      (0) 
      1. Akshay Oak

        Hello Blag,

        Great to see your reply… 🙂

        Actually here is what I did,

        1. In R, I mostly get the required packages via CRAN mirror servers using neat UI and then load them in scripts using command –  library(“package_name”)

        – This is, of course not possible when doing it via SQLScript

        2. Packages that I have put are via ftp servers – this is definitely wrong way as I think it wont install all necessary dependent objects like it will when installation is called within script itself

        3. Also, maybe versions of package might be issue as well….?

        Lets try to knock this off….. thanks again.

        AO.

        (0) 
        1. Alvaro Tejada Galindo Post author

          Akshay:

          Well…what I did for my R Server, was to log into R and run this simple command…

          install.packages(“library_name”)

          Which is the same as using RStudio installation package option…that way…you guarantee that everything is going to get installed correctly…

          Greetings,

          Blag.

          (0) 
          1. Akshay Oak

            Blag,

            Thanks for the reply. With install.packages(“forecast”) – we get the error that forecast package is not available.

            Typical message is:

            Selection: 80

            Warning message:

            In getDependencies(pkgs, dependencies, available, lib) :

              package âforecastâ is not available (for R version 2.13.0)

            >

            The SUSE Linux has the R version i.e. 2.13.0

            The code that worked for me on my local R environment is on R version 2.15.2

            So I decided to go ahead with latest R on SUSE Linux on AWS. Looks like best way to appraoch is keeping both versions same, that way atleast debugging will be easier.

            Will install 2.15.2 on AWS and try again.

            Let me know you have other thoughts. Thanks!

            AO.

            (0) 
  8. Akshay Oak

    Hello Blag,

    So, made some progress on this topic:

    1. First, I got the same version as the one I am working on my local machine. So, even AWS has the same R version.

    2. On AWS  R server, I wrote R code and saw the output. Its coming up correct.

    3. Now, here is the tricky part: When I execute code via SQL script, I get internal server error. This error happens only when I try to loop back the result the way you have in your example and update the table in HANA.

    For example,

    I get two vectors in R, one vector for time buckets and other for forecasts (similar to Period and Tickets column ) in your example.

    Now, I bind them using cbind, so I get one matrix. Finally, I apply data.frame(matrix) to get the resultant data frame that has two columns, one for time periods and other for forecast.

    If I put something like print(dataframe) in R code of SQL script, I see output on Rserve. But, if I remove print command and expect to have feedback loop so that I can insert into table. I get internal R error….. so, maybe I am doing something wrong such as having wrong type or assigning data frame correctly etc.

    Any idea on what might go wrong if I intend to receive dataframe from R in SQL script? Thanks!

    AO.

    (0) 
    1. Alvaro Tejada Galindo Post author

      Akshay:

      Ok…let’s try this…as you can see in my example, I’m doing this…

      result<-data.frame(PERIOD=new_period,TICKETS=pred)

      So I’m actually assigning the names of the column from the table into the data.frame…maybe you’re only doing this…

      apply data.frame(matrix)

      When you should do something like this…

      a<-c(1,2,3)

      b<-c(4,5,6)

      ab<-cbind(a,b)

      abc<-data.frame(COLUMN1=ab[a],COLUMN2=ab[b])

      This way, your data.frame will have the table’s column names…

      Greetings,

      Blag.

      (0) 
      1. Akshay Oak

        Perfect Blag….. Its going through…. 🙂

        Now, I am all set to try some advanced use-cases in this area. Will buzz here just in case I get lost 😉

        Thanks!

        AO.

        (0) 
  9. Ronja Queck

    Hello Blag

    First of all I have to compliment you on your blogs. They are all really interesting and easy to understand!

    In your example, you are using R for calculations, but I want to do data mining with it. For this, R offers many different libraries. But my problem is: How can I return the knowledge which I have mined back into HANA? As you already mentioned, the R server is headless without graphics output. The only method I came up with is to output the results into files, but these do not look very nice and cannot be used for future data minings. Do you know of any example for data mining with HANA? Maybe even without R?

    (0) 
    1. Alvaro Tejada Galindo Post author

      Ronja:

      Well…if you read my blog http://scn.sap.com/community/developer-center/hana/blog/2012/06/25/when-sap-hana-met-r–bring-home-your-graphics you will see that there’s an option…create the graphics using the Cairo Library and send them by email…also, you can save the information on SAP HANA and the exported to a CSV file and explore it on SAP Visual Intelligence like I say on my blog http://scn.sap.com/community/developer-center/hana/blog/2013/02/18/when-sap-hana-met-r–whats-new

      I’m not aware of any other options…

      Greetings,

      Blag.

      (0) 
      1. Ronja Queck

        Hey Blag,

        thank you for your fast reply! I actually did look through the “Content” of your profile but apparently I didn’t get this far 🙂

        The first link is actually what I was looking for… Now I’ll only have to figure out how to email 🙂 Thank you so much!!

        (0) 
        1. Alvaro Tejada Galindo Post author

          How to sent the email is actually on the blog 😉

          command<-“uuencode Tickets.png Tickets.png | mail -s ‘Tickets December Report’ atejada@gmail.com

          system(command,intern=TRUE)

          It’s part of the code…basically, you use the uuencode command to send the email from the Linux box 🙂

          Greetings,

          Blag.

          (0) 
          1. Vjola Dule

            Hi Alvaro,

            i find your blog really good 😉

            I am abou to install a Linux Host for R but i dont’t know exactly the recommended values of the Linux Host

            We are going to install Linux Host in a VM and at the moment we have low capacity.

            I need to know th following parameters for the Linux Host:

            1. how much memory do i need for R and RServer (min) ?

            2. how much disk?

            3. how many Processors ?

            Thank you in advance

            Regards,

            Vjola

            (0) 
            1. Alvaro Tejada Galindo Post author

              Vjola:

              To be honest…I don’t know…R can be installed on any machine…of course, the amount of memory and processors will affect the performance but will not prevent you from running it…in terms of space…the installation takes from 75 to 100 Mb…so I don’t think that’s a real issue…

              While you install R and Rserve on a server with nothing else installed on it…you should be fine…

              Greetings,

              Blag.

              (0) 
  10. Micha Meier

    Hi Alvaro,

    I’ve tried your code, however when I try to define the procedure Get_Tickets(), it says

    * 257: sql syntax error: line 1 col 87 (at pos 87) SQLSTATE: HY000

    hdbsql HDB=> > * 7: feature not supported: Parameterized input table parameter is not allowed: line 1 col 23 (at pos 22) SQLSTATE: HY000

    hdbsql HDB=> > * 1309: invalid use of table variable: line 1 col 45 (at pos 44) SQLSTATE: HY000

    hdbsql HDB=> > END;

    * 257: sql syntax error: incorrect syntax near “END”: line 1 col 1 (at pos 1) SQLSTATE: HY000

    Can you see what is wrong?

    Cheers,

    — Micha

    (0) 
    1. Alvaro Tejada Galindo Post author

      Hello Micha:

      Well…this blog is almost from a year ago…so maybe something has changed in the newest SAP HANA Studio revision…will try to find some time to try to reproduce it….

      Or maybe it’s because you don’t have the R Server well configured or Rserve is not running correctly…have you follow all the steps? SAP HANA and R should installed on different machines…

      Greetings,

      Blag.

      (0) 
      1. Micha Meier

        Ok, I figured it out. Entering your code into Hana Studio works, entering it into hdbsql gives this error. Note that the error is not in the R procedure but in the SQLScript. Looks like a bug in dbsql to me…

        — Micha

        (0) 
        1. Alvaro Tejada Galindo Post author

          Micha:

          I haven’t use hdbsql…but my assumption is that the Studio holds the connection information to the R server…while the hdbsql doesn’t…there might be some way to make it work…but that might take some time to investigate…

          Greetings,

          Blag.

          (0) 
  11. Henrique Pinto

    Hi Blag,

    did you ever got it to work with SLES 64bit?
    Or at least found out the root cause of the errors you were getting with 64bit?

    Thanks,

    Henrique.

    (0) 
    1. Alvaro Tejada Galindo Post author

      Henrique:

      Not really…once I had it working on 32bit…I just forgot about the 64bit…will try to find some time next time to do some testing… 😉

      Greetings,

      Blag.

      (0) 
  12. Henrique Pinto

    Hi Blag,

    with your and Juergen‘s invaluable help, I was able to install Rserve 1.7-0 and R 2.15.3 with success. 🙂 At least the “hello world” script above executed successfully. 😀

    Now I’ll make some more tests with Predictive Analysis calling HANA R procedures.

    Thank you very much!

    Cheers,

    Henrique.

    (0) 
  13. Vjola Dule

    Hallo Alvaro,

    I have installed a trial 60 Day version of Linus SUSE 11 SP2, but the gcc-fortran is not in the Server installed.

    Where can I find a installation of FORTRAN

    Regards,

    Vjola

    (0) 
  14. B.-W. L.

    When I ran the following SQL Statements on SAP HANA Studio, I got the error message: Could not execute ‘CALL R_test(“xytest”) WITH OVERVIEW’ in 42 ms 418 µs.

    SAP DBTech JDBC: [2048]: column store error: search table error:  [34081] Connection error: Unable to connect with Rserve.;None of the Rserve can be connected: <host>:<port>:crypted: Authentication with Rserve failed;

    although the host (Elastic IP) and port number (in Security Group) are correct, R and Rserve are running when checked under PuTTy. If the last syntax “CALL R_test(“xytest”) WITH OVERVIEW;” is not included, no error message.

    I am not sure how to “Create new user with password” and “change root password” (http://scn.sap.com/community/developer-center/hana/blog/2012/05/21/when-sap-hana-met-r–first-kiss,

    * in

    useradd -m login_name

    passwd login_name

    what is the login_name?

    * in

    sudo passwd root

    #Assign a password

    what is the purpose to assign a password to root?

    * How can I make my SAP HANA Studio to recognize these passwords?)

    Your suggestions on how to solve the issue is appreciated!

    SET SCHEMA R;

    DROP TABLE “xytest”;

    CREATE COLUMN TABLE “xytest”(“xx” INTEGER, “yy” INTEGER);

    DROP PROCEDURE R_test;

    CREATE PROCEDURE R_test(OUT xytest “xytest”)

    LANGUAGE RLANG AS

    BEGIN

      xytest = as.data.frame(cbind(xx=11:20,yy=21:30));

      source(“/tmp/R/test.r”);

    END;

    CALL R_test(“xytest”) WITH OVERVIEW;

    (0) 
    1. Alvaro Tejada Galindo Post author

      I don’t think you can the R procedure “WITH OVERVIEW”…but you should be getting a different error message…

      Are you under a firewall that might be preventing SAP HANA to contact the R Server? Also…you cannot use the root user to call the RServe process…you need a new user…

      useradd -m login_name

      passwd login_name

      what is the login_name?

      login_name is simply a user name defined by you…it can be…

      useradd -m beiwei

      sudo passwd root

      #Assign a password

      what is the purpose to assign a password to root?

      The purpose is to be able to do some configurations that might need root authorization…let’s say you log in with your R user…without the root password there aren’t so many things you can change…

      * How can I make my SAP HANA Studio to recognize these passwords?)

      Well…SAP HANA Studio got nothing do to with this…you start your R Server and then SAP HANA Studio access it…as simple as that…

      In your code…why are you calling the source code of test.r? I don’t really see the point…also as you can see in my example…you need to insert the data into the table as simply assigning it doesn’t populate it…

      Greetings,

      Blag.

      Developer Experience.

      (0) 
  15. Vincenzo Barbaro

    Hi Alvaro!

    First of all I want to thank you for sharing your knowledge with us.

    I would like to ask you if you think that the Rserver would run also on a “simple” SUSE server.

    Thanks in advance for your answer.

    Kind regards,

    Vincenzo

    (0) 
    1. Alvaro Tejada Galindo Post author

      Vincenzo:

      Well…of course I haven’t tried it…but I don’t see why not…as far as is a server…it should be fine 🙂 SAP HANA will simply call it and maybe it will be a little bit slow…but should work…

      Greetings,

      Blag.

      Developer Empowerment and Culture.

      (0) 
  16. Ronja Queck

    Hi Blag, sorry to get back after such a long time with a (hopefully simple) question: Since R is keeping all of its data inside RAM, all analyses are limited by the RAM available on the Linux machine, aren’t they? Or is it possible to send chunks of data from SAP HANA in order to overcome RAM limitations on the Linux machine?

    (0) 
    1. Alvaro Tejada Galindo Post author

      Ronja:

      R doesn’t keep the data in RAM…SAP HANA does 😉 SAP HANA sends the data to R and since you’re using a server to run R, then you don’t need to worry to much about RAM…I haven’t tried with a really heavy load yet…but all my tests were successful and R responded very well as expected 🙂

      Greetings,

      Blag.

      Development Culture.

      (0) 
      1. Ronja Queck

        Hey Blag, R is also keeping all of its data in RAM (except if you are using special packages like ff). I am doing regression analyses which need a lot of data, but I guess I will have to use workarounds

        (0) 
        1. Alvaro Tejada Galindo Post author

          Ronja:

          Yes…you’re correct…my mistake 🙁 I need to stop assuming that only SAP HANA work in-memory 🙂 For your problem…maybe some Big Data packages or adding Hadoop to the equation might work…

          Greetings,

          Blag.

          Development Culture.

          (0) 
  17. Steve Blum

    Hi Blag,

    great blog, i like how you describe every step very detailed.

    I recently installed R myself and therefore I noticed something important about your Rserve configuration file (Rserv.conf):

    maxinbuf 10000000

    What this actually means is described here: Rserve – Binary R server – RForge.net

    The maxinbuf specifies (in kilobytes) the maximal allowable size of the input buffer, that is the maximal size of data transported from the client to the server.”

    So for your server, it is about 1 GB. You probably chose this value, because it’s mentioned in the SAP HANA R Integration Guide. But there is an explanation below in this document:

    “The value 10000000 is merely an example. We recommend that you set the value of maxinbuf to (physical memory size, in bytes) / 2048. For example if you installed R on a host with 256 GB of physical memory you should set maxinbuf to 134217728.”

    I thought, it might help 🙂

    Greetings,

    Steve

    (0) 
    1. Alvaro Tejada Galindo Post author

      Steve:

      That’s right 🙂 I didn’t even bother to change the maxinbuf value…I of course read the HANA R Integration Guide and leave the value like that…thanks a lot for taking the time to explain this…really appreciated 😉

      Greetings,

      Blag.

      Development Culture.

      (0) 
  18. Matthias S

    Great article,

    I heard somewhere that R can be triggered several times parallel from HANA.

    How I can do this. Must I change my Procedure somehow for this?

    (0) 
    1. Lars Breddemann

      Hi Matthias

      the connection to RSERVE is triggered once per R-POP (plan operator) in a calc scenario.

      You can think of one HANA R procedure as a single POP in this case.

      So, depending on your overall scenario, these POPs may be called in parallel.

      Each of these connections to RSERVE will spawn a separate independent R session.

      All this happens automatically and no special coding is required here.

      For the parallel execution within R there are literally tons of approaches, the parallel package mentioned by Blag is one of them.

      (0) 
  19. Ray Greenwood

    This blog is getting cross referenced so much it should serve as a central repository for R+HANA discussions! 🙂

    Has anyone tried a flavour of Linux other than SUSE for hosting R/RServ? Even if it isn’t supported, I am wondering if something like Ubuntu might still work well enough for some experimentation, or whether this just would not work?

    Regards

    Ray

    (0) 
    1. Alvaro Tejada Galindo Post author

      Hello Ray:

      Thanks for your kind words 🙂

      To be honest…I haven’t try it 🙁 I guess it should work fine…as long as it’s a server…let’s see if anyone else has done that already…

      Greetings,

      Blag.

      Development Culture.

      (0) 
  20. Sameer Ballani

    Hi Blag,

    I am trying to connect HANA server with Revolution R-serve (windows based). I am getting error –

    Could not execute ‘call “GET_TICKETS”()’ in 20.415 seconds .

    SAP DBTech JDBC: [2048]: column store error:  [2048] “GET_TICKETS”: line 5 col 1 (at pos 104): [2048] (range 3): column store error: search table error:  [34081] Connection error: Unable to connect with Rserve.;None of the Rserve can be connected: 172.21.99.48:7400: The connection closed unexpected; 

    This procedure works fine when I am connected to a different R-serve, so the issue is related to R-HANA integration. I am trying to find root cause and below are my findings –

    1) Revolution R and R-serve are installed on a separate Windows Based server. The guide says “This guide assumes an R installation on a Linux system, preferably SLES Linux; no other R hosting environments are currently supported.”

    Are we sure that windows based R environment is not compatible for HANA-R integration?

    Is this mentioned in PAM for HANA? If yes, can you guide me to the link.

    2) I don’t think firewall or port opening or security group is an issue. I am able to Telnet. Can you guide me on making an entry for the host name at remote server level as mentioned @ Not able to connect to RServe from Hana Studio

    3) I am checking the user running R-serve process on the server. My hunch is it is currently being run using default user. I will try to run through a different user. But how does it make a difference?

    Looking forward to your inputs. Thanks.

    Sameer

    (0) 
  21. imane black

    Hello,

    Do you know what interface is allowing the communication R and SAP HANA DB ? is it JDBC or RFC or something else ?

    Best regards,

    Imane

    (0) 

Leave a Reply