Technology Blogs by SAP
Former Member

As some of you may have seen, I previously wrote a blog post on setting up PAL with HANA for use with SAP Predictive Analysis. You can find the post HERE.

However, there may come a point where you'd like access to more algorithms. This is possible with R, which provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, ...) and graphical techniques, and is highly extensible. By using R, you gain access to many algorithms that don't currently exist in PAL.

What follows are instructions on how to install R along with Rserve on SuSE Linux, configure it to talk with HANA and then access this environment from SAP Predictive Analysis. Please note that R should NOT be installed on the same host that houses HANA. Rather it should be installed on another host. In the rest of this guide, I'll refer to the SuSE Linux server that has HANA installed as the HANA server and the server onto which we are installing R as the R server.

Install the R pre-requisites:

Start by logging in to the R server as root.

Install gcc-fortran using the command below which uses zypper:

zypper in gcc-fortran

Next install xorg-x11-devel:

zypper in xorg-x11-devel

If you are using SLES 11 SP2, you also need to install libgfortran46. I am not, so I haven't tested this, but the command should simply be zypper in libgfortran46
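The prerequisite installs above can be collapsed into a single zypper call. The following is only a sketch: the helper name r_prereq_packages and the "11.2" version string used to denote SLES 11 SP2 are assumptions made for illustration, not something taken from SAP or SuSE documentation.

```shell
# Sketch: list the zypper packages this guide installs before building R.
# r_prereq_packages and the "11.2" argument format are hypothetical.
r_prereq_packages() {
    sles_version="$1"                  # e.g. "11.2" meaning SLES 11 SP2 (assumed format)
    pkgs="gcc-fortran xorg-x11-devel"  # always needed, per the steps above
    if [ "$sles_version" = "11.2" ]; then
        pkgs="$pkgs libgfortran46"     # extra package needed only on SLES 11 SP2
    fi
    echo "$pkgs"
}

# Usage, as root on the R server (not run here):
#   zypper in $(r_prereq_packages 11.2)
```

Keeping the package list in a function makes it easy to review what will be installed before handing it to zypper.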

Download and install R:

On the same R server, navigate to a temporary directory where you can download and extract the R source (create a new directory if you like).

Next, download R

wget http://cran.r-project.org/src/base/R-2/R-2.15.3.tar.gz

Now extract the files

gunzip R-2.15.3.tar.gz

tar -xvf R-2.15.3.tar

Then cd into the directory

cd R-2.15.3/


Now build and install R. 

./configure --enable-R-shlib

NOTE: If the previous command fails with an error, run the following instead:

./configure --enable-R-shlib --with-readline=no --with-x=no

make clean

make

make install
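If you script the build, the fallback described in the NOTE can be expressed as a retry. This is a minimal sketch: configure_r and the CONFIGURE variable are hypothetical names, and CONFIGURE exists only so the logic can be dry-run against a stand-in script.

```shell
# Sketch: run configure, retrying with reduced options if the first attempt fails.
# configure_r and CONFIGURE are illustrative names, not part of the R build system.
configure_r() {
    cfg="${CONFIGURE:-./configure}"
    # First attempt: the plain shared-library build used in this guide.
    "$cfg" --enable-R-shlib && return 0
    echo "configure failed; retrying without readline and X11" >&2
    # Second attempt: the fallback flags from the NOTE above.
    "$cfg" --enable-R-shlib --with-readline=no --with-x=no
}

# Usage, from inside the extracted R-2.15.3/ directory (not run here):
#   configure_r && make clean && make && make install
```

The function returns the exit status of whichever configure call ran last, so the build stops if both attempts fail.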

Confirm R installed successfully:

Now let's confirm R did indeed get installed to /usr/local/bin

ls /usr/local/bin |grep R

Download and install Rserve:

First go up one directory

cd ..

We'll now download Rserve

wget http://cran.r-project.org/src/contrib/Rserve_1.7-3.tar.gz

Now launch R

R

Now install Rserve (you will likely need to change the path/filename below from /root/Documents/temp/Rserve_1.7-3.tar.gz to the location and filename where you saved Rserve in the most recent wget step).

install.packages("/root/Documents/temp/Rserve_1.7-3.tar.gz", repos = NULL)

If everything installed correctly, you'll get no output when you run the following command:

library("Rserve")

To quit R, simply run the following:

q() 


Create an unprivileged user:

We'll now create a user under which we'll run Rserve

yast

Select Security and Users | User and Group Management and then hit enter.

  Select Add by tabbing to it and hit Enter.

Create a user and set a password. This user will be used to run Rserve. I called mine ruser.


Because this user will start Rserve, you may want to create a dedicated group for it or add it to an existing group that makes sense security-wise. For my basic test system, I simply made it a member of the group users.

Next we'll create the Rserv.conf file. You can use any text editor; I personally use vi.

vi /etc/Rserv.conf

Add values to the file (you can edit the file in a different editor / manner if it's easier for you). These numbers should be specific to your environment so I'm including information from the installation guide below (please review the installation guide for more information on these parameters):

  1. The value 10000000 is merely an example. We recommend that you set the value of maxinbuf to (physical memory size, in bytes) / 2048. For example if you installed R on a host with 256 GB of physical memory you should set maxinbuf to 134217728.
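The sizing rule quoted above is simple arithmetic and can be checked from the shell. The sketch below uses two hypothetical helper names; the second assumes a Linux /proc/meminfo, which reports MemTotal in kB, so dividing bytes by 2048 is the same as dividing that kB figure by 2.

```shell
# Sketch: maxinbuf = physical memory in bytes / 2048, per the rule quoted above.
# Both function names are illustrative.
maxinbuf_from_bytes() {
    echo $(( $1 / 2048 ))
}

# /proc/meminfo reports MemTotal in kB, so bytes / 2048 equals kB / 2.
# An alternative file can be passed as $1 for testing.
maxinbuf_from_meminfo() {
    awk '/^MemTotal:/ { printf "%d\n", $2 / 2 }' "${1:-/proc/meminfo}"
}

# The 256 GB example from the guide: 274877906944 bytes.
maxinbuf_from_bytes 274877906944   # prints 134217728
```

On the R server itself, running maxinbuf_from_meminfo with no argument gives a value suited to that host's installed memory.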

These are the values I wrote to /etc/Rserv.conf on my test system:

maxinbuf 10000000

maxsendbuf 0

remote enable

Finally, save the file.
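If you would rather not edit the file by hand, the same three lines can be written from the shell. This is a sketch only: write_rserv_conf is a hypothetical helper, and it overwrites whatever is already at the target path.

```shell
# Sketch: write the three settings used in this guide to an Rserv.conf file.
# write_rserv_conf is an illustrative name; $1 is the target path, $2 the maxinbuf value.
write_rserv_conf() {
    conf_path="$1"
    maxinbuf="$2"
    # Overwrites the target file with the settings shown above.
    cat > "$conf_path" <<EOF
maxinbuf $maxinbuf
maxsendbuf 0
remote enable
EOF
}

# Usage, as root on the R server (not run here):
#   write_rserv_conf /etc/Rserv.conf 10000000
```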

Start Rserve:

Next, switch to the user we created earlier:

su - ruser


Then start Rserve. The syntax is:

R CMD Rserve --RS-port <PORT> --no-save --RS-encoding utf8

You'll need to choose a free port. In my case I chose port 7400. So the command I use is:

R CMD Rserve --RS-port 7400 --no-save --RS-encoding utf8
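To confirm that a port is free before starting Rserve, or that Rserve is listening afterwards, you can filter the output of netstat. The sketch below assumes netstat (from net-tools) is available on the R server and that its fourth column is the local address; port_listening is a hypothetical helper that reads netstat output on standard input.

```shell
# Sketch: succeed if any line of `netstat -ltn` output shows the given local port.
# port_listening is an illustrative name; it only parses text and opens no sockets.
port_listening() {
    awk -v p=":$1" '$4 ~ (p "$") { found = 1 } END { exit !found }'
}

# Usage on the R server (not run here):
#   netstat -ltn | port_listening 7400 && echo "port 7400 is in use"
```

Before starting Rserve the check should fail for your chosen port, and after starting it the check should succeed.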

HANA configuration:

We'll now need to configure a few things from inside HANA Studio.

Launch HANA Studio and then right click on your system and select Administration (you'll likely need to use user SYSTEM for these steps or at a minimum, a very privileged user)

  Select the Configuration tab

Now you'll enable the calling of R procedures by HANA. If you want to call them from the index server, edit indexserver.ini in the steps below. For the ability to call them from SAP HANA XS, edit the xsengine.ini file following the same steps as shown below.

Select indexserver.ini | calcengine. If calcengine doesn't exist, create it.

  Right click on calcengine and select Add Parameter.

Add a parameter called cer_rserve_addresses. For the Value, enter the hostname or IP address of your R server followed by a : and then the port number Rserve is listening on (7400 in my example). To specify multiple hosts, separate them with commas.

Add another parameter called cer_timeout. This is the connection timeout in seconds, which also caps how long a single R procedure may run. I've set mine to 300.

Add a final parameter called cer_rserve_maxsendsize. This is the maximum size, in kilobytes, of a result returned from R to SAP HANA. I set mine to 0, which means unlimited.
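For reference, the three parameters added above end up as one section of indexserver.ini. The sketch below prints roughly what that section might look like; print_calcengine_section is a hypothetical helper, rserver:7400 is this guide's example address, and the exact layout HANA writes is an assumption, so treat the output as a reading aid rather than a file to copy into place.

```shell
# Sketch: print the calcengine section as it might appear in indexserver.ini.
# print_calcengine_section is an illustrative name; the ini layout is assumed.
print_calcengine_section() {
    addresses="$1"   # host:port, or several separated by commas
    timeout="$2"     # seconds
    maxsend="$3"     # kilobytes, 0 = unlimited
    printf '[calcengine]\n'
    printf 'cer_rserve_addresses = %s\n' "$addresses"
    printf 'cer_timeout = %s\n' "$timeout"
    printf 'cer_rserve_maxsendsize = %s\n' "$maxsend"
}

print_calcengine_section "rserver:7400" 300 0
```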


SAP Predictive Analysis:

Now launch SAP Predictive Analysis

Select New Document and choose the SAP HANA Online option

Then connect to your HANA server and select a dataset.

Once you get to the prepare screen, select the predict tab and double click on HANA R-Apriori. The HANA R prefix designates that this is using R rather than PAL.

Configure the HANA R-Apriori Algorithm. We aren't concerned here with creating a valid model, merely with testing that this is working (since you won't have the same dataset I have). So just pick a value from your dataset and change the support to 0.001.

Run it!

The following error is given:

This error is expected because we haven't installed the arules package yet. To install it, switch back to root on your R server and navigate to a directory where you'd like to save your downloaded packages (you may use the same directory we used previously).

rserver:~ # R

R version 2.15.3 (2013-03-01) -- "Security Blanket"

Copyright (C) 2013 The R Foundation for Statistical Computing

ISBN 3-900051-07-0

Platform: x86_64-unknown-linux-gnu (64-bit)

Now run our test command, library("Package_Name"), to confirm arules is not there:

> library("arules")

Error in library("arules") : there is no package called ‘arules’

Quit R

> q()

Now go to the CRAN mirrors page and choose one near you http://cran.r-project.org/mirrors.html.

Once there go to Contributed extension packages.

Next click on Table of available packages, sorted by name.

Then search the page for the package you need. In this case we know we are looking for arules, since the error mentioned this package.

Find arules and download it (replace the URL below with the one from your mirror).

wget http://cran.stat.sfu.ca/src/contrib/arules_1.0-15.tar.gz

Finally, install the package, confirm it is there and then exit with q().

# R

> install.packages("/root/Documents/temp/arules_1.0-15.tar.gz", repos = NULL)

* installing *source* package ‘arules’ ...

** testing if installed package can be loaded

* DONE (arules)

Now let's test the package again:

> library("arules")

Loading required package: Matrix

Loading required package: lattice

Attaching package: ‘arules’

The following object(s) are masked from ‘package:base’:

    %in%, write

> q()

Save workspace image? [y/n/c]: n

Close the error and re-run the model in SAP Predictive Analysis now that we have installed the package.

We get a new error:

However, we now know how to fix it: download the pmml package from a CRAN mirror and install it.

wget http://cran.stat.sfu.ca/src/contrib/pmml_1.3.tar.gz

# R

> install.packages("/root/Documents/temp/pmml_1.3.tar.gz", repos = NULL)

ERROR: dependency ‘XML’ is not available for package ‘pmml’

So we exit R, then download and install XML:

> q()

Save workspace image? [y/n/c]: n

wget http://cran.stat.sfu.ca/src/contrib/XML_3.98-1.1.tar.gz

# R

> install.packages("/root/Documents/temp/XML_3.98-1.1.tar.gz", repos = NULL)

At this point on my system, I get an error:

checking for xml2-config... no

Cannot find xml2-config

ERROR: configuration failed for package ‘XML’

You will see this error if libxml2-devel isn't installed on your SuSE Linux host.

If you got the error, exit R and install libxml2-devel using the steps below:

> q()

Save workspace image? [y/n/c]: n

# zypper in libxml2-devel

Now let's try to install XML again.

# R

> install.packages("/root/Documents/temp/XML_3.98-1.1.tar.gz", repos = NULL)

Next, install pmml again:

> install.packages("/root/Documents/temp/pmml_1.3.tar.gz", repos = NULL)

Now go back to SAP Predictive Analysis and try to run the model again:

We have successfully configured R and used it with HANA and SAP Predictive Analysis.
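For repeat setups, the package installs performed interactively above can also be run from the shell with R CMD INSTALL, which accepts a source tarball. The following is a sketch: install_r_tarballs and the R_BIN variable are hypothetical names (R_BIN exists only so the loop can be dry-run), and the tarballs must be listed in dependency order, for example XML before pmml.

```shell
# Sketch: install source tarballs one at a time, stopping at the first failure.
# install_r_tarballs and R_BIN are illustrative names; R_BIN defaults to the real R binary.
install_r_tarballs() {
    for tarball in "$@"; do
        "${R_BIN:-R}" CMD INSTALL "$tarball" || return 1
    done
}

# Usage, as root in the download directory on the R server (not run here):
#   install_r_tarballs arules_1.0-15.tar.gz XML_3.98-1.1.tar.gz pmml_1.3.tar.gz
```

Stopping at the first failure mirrors what happened interactively: a missing dependency such as libxml2-devel has to be fixed before the remaining packages can install.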

