Using SAP BusinessObjects Predictive Analytics with SAP HANA, express edition
This hands-on guide explains how SAP BusinessObjects Predictive Analytics can be configured to use SAP HANA, express edition (SAP HXE) as its database.
An end-to-end scenario is configured for the trial version of SAP BusinessObjects Predictive Analytics (SAP PA) – thus allowing a full end-to-end trial experience from source data in HANA to running model in SAP PA, all running on your local machine. Where required, special information is given in case you have a valid license of SAP PA 3.0 or PA 3.1 and want to run it against SAP HXE (e.g. as a complete, self-contained demo environment on your PC).
The guide uses material & data described in other guides (e.g. the SAP HXE installation guide or Hands-On Tutorial SAP Predictive Analytics, Automated Mode: Time Series Forecasting) and just fills in the glue pieces where required. This avoids duplication of information and reduces the risk that this guide soon becomes out of date.
Unlike the SAP HANA Academy videos on predictive on SAP HANA express, this guide is about using the tool SAP BusinessObjects Predictive Analytics with HXE rather than coding your own SQLScript procedures on top of the Predictive Analysis Library (PAL).
Table of Contents
- Installing SAP HANA, express edition (SAP HXE)
- Installing SAP HANA Client and configuring an ODBC connection
- Installing SAP BusinessObjects Predictive Analytics
- Loading data into SAP HANA, express edition
- Creating the model and running the forecast
In September 2016, SAP announced the arrival of its latest brainchild in the SAP HANA family: SAP HANA, express edition (SAP HXE). SAP HANA, express edition is a streamlined version of SAP HANA that can run on laptops and other resource-constrained hosts, such as a cloud-hosted virtual machine, free of charge for up to 32 GB of memory use.
With SAP HXE thus being available on your local machine, it is tempting to set up a full end-to-end scenario together with SAP BusinessObjects Predictive Analytics, 30-day trial edition. This combination allows business analysts, data scientists and developers to play around at no cost with a complete, running scenario. Even if you have a valid license for SAP BusinessObjects Predictive Analytics (SAP PA) and/or SAP HANA, it still makes sense to set up such a scenario, since it gives you a fully self-contained environment that runs end-to-end on your PC. So even if you find yourself offline due to travel or offsite meetings, your predictive environment will always work like a charm.
In this tutorial, we’ll
- start from a naked machine and install SAP HXE and SAP PA, trial edition.
- Next, we’ll perform all necessary configuration steps to glue the two together.
- We’ll then enable a scenario that has been described in another tutorial, but instead of loading a CSV from disk, we import it as a table into SAP HXE and consume the table from SAP PA to generate a model and forecast values.
Several options exist for many steps of the process: HXE as binary installer or as virtual machine, HXE deployment to your local PC or to the cloud, SAP PA as trial license or full-fledged license, SAP PA automated and expert modes and so on. Where possible, I will cover the additional configuration steps, so that you can really set it all up to suit your needs.
This time, I’ll concentrate on the automated mode in SAP PA only, but I am planning to extend the guide and demonstrate usage of the expert mode as well. (Note: For expert mode, you’ll need to enable the script server as demonstrated in this video. I did not manage to do so, though, seemingly due to insufficient RAM on my machine, but I am still investigating.)
Many thanks to Andreas Forster for his great SAP PA tutorial on which I am building this guide. Thanks also to the colleagues from SAP HANA Academy for dedicating a YouTube channel of its own to SAP HXE! It makes your first steps with SAP HXE really simple.
Hope it helps, Jan
Installing SAP HANA, express edition (SAP HXE)
Overview on installation options
In order to install SAP HANA, express edition, you simply download it from its home page. You have the choice of two installers: HXE comes as a binary installer or as a ready-to-use virtual machine image. For all of us using Microsoft Windows, the choice is simple: just download the virtual machine image, drop it on your virtual machine player (VMWare Player, VMWare Workstation or Oracle VirtualBox) and you are done. If you are on Linux, you might be tempted to consider the binary installer, but be aware that you should have SUSE Linux for SAP as OS, not plain-vanilla SUSE Linux. I did not explore the options for migrating from one to the other, so I am not sure if it’s even possible. Obviously, you can also use a virtual machine player on your Linux PC, regardless of which Linux distribution you are using.
On the other hand, you have the choice of installing either version of SAP HXE on your PC or in the cloud. When installing on your local PC you are truly self-sufficient, regardless of network. In the cloud, others can reuse your SAP HXE installation, which might be nice as well.
I explored installing the VM version in the cloud, but had to give up after several failed attempts. Not only do you have to enable hardware virtualization on your host OS in the cloud (something that is often switched off by default, so ask your cloud provider to switch it on or, more generally, whether virtual images are supported within your cloud environment of choice), but technicalities like network bridging from host OS to guest image also need to be enabled. In my case, I ended up with a successful install of the VM image, but it could not establish a network connection to its host OS or beyond (e.g. to HANA Studio or an ODBC driver), rendering the whole image useless since it could only talk to itself.
I did successfully install the binary version though: I simply ordered a naked SUSE Linux for SAP image (check with your cloud provider, if they offer it) and ran the binary installer as described in the SAP HXE installation document. It all went smoothly – at least once I had enough HDD space ordered and had brushed up my Linux skills a bit.
So, if you want my advice, here it is:
- For an installation on your local PC, go for the VM image. It’s an easy install, quickly done and you are ready to run in no time. The few steps you might need to consider beyond the SAP HXE installation documentation are listed below.
- For an installation in the cloud, do double-check whether your cloud provider supports virtual machines on top of cloud images. If not, a binary installation is also quickly done, since the SAP HXE installation documentation is very explicit and mostly runs through like a charm, even for those inexperienced with Linux. But it would definitely be good to have someone at hand with medium-level Linux skills to help you out if you ever encounter roadblocks.
Installation process in detail (SAP HXE VM version only)
The download process is very well described. Additionally, SAP HANA Academy’s playlist for SAP HXE covers it all in rich detail. With the download comes a detailed, step-by-step installation PDF that really worked for me and left little to be desired. Just some minor caveats (since I ran into those):
- Do check the hardware & software requirements carefully.
I, for my part, was on an older JVM version and had to upgrade that one first before I could use SAP HXE’s specialized download manager.
- Unless you already have one of the supported hypervisors (e.g. VMWare Player, Oracle VirtualBox etc.), you’ll need to install one. If you are installing SAP HXE as part of your job, you’ll need to buy a VMWare license or use Oracle VirtualBox for free (at least this is how I read the license instructions). If you are installing SAP HXE privately, you can use either and not worry about licensing (again: I am really not a lawyer at all (phew!), but this is how I understand things).
- The download manager gives you the choice between two versions of SAP HXE: server-only and server with apps. For usage of SAP HXE with SAP BusinessObjects Predictive Analytics, the server-only version is good enough
- Once the SAP HXE image is loaded, you’ll need to run a couple of Linux commands in the running guest OS. If you have trouble typing them in the default keyboard layout (US), use YaST to change the keyboard layout: type sudo yast to start YaST and then choose Hardware > Change Keyboard Configuration. Use Tab and arrow keys to navigate, F9 to save & exit YaST. This YouTube video gives you all details.
- By default, my VMWare Workstation 12 Player loads the SAP HXE image with its network adapter configured as Bridged connection. This is great to give the VM full access to your host OS’ network, but has one severe implication: the VM’s IP address will change every so often when it requests a new one from your DHCP server.
There are really two ways to resolve this:
- If you rely on the IP address to communicate with the machine (this is the way recommended in the HXE installation guide, e.g. for connection setup in HANA Studio), then you want to force that IP to always be the same. Since assigning a static IP is not an option (it’ll wreak havoc on your network), the best way to achieve this is by changing the VM’s network properties to option Host-only network connection. Just note down that IP (obtained in the Linux shell via /sbin/ifconfig) and rest assured it’s always the same, even on VM reboot.
- Alternatively, you can edit your system’s hosts file (on Windows this is placed in C:\Windows\System32\drivers\etc). Find out HXE’s current IP address and enter it in that file together with a hostname of your choice. Then let HANA Studio & your ODBC driver use exactly that hostname. On every reboot of the VM, you will need to update that file with the latest IP address of your HXE (/sbin/ifconfig in the Linux shell), but this is the only change you’d need to make.
Apart from these small things, be sure to run through all steps in the installation PDF. In the end, you’ll have a running, secured SAP HXE and a SAP HANA Studio that is successfully connected to it.
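The hosts-file variant can be sketched in a few lines of Python. This is only an illustration under assumptions: the hostname hxehost and the IP addresses are made up, and the function works on the file content as a string, so reading and writing the actual file (which needs administrator rights on Windows) is left to you.

```python
def upsert_hosts_entry(hosts_text: str, hostname: str, ip: str) -> str:
    """Return hosts-file content in which `hostname` maps to `ip`.

    Any existing line that maps `hostname` is dropped as a whole, so this
    sketch assumes the HXE hostname sits on a line of its own.
    """
    kept = []
    for line in hosts_text.splitlines():
        # Ignore trailing comments when looking at the names on a line.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and hostname in fields[1:]:
            continue  # stale mapping for our hostname
        kept.append(line)
    kept.append(f"{ip}\t{hostname}")
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    # Hypothetical content and addresses, for illustration only.
    current = "127.0.0.1 localhost\n192.168.1.20 hxehost # old lease\n"
    print(upsert_hosts_entry(current, "hxehost", "192.168.1.35"))
```

After each VM reboot you would feed in the address reported by /sbin/ifconfig and write the result back to the hosts file.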
Optional: Installing Automated Predictive Library (APL)
If you have an SAP PA license, you should now install the APL. This will later allow SAP BusinessObjects Predictive Analytics to delegate all data preparation, model training and model apply steps to SAP HANA, bringing the algorithms right to the data, avoiding unnecessary data transfer and boosting overall predictive performance.
If this step is not performed, all these steps will physically happen on your local PC rather than on SAP HANA. The entire model creation process is thus limited by the physical memory & CPU of your machine, and overall performance will obviously be a lot lower than when leveraging the power of HANA.
Nonetheless, it is entirely possible to not use APL. Indeed it will mostly be transparent to you as a user whether APL is there or not: If APL is detected on SAP HXE, all delegation steps happen automatically without any additional user intervention; if on the other hand, APL is not detected (or another DBMS is used), the data required for preparation, processing and apply will be selected from the database, transferred to your PC and all steps be executed there.
As a side effect of the APL installation, user USER_APL is created and granted all required privileges, and sample data is installed. Without the APL installation, we’ll need to perform those steps manually as described in the next section.
Create predictive user & schema
This step is ONLY necessary if APL has not been installed (cf. previous subsection). The APL installation automatically creates several schemas and user USER_APL with password Password1 as access user. (The password should of course be changed in any serious setting, but for our test installation it’s simpler to leave it as it is.)
Without the installation of APL, we need to create a user (SAP_PA with password Password1) ourselves. Additionally, we create a target schema where our test data is stored. We could, of course, also use SAP_PA’s own schema, but I thought it’d be good to show the full loop with a dedicated schema as well as authorizing user SAP_PA to use it.
SAP BusinessObjects Predictive Analytics will later use that user through an ODBC connection to access the data in SAP HANA, express edition.
Steps to be performed:
- In SAP HANA Studio, select your system in tab Systems of the Systems View in the SAP HANA Development perspective.
- Right-click on the system name and choose Open SQL Console in the context menu
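- Copy & paste the SQL code below into the console and execute it using the Execute button or by hitting F8
-- Access user that SAP PA will later use through ODBC
create user SAP_PA password Password1 NO FORCE_FIRST_PASSWORD_CHANGE;
-- Clean-up of an earlier run; raises an error if PA_DEMO does not exist yet, so skip it on a first run
drop SCHEMA "PA_DEMO";
create SCHEMA "PA_DEMO";
grant create any on SCHEMA "PA_DEMO" to SAP_PA;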
- Add the system to your system list by selecting the system in the system list and choosing Add System with Different user… as shown in the screen shot below. Once you provide user name and password, the system will appear as additional entry in your list of systems.
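If you prefer scripting over the SQL console, the same setup can be sketched in Python. Treat this as a rough sketch, not the guide’s method: it assumes SAP’s Python driver hdbcli is installed and that you connect as an administrative user; the function names, host name and credentials are placeholders of mine. The drop statement is left out, since it only matters when repeating the setup.

```python
from typing import List


def setup_statements(user: str, password: str, schema: str) -> List[str]:
    """Build the statements from the SQL console step above."""
    return [
        f"CREATE USER {user} PASSWORD {password} NO FORCE_FIRST_PASSWORD_CHANGE",
        f'CREATE SCHEMA "{schema}"',
        f'GRANT CREATE ANY ON SCHEMA "{schema}" TO {user}',
    ]


def run_setup(host: str, port: int, admin_user: str, admin_password: str,
              statements: List[str]) -> None:
    """Execute the statements one by one (assumes hdbcli is installed)."""
    from hdbcli import dbapi  # imported lazily: only needed when executing

    conn = dbapi.connect(address=host, port=port,
                         user=admin_user, password=admin_password)
    try:
        cursor = conn.cursor()
        for statement in statements:
            cursor.execute(statement)
        cursor.close()
    finally:
        conn.close()


if __name__ == "__main__":
    for statement in setup_statements("SAP_PA", "Password1", "PA_DEMO"):
        print(statement + ";")
    # Placeholder host and credentials, replace with your own:
    # run_setup("hxehost", 30013, "SYSTEM", "<your password>",
    #           setup_statements("SAP_PA", "Password1", "PA_DEMO"))
```

Run as is, the script only prints the statements, which you can still paste into the SQL console.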
Installing SAP HANA Client and configuring an ODBC connection
Configuration of an ODBC connection is not part of the SAP HXE’s installation guide, but for our scenario we do need it nonetheless, since SAP BusinessObjects Predictive Analytics uses an ODBC connection to talk to any database, including SAP HANA, express edition. The respective SAP HANA Academy video covers all steps in detail, but let me give you a quick overview:
Before creating an ODBC connection, you need a running installation of SAP HANA Client Support Package Stack (SPS) 12. If you have a valid HANA license, download the latest SAP HANA Client from the SAP Software Center. If you don’t have a license, you can download a trial version of SAP HANA Client from the SAP Store. The installation is straightforward, but note that the installation package is only for Windows and Linux desktops, not Mac OS.
Configuration of the ODBC connection is equally simple. In Windows, just type ODBC in the search box on the task bar. You can choose ODBC 64-bit or 32-bit, but I recommend using 64-bit. Then go on to create a new System DSN. Since your instance number is 00 and your VM has a single DB within a Multi-Database Container (MDC), the right port to use is 30013 (NOT 30015 as you would have used on a non-MDC system!). So just do as on the screenshot, but use your VM’s IP instead (or the host name you assigned in the hosts file above):
Confirm a working connection by testing via the Connect button and providing user SAP_PA and its password (or USER_APL for that matter, if you installed APL above).
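The port rule above (3, then the two-digit instance number, then 13 on MDC or 15 on non-MDC) is easy to get wrong, so here is a small Python sketch of it. The helper names and the host hxehost are my own placeholders; the DSN-less connection string assumes the 64-bit driver is registered under the name HDBODBC, which you should verify in your ODBC administrator.

```python
def sql_port(instance: str, multi_db: bool = True) -> int:
    """Derive the SQL port from the two-digit instance number."""
    if len(instance) != 2 or not instance.isdigit():
        raise ValueError("instance must be a two-digit string such as '00'")
    suffix = "13" if multi_db else "15"
    return int(f"3{instance}{suffix}")


def odbc_connection_string(host: str, instance: str, user: str,
                           password: str, multi_db: bool = True) -> str:
    """Build a DSN-less ODBC connection string (driver name is an assumption)."""
    port = sql_port(instance, multi_db)
    return (f"DRIVER={{HDBODBC}};SERVERNODE={host}:{port};"
            f"UID={user};PWD={password}")


if __name__ == "__main__":
    print(sql_port("00"))                  # MDC system, as in this guide
    print(sql_port("00", multi_db=False))  # non-MDC system
    print(odbc_connection_string("hxehost", "00", "SAP_PA", "Password1"))
```

The resulting host and port pair is what goes into the server field of the System DSN.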
Installing SAP BusinessObjects Predictive Analytics
The final installation step is now to install SAP BusinessObjects Predictive Analytics. If you already have a running installation as client-server or desktop, just use it. If your company already has a license, but you haven’t installed it yet, just download the desktop version from SAP Software Center and continue. If neither applies, you can download a 30-day trial version from SAP Community.
Installation is very straightforward. If you need help, please consult the standard installation documentation or the respective video at the SAP HANA Academy. One thing to watch out for, though: the Product Availability Matrix for SAP PA lists JRE 7 as the only supported Java version. I for my part only had JRE 8 and did not run into issues. This might be relevant since the download manager for SAP HXE is JRE 8-only.
Loading data into SAP HANA, express edition
You now have all the software pieces ready for training a model with SAP BusinessObjects Predictive Analytics on data residing in SAP HANA, express edition. The problem is that SAP HXE does not contain any training data yet.
This would be the time to load the data that you’d like to investigate into SAP HXE. For the sake of example, we’ll use the time-series data that is described in Andreas’ guide Hands-On Tutorial SAP Predictive Analytics, Automated Mode: Time Series Forecasting. We steal its data file LondonBikeHire_Extended.csv, download it to our PC and subsequently upload it to our SAP HXE instance.
Using SAP PA to upload the data
Essentially there are two manual ways of easily loading data into HANA: via SAP HANA Studio and via SAP BusinessObjects Predictive Analytics. Anything text (TXT, CSV, XLS) is most easily loaded via SAP PA. Of course, those can also be loaded with SAP HANA Studio, but it’s probably easier using SAP PA. If instead you happen to have an export of database tables (like e.g. in Andreas’ Data Manager Tutorial), you should use SAP HANA Studio’s import functionality.
For our case, we’ll use SAP PA to load the data. So…
- Download the data to your local PC into a folder of your choice.
- Launch SAP PA and choose Perform a data transfer:
- Find the download location of your text file and confirm with Next
- Choose Analyze. Check storage type and key. SAP PA will propose a key of its own (KxIndex at end of list) if none is defined within the data set. For the data set at hand, no change is required. Confirm with Next.
- Exclude variables, if you don’t need them. We’ll simply take them all. Confirm with Next.
- SAP HANA, express edition is our target location for the data transfer. To achieve this, choose Data Type Data Base, choose the ODBC driver we configured above, connect to it using the user created earlier and define a new target table in our newly created schema using the dot-notation as in PA_DEMO.LONDONBIKEHIRES.
- The next screen confirms successful upload. You might encounter messages during data transfer, but as long as these are warnings only, the data has still been uploaded successfully.
- You can use SAP HANA Studio to confirm successful creation of the table and upload of the data:
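Beyond the visual check in SAP HANA Studio, a simple row-count comparison can catch a partial upload. The Python sketch below counts the data rows of the local file and prints the matching count query for the SQL console. It assumes the file has one header row and a comma as delimiter (adjust if yours differs); the function names are mine.

```python
import csv


def count_data_rows(csv_path: str, delimiter: str = ",") -> int:
    """Count non-empty rows after the header line of a delimited text file."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        next(reader, None)  # skip the header row (assumed to exist)
        return sum(1 for row in reader if row)


def count_query(schema: str, table: str) -> str:
    """Build the query whose result should match the local row count."""
    return f'SELECT COUNT(*) FROM "{schema}"."{table}"'


if __name__ == "__main__":
    # Adjust the path to wherever you downloaded the file.
    print(count_data_rows("LondonBikeHire_Extended.csv"))
    print(count_query("PA_DEMO", "LONDONBIKEHIRES"))
```

If the two numbers differ, check the messages shown during the data transfer before training a model on the table.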
Uploading the data using SAP HANA Studio
You can, of course, use SAP HANA Studio to achieve the same result:
Among others, it offers you the two primary ways of manually loading your data into SAP HANA:
- If you have CSV, XLS or XLSX, choose Data from Local File.
(If your data is TXT as in the Samples folder above, be sure to first convert it to CSV, otherwise the upload will not work).
Please follow the instructions in video SAP HANA Academy – SAP HANA Express: Sample Project: Loading Data from CSV for a step-by-step introduction.
- If you have an exported table (like, for example, the HANA data coming with), choose Catalog Objects.
For both cases, the wizard will safely guide you through all steps.
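For the TXT-to-CSV conversion mentioned above, a few lines of Python are usually enough. This sketch assumes the TXT file is tab-delimited and UTF-8 encoded, which I have not verified for the sample files; pass a different delimiter if needed. The file names are placeholders.

```python
import csv


def txt_to_csv(txt_path: str, csv_path: str, delimiter: str = "\t") -> int:
    """Rewrite a delimited TXT file as comma-separated CSV; return the row count."""
    rows = 0
    with open(txt_path, newline="", encoding="utf-8") as src, \
            open(csv_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)  # quotes fields that contain commas
        for row in csv.reader(src, delimiter=delimiter):
            writer.writerow(row)
            rows += 1
    return rows


if __name__ == "__main__":
    # Placeholder file names, for illustration only.
    print(txt_to_csv("sample.txt", "sample.csv"))
```

Using the csv module rather than a plain text replace keeps values that themselves contain commas intact, because the writer quotes them.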
Creating the model and running the forecast
Now that all the software is installed, all connections are made and data has been uploaded to the system successfully, we can run the actual forecast.
This process is exactly as described in Andreas Forster’s guide Hands-On Tutorial SAP Predictive Analytics, Automated Mode: Time Series Forecasting, chapter EXTENDED FORECAST WITH ADDITIONAL PREDICTORS.
The guide is rich in details and screenshots, so I’ll not repeat them here. The only difference really is your choice of data source, where you obviously need to choose our newly created table, not the CSV file as in the original description:
All other steps are exactly the same, so obviously your results are as well:
In case you want to apply the model and write the results back to SAP HXE, choose Using the Model > Save/Export > Apply Model and fill out this screen:
Depending on your goals, you can generate not only the forecasts themselves, but also the time series components, residuals and error bars.
Upon completion, you’ll have a new table in SAP HXE that contains your forecasts and additional information you requested. You can use this information now within your reports and your applications just as any other set of data stored in HANA.
You have now successfully run an end-to-end scenario on your own, virtualized SAP HANA, express edition environment.
Please let me know if you run into issues or have questions, so I can improve the guide.
Excellent work, Jan!
Great tutorial, works well! Next step is expert mode: how to connect to HXE?