Technical Articles
[Quick Start Guide – Part II] Installing SAP Data Intelligence on Red Hat Openshift
This is Part II in a two-part series detailing how to install and configure SAP Data Intelligence (SDI) on a Red Hat OpenShift cluster. Part I covered the background and prerequisites: setting up your environment, preparing the OCP cluster for SDI, and deploying the SDI Observer. In this part, we perform the actual SDI installation and run the tests required to verify the installation and setup. By the end of this two-part post, you will have an SAP Data Intelligence workspace running on an OpenShift cluster. Special thanks to Markus Koch and Michal Minar for testing, validating, and providing the technical content for this article.
SDI installation
This section walks through the actual SDI installation.
- Return to the Maintenance Planner browser Tab.
Note
The SLC Bridge has to be kept open in an active window while working with the Maintenance Planner (MP).
- Click Next and then Deploy.
- Once the deployment has been triggered in the Maintenance Planner, switch back to the SLCB browser tab. A dialog will appear there; click OK.
- Enter your S-User credentials and click Next.
- Select “SAP DATA INTELLIGENCE 3 – DI Platform Full” and click Next.
- Enter the OpenShift namespace where SDI should run. In this case it is sdi. When this is done, click Next.
- Select Advanced Installation and click Next.
- Enter a password for the System Tenant Administrator.
- Enter the Default Tenant name.
- Enter the Default Tenant Administrator name and password.
- As our cluster has direct access to the internet, we do not need to set proxies. If this is not the case for you, see this guide for details on how to proceed.
- Disable backup by removing the check mark. If you want to keep backups enabled instead, see SAP Note 2918288 for details. Note that the NooBaa object storage infrastructure cannot be used as backup media if Vora is used.
- Enable the checkpoint store; ensure that the check mark is set. Select S3 Compatible object store and use the name and credentials for the checkpoint store that were created earlier. Note that the endpoint for NooBaa S3 is always s3.openshift-storage.svc.cluster.local (a quick way to verify this endpoint is shown after these steps).
Note that the validation may take some time to complete, even if your cluster is set up correctly. In the unlikely event that it fails, check that you used http and not https; with private certificates this step may not work.
- Continue with the defaults on the next screens:
  - Use the default storage class for persistent volumes.
  - Leave the custom container log path box unchecked.
  - Enable Kaniko.
  - You do not need a different container image repository for demo purposes.
  - ‘Enable kernel module loading’ can be left unchecked, as the installer has already taken care of this; checking it will not affect the process either.
  - Leave the remaining settings at their defaults.
- Change the cluster name to sdidemo-ghpte-$GUID, replacing $GUID with your lab GUID.
A summary of the installation parameters is then displayed.
- Start the installation procedure. After the installation finishes, a confirmation screen will appear.
Note
Make sure you write down or save your System ID. In this example it is 11bw3dz.
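A note on the checkpoint store endpoint used above: if the validation takes unusually long or fails, a quick sanity check from the command line can rule out an unreachable NooBaa S3 service. The sketch below is only an illustration; it assumes the default OpenShift Data Foundation namespace openshift-storage, the UBI image is used solely because it ships curl, and any HTTP status code in the response (typically 403 for an unauthenticated request) already proves the endpoint is reachable.
# oc -n openshift-storage get svc s3
# oc run s3-check --rm -i --restart=Never --image=registry.access.redhat.com/ubi8/ubi -- curl -s -o /dev/null -w '%{http_code}\n' http://s3.openshift-storage.svc.cluster.local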
Post-installation work
Getting Access to the SDI console
We have configured the sdi-observer to make the route to the admin interface available. You can check this with the following command:
# oc rollout status -n sdi-observer -w dc/sdi-observer
If sdi-observer has exported the route correctly, the command will return:
replication controller "sdi-observer-2" successfully rolled out
You can double check with
# oc get routes -n sdi
The output should be
NAME      HOST/PORT                                                 PATH   SERVICES   PORT      TERMINATION          WILDCARD
vsystem   vsystem-sdi.apps.cluster-1251.1251.example.opentlc.com           vsystem    vsystem   reencrypt/Redirect   None
You can now access the SDI management console at https://vsystem-<SDI_NAMESPACE>.apps.<cluster_name>.<base_domain>. In this example it is https://vsystem-sdi.apps.cluster-1251.1251.example.opentlc.com.
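If you prefer to take the URL straight from the command line instead of assembling it from the template, the route host can be printed directly. This is only a convenience sketch and assumes the SDI namespace sdi used in this example:
# oc get route vsystem -n sdi -o jsonpath='{.spec.host}{"\n"}'
Prefix the printed host with https:// to obtain the management console URL.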
Configuring the Connection to Data Lake
- Log in to the SDI console at https://vsystem-<SDI_NAMESPACE>.apps.<cluster_name>.<base_domain>. Use the tenant default, the user defaultadmin, and the password from the installation procedure detailed above.
- Click the Connection Management tile.
- Click on +.
- Enter the following values and click Test Connection:
| Parameter | Value |
| --- | --- |
| Connection Type | SDL |
| Id | DI_DATA_LAKE |
| Object Storage Type | S3 |
| Endpoint | s3.openshift-storage.svc.cluster.local |
| Access Key ID | from above (see storage-credentials.txt) |
| Secret Access Key | from above (see storage-credentials.txt) |
| Root Path | from above (see storage-credentials.txt) |
- If the connection test is successful, click on Create.
- Upon successful completion, a notification will appear.
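In case storage-credentials.txt from Part I is no longer at hand, the same values can usually be recovered from the objects that the ObjectBucketClaim created: a Secret holding the access key pair and a ConfigMap holding the bucket name (used here as the root path), both named after the claim. The claim name sdi-data-lake and the namespace sdi below are placeholders; substitute the name and namespace you used in Part I:
# oc -n sdi get secret sdi-data-lake -o jsonpath='{.data.AWS_ACCESS_KEY_ID}' | base64 -d; echo
# oc -n sdi get secret sdi-data-lake -o jsonpath='{.data.AWS_SECRET_ACCESS_KEY}' | base64 -d; echo
# oc -n sdi get configmap sdi-data-lake -o jsonpath='{.data.BUCKET_NAME}'; echo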
SDI validation
To validate your installation of SAP Data Intelligence, see the SAP Data Intelligence Installation Guide.
Return to the Data Intelligence Launchpad (or log in as in the previous step).
Defining a pipeline
- Launch the Modeler by clicking the Modeler tile. Note that this step may take a while.
- Enter com.sap.demo.datagenerator in the search field and click on Data Generator.
- Save the configuration.
- Start the graph.
- Check that the status changes to ‘Running’ (this may take several minutes; an optional command-line cross-check is shown after these steps).
- Open the Terminal user interface by right-clicking on the Terminal operator and selecting Open UI.
- Once the Data Generator is running, its generated data will be displayed in the Terminal. If not, you will see an error dialogue.
- Stop the graph once you observe its output in the Terminal.
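Besides the status shown in the Modeler, the graph execution can be cross-checked from the command line: each graph runs in its own pod in the SDI namespace, and the pod names for graph executions typically contain vflow. This is purely an optional sanity check and assumes the namespace sdi used in this example:
# oc get pods -n sdi | grep -i vflow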
Checking Your Machine Learning setup
- To create an ML scenario, open ML Scenario Manager from the SAP Data Intelligence Launchpad. Note that this step may take a while.
- Click Create. Enter a name for your scenario and, optionally, a business question. Click Create.
The details for your scenario will appear, and the scenario will be added to the list of ML scenarios on the overview page.
- On the details page for your scenario, click Create in the Notebooks tab to create a new Jupyter notebook. In the Create Notebook dialog box, enter a unique name for your notebook and, optionally, a description, then click Create to create your Jupyter notebook. Note that this step may take a while.
- At this stage, your notebook will open in JupyterLab. You will be prompted to select your kernel; choose Python 3.
- In your JupyterLab notebook, copy the following code into a cell and run it:
import sapdi
from hdfs import InsecureClient

# Connect to the SDI-internal data lake via its WebHDFS endpoint
client = InsecureClient('http://datalake:50070')

# Query the status of the root directory
client.status("/")
Check that the code runs without errors.
The code should return JSON, similar to the following:
{'pathSuffix': '', 'type': 'DIRECTORY', 'length': 0, 'owner': 'admin', 'group': 'admin', 'permission': '777', 'accessTime': 0, 'modificationTime': 1576237423061, 'blockSize': 0, 'replication': 1}
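The hostname datalake in the snippet resolves inside the cluster because SDI exposes the data lake’s WebHDFS endpoint as a Kubernetes service in the SDI namespace. If the call fails, a quick check that such a service exists (assuming the namespace sdi used in this example) is:
# oc get svc -n sdi | grep -i datalake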
Conclusion
Congratulations – you’ve successfully set up SAP Data Intelligence (SDI) on a Red Hat OpenShift cluster. In the first part, you gained insight into the high-level installation workflow for SDI on OpenShift and learnt how to set up your environment, prepare the OCP cluster for SDI, and deploy the SDI Observer. In this part, you performed the actual SDI installation and ran the tests required to verify the installation and setup. Well done!
If you have feedback or thoughts, feel free to share them below in the comment section. For the latest content on SAP Data Intelligence, Red Hat and OpenShift, do subscribe to the tags and to my profile (Vivien Wang) for more exciting news in this space.
Vivien Wang is currently an Ecosystem Partner Manager for the Red Hat Partner Engineering Ecosystem.