How to Install SAP Data Intelligence on-premise v3.0/3.1 on Microsoft Azure: Part 1 of 3
This is Part 1 of a 3-part blog post series on a detailed, step-by-step installation of SAP Data Intelligence on-premise v3.0/3.1 on Microsoft Azure. Thank you, Skugan Venkatesan, for helping me craft this blog post series!
Part 1 explains and helps set up the systems required to start the SAP Data Intelligence installation.
Part 2 will cover the preparations on the Kubernetes cluster and the SLCB (Software Lifecycle Container Bridge) tool.
Part 3 will explain the steps needed to expose SAP Data Intelligence for end-user access.
SAP Data Intelligence v3.0 and v3.1 are deployed as containers. The deployment target can be any Kubernetes cluster, either on-premise or cloud-based (AWS, Azure, GCP, etc.). The product is assembled from various Docker images, which are mirrored from SAP’s registry during the installation process.
To successfully follow the blog post and complete the installation, a basic understanding of the following topics is highly recommended:
- Cloud fundamentals (DNS, Networking, HTTP, Authentication, etc)
- Usage of following Azure cloud services:
- Azure Kubernetes Services
- Resource Groups
- Identity & Access Management: System assigned Managed Identity and Service Principal
- Load Balancer
- Virtual Networks: subnetting, CIDR blocks, inbound/outbound rules, etc.
- Virtual Machines: opening ports
- Kubernetes administration (Helm, Kubectl)
- NGINX Ingress controller for Kubernetes
- PuTTY or Windows CMD with SSH, Azure CLI, PowerShell
- Paid subscription to Microsoft Azure (a Free account will not work, as the virtual machines capable of running SAP Data Intelligence are not covered by the Free plan; in certain cases the CPU limit of the selected region might also be exceeded)
- Kubernetes cluster (v1.14+) on Microsoft Azure
- Azure Container registry to store the images mirrored from SAP’s registry
- A personal workstation (Windows, macOS or Linux) or a jumpbox with the following components installed:
For a complete list of prerequisites, please refer to the official SAP Data Intelligence Installation guide.
Create a Kubernetes cluster in Microsoft Azure
Please follow the steps below to create a Kubernetes cluster in Microsoft Azure. As the steps progress, screenshots accompany them to explain the details of each configuration.
- Log on to the Microsoft Azure portal.
- Use an existing Resource Group or create a new one. A resource group is a container that holds related resources for an Azure solution. The resource group can include all the resources for the solution, or only those resources that you want to manage as a group. The resource group selected here should also be used for all other Azure resources created in this blog post.
- Once the resource group is created, let’s go ahead and create a Kubernetes cluster. Before creating it, please note the sizing prerequisites for the cluster, which depend on your environment requirements (Dev or Production).
- Azure Kubernetes Services (AKS) Configuration
- Minimum 3 worker nodes (32 GB RAM per node, 24 vCPUs in total). You can choose the node size B8ms (8 vCPUs and 32 GB RAM per node)
- Minimum 4 worker nodes (64 GB RAM per node, 64 vCPUs in total). You can choose the node size B16ms (16 vCPUs and 64 GB RAM per node)
- For detailed sizing, please refer to the sizing guide – SAP Data Intelligence Sizing Guide
- Azure Kubernetes Services (AKS) Configuration
- In the Azure portal, search for Kubernetes services and click on it to create a new Kubernetes service
The following parameters must be provided:
- Resource Group: select the resource group created earlier from the drop-down menu
- Subscription: Choose the required subscription.
- Kubernetes cluster name – <Any Name>
- Region: select the region in which the Kubernetes cluster will be deployed.
- Kubernetes Version: select the Kubernetes version (see the installation guide for the Kubernetes versions supported by the release you decide to install, 3.0 or 3.1)
- Node Size and Node count: select the size of the node and the number of nodes that you require
- Azure Kubernetes Service (AKS) support for multiple node pools, now generally available, allows you to use different virtual machine sizes in each pool to run various workloads in a single AKS cluster. The feature is designed to help you manage compute resources better. In our scenario, we go with a single basic node pool; we can always come back to Azure and add another node pool when required. VM Scale Sets can be enabled or disabled. We enable them because we need the autoscaling mechanism.
- In this step we set the Identity and Access Management properties. Amongst others, two major authentication mechanisms are available.
- Service Principal
- Let’s get the basics out of the way first. In short, a service principal is an application identity whose tokens can be used to authenticate and grant access to specific Azure resources from a user app, service or automation tool when an organization uses Azure Active Directory. Service principals help us avoid creating fake users in Active Directory just to manage authentication to Azure resources.
- A managed identity, put simply, is a layer on top of a service principal that removes the need to create and manage service principals manually. System-assigned managed identities are tied directly to a resource and follow that resource’s lifecycle: if the resource is deleted, the identity is removed too. User-assigned managed identities are created independently of any resource and can therefore be shared between resources. Removing them is a manual step, done whenever you see fit.
- In our installation scenario we go with a system-assigned managed identity. Managed identities can authenticate to any Azure service that supports Azure AD authentication, including Azure Key Vault, and can be used at no additional cost. For further details, please refer to this link.
- Managed Identity (system assigned)
- Service Principal
- If you decide to go with a service principal instead, a user with appropriate privileges or an administrator can create a new service principal ID and secret. This service principal ID should be added as a user in the Docker image registry that we will create, and for the Kubernetes service as well.
- The service principal ID and secret are passed later from the jump host (installation host) to connect to other cloud resources, such as the Azure container registry that is attached to the cluster.
Also, we can enable RBAC to ensure that access to the Kubernetes cluster is properly controlled.
If you go with a service principal, select Service Principal as the authentication method and provide the required service principal name, or allow the system to create a default service principal for you (if you have the privileges).
- Kubernetes cluster networking setup
- Select Advanced Networking
- Create a new virtual network and assign a new address space <10.0.0.0/16>
- Kubernetes Service Address Range: <172.16.0.0/24>. Define an address range for the Kubernetes cluster services and ensure it does not overlap with the subnet IP ranges that you have configured.
- Kubernetes DNS service IP address: a single IP address, taken from within the Kubernetes service address range, that will be assigned to the Kubernetes DNS service
- Docker Bridge address: the --docker-bridge-address setting lets the AKS nodes communicate with the underlying management platform. This IP address must not be within the virtual network IP address range of your cluster and shouldn’t overlap with other address ranges in use on your network.
- Private Cluster or Public Cluster: in our scenario we go ahead with a Public Cluster, but depending on your requirements you can also choose a Private Cluster, in which case the cluster’s API server is not exposed publicly. We are not covering private clusters in this blog, but only minor changes are needed to make the steps work with one.
- Network Policy: you can choose Calico or Azure. In our scenario we have chosen Azure, on top of the Azure CNI networking that the Advanced Networking option enables. Please note that Calico policies can also be combined with Azure CNI depending on your requirements, but integrating them is not that simple.
- HTTP Application Routing – No
- Kubernetes Integration settings/configuration
- Any registry can be used (for example: Docker registry, Amazon Elastic Container Registry, etc.). In our scenario we are using Azure container registry. Whatever registry you use, please make a note of the registry URL, user ID and credentials used to access it. Also, if you are not using a service principal ID, enable the ADMIN USER option while creating the new container registry.
- Container monitoring – If individual containers must be monitored, then enable the container monitoring.
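For readers who prefer the command line, the portal choices above map roughly onto one az aks create call. The following is a minimal sketch under stated assumptions, not the procedure used in this blog: the resource group, cluster and registry names are hypothetical placeholders, the DNS service IP 172.16.0.10 is an assumed address inside the service range, and the script only prints the command unless DRY_RUN=0 is set. It leaves out --vnet-subnet-id, so the custom virtual network and subnet would still have to be passed in for a real run.

```shell
#!/bin/sh
# Hypothetical placeholder names: replace them with your own values.
RESOURCE_GROUP="di-rg"
CLUSTER_NAME="di-aks"
ACR_NAME="diregistry"

# Print the command by default; set DRY_RUN=0 to really execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# CLI counterpart of the portal settings: 3 B8ms nodes, Azure CNI
# (Advanced Networking), a system-assigned managed identity and an
# attached Azure container registry.
create_cluster() {
  run az aks create \
    --resource-group "$RESOURCE_GROUP" \
    --name "$CLUSTER_NAME" \
    --node-count 3 \
    --node-vm-size Standard_B8ms \
    --network-plugin azure \
    --service-cidr 172.16.0.0/24 \
    --dns-service-ip 172.16.0.10 \
    --enable-managed-identity \
    --attach-acr "$ACR_NAME" \
    --generate-ssh-keys
}

create_cluster
```

Running the script as-is only echoes the assembled command, which makes it easy to review the parameters before anything is created.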
Validate the Kubernetes settings before creation: once the Kubernetes cluster is created, some of the core settings cannot be changed, and you would have to delete and redeploy the cluster. Re-check that all parameters and configurations are in place.
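One setting worth checking before creation is that the address ranges do not overlap. The helper below is a small sketch in plain POSIX shell; the function names are my own and not part of any Azure tooling. It converts each CIDR block into its first and last address and reports whether two blocks share any address.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() (
  IFS=.
  set -- $1
  echo "$(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))"
)

# Print the first and last address (as integers) of a CIDR block.
cidr_range() (
  base=$(ip_to_int "${1%/*}")
  prefix=${1#*/}
  size=$(( 1 << (32 - $prefix) ))
  start=$(( $base / $size * $size ))   # align to the block boundary
  echo "$start $(( $start + $size - 1 ))"
)

# Succeed (exit status 0) when the two CIDR blocks share an address.
cidr_overlap() (
  set -- $(cidr_range "$1") $(cidr_range "$2")
  [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
)

# The ranges used in this post: virtual network vs. service range.
if cidr_overlap 10.0.0.0/16 172.16.0.0/24; then
  echo "ranges overlap, pick a different service range"
else
  echo "no overlap"
fi
```

The same function can be pointed at the subnet ranges or the Docker bridge address before the cluster is created.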
Open Resource Groups in the Azure portal and find the resource group that was created earlier. Click its name to enter the configuration area.
Click Subnets in the left pane. Here we create the subnets. A subnet is essentially a range of IP addresses in a virtual network. To keep it simple, we have defined just 2 subnets: one for the Kubernetes cluster and one for the jumpbox system (to be used as the installation host).
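The two subnets could also be created from the Azure CLI. This is a hedged sketch rather than the exact setup used here: the resource group, virtual network and subnet names are hypothetical, and the two /24 prefixes are assumed slices of the 10.0.0.0/16 address space, so size them to your own node and pod counts. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical placeholder names: replace them with your own values.
RESOURCE_GROUP="di-rg"
VNET_NAME="di-vnet"

# Print the command by default; set DRY_RUN=0 to really execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# One subnet for the Kubernetes cluster, one for the jumpbox.
# The prefixes are assumptions carved out of 10.0.0.0/16.
create_subnets() {
  run az network vnet subnet create \
    --resource-group "$RESOURCE_GROUP" \
    --vnet-name "$VNET_NAME" \
    --name aks-subnet \
    --address-prefixes 10.0.0.0/24
  run az network vnet subnet create \
    --resource-group "$RESOURCE_GROUP" \
    --vnet-name "$VNET_NAME" \
    --name jumpbox-subnet \
    --address-prefixes 10.0.1.0/24
}

create_subnets
```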
Create Jumpbox or Virtual Machine
Let us now set up the jumpbox. A jumpbox is a system on the same network as our destination cluster that enables controlled access for installation and administration tasks. Go to the search bar in the Azure portal and search for and select Virtual Machines. Azure offers various VM series: general purpose, compute optimized, memory optimized, storage optimized, GPU, and high-performance compute. As the jumpbox will only be used for installation, we can take a VM from the general purpose “Ds v4” series: a D2s, where the 2 denotes the number of virtual CPUs. The SSH port will be used to log in to the jumpbox, and all further system setup and Kubernetes configuration steps will be done via the SSH command line only.
The jumpbox will be accessed from your laptop, hence it must have a public IP assigned. In the Networking tab of the VM creation pane, select 22 (the SSH port) as the public inbound port. You can associate a network security group with virtual machines, NICs, and subnets, depending on the deployment model you prefer. Here we use the basic NIC network security group only.
After completing all steps based on your needs, ensure that the validation passes and create the Virtual Machine. It might take around 30 seconds for the VM deployment to complete.
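As a rough command-line counterpart to the portal steps above, the jumpbox could be created with az vm create. Treat this as a sketch under assumptions: all resource names and the admin user are hypothetical, and the Ubuntu2204 image alias is an assumption that depends on your Azure CLI version (the source does not state which Ubuntu release was used). The script prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical placeholder names: replace them with your own values.
RESOURCE_GROUP="di-rg"
VNET_NAME="di-vnet"
VM_NAME="di-jumpbox"
IMAGE="Ubuntu2204"   # assumed image alias; check what your CLI version offers

# Print the command by default; set DRY_RUN=0 to really execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# General purpose D2s v4 VM (2 vCPUs) in the jumpbox subnet, with an
# inbound rule for SSH (port 22) on its network security group.
create_jumpbox() {
  run az vm create \
    --resource-group "$RESOURCE_GROUP" \
    --name "$VM_NAME" \
    --image "$IMAGE" \
    --size Standard_D2s_v4 \
    --vnet-name "$VNET_NAME" \
    --subnet jumpbox-subnet \
    --nsg-rule SSH \
    --admin-username azureuser \
    --generate-ssh-keys
}

create_jumpbox
```

With these assumed values, logging on afterwards would look like ssh azureuser@<public-ip-of-the-VM>.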
Now log on to the jumpbox using PuTTY, PowerShell or the Windows command prompt.
To access Azure resources from the jumpbox, let’s start by refreshing the package index and installing the packages needed for the Azure CLI installation:
sudo apt-get update
To use certificates, the appropriate package must be installed (“ca-certificates” on Ubuntu), along with a few other prerequisites:
sudo apt-get install ca-certificates curl apt-transport-https lsb-release gnupg
Download and install the Microsoft signing key:
curl -sL https://packages.microsoft.com/keys/microsoft.asc | gpg --dearmor | sudo tee /etc/apt/trusted.gpg.d/microsoft.asc.gpg > /dev/null
Add the Azure CLI software repository:
AZ_REPO=$(lsb_release -cs)
echo "deb [arch=amd64] https://packages.microsoft.com/repos/azure-cli/ $AZ_REPO main" | sudo tee /etc/apt/sources.list.d/azure-cli.list
Update repository information and install the azure-cli package:
sudo apt-get update
sudo apt-get install azure-cli
Now that the Azure CLI is set up, let us log in to the Azure subscription:
az login
To complete the authorization for CLI access, the login uses the browser and provides a code to sign in with.
Now that the login is complete, let us set the active subscription.
# az account set --subscription "<your-azure-Subscription-Name>"
az account set --subscription "Visual Studio Enterprise"
The jumpbox is now set up with the required packages, and after logging in to the Azure subscription, all cloud resources can be accessed from it.
Installation of kubectl
sudo apt-get update && sudo apt-get install -y apt-transport-https
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
echo "deb https://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee -a /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update
sudo apt-get install -y kubectl=1.15.10-00
Installation of Docker
sudo apt-get install docker.io
To check that everything went well, let’s verify the installation with the docker login command against the registry URL noted earlier:
sudo docker login <your-registry-URL>
Let’s also install a YAML parser and emitter for Python on the Ubuntu VM.
sudo apt-get install python-yaml
Helm doesn’t have a package in the Ubuntu repositories, but you can simply download the desired version and unpack it on the server.
wget https://storage.googleapis.com/kubernetes-helm/helm-v2.11.0-linux-amd64.tar.gz --content-disposition
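Now extract the files. The binaries unpack into a linux-amd64 directory:
tar -xf helm-v2.11.0-linux-amd64.tar.gz
Move the helm and tiller binaries to a directory on the PATH:
sudo mv linux-amd64/helm linux-amd64/tiller /usr/local/bin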
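Before moving on, it can help to confirm that every tool installed in this post is actually on the PATH. The check below is a small sketch in plain POSIX shell; check_tools is my own helper name, not part of any of the installed packages.

```shell
#!/bin/sh
# Report which of the given commands are available on the PATH.
# Returns 0 when all are present and 1 when at least one is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found:   $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# The command-line tools set up on the jumpbox in this post.
check_tools az kubectl docker helm tiller ||
  echo "Install the missing tools before continuing."
```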
Congratulations if you have made it this far! The blog post has purposefully been kept very detailed so that readers with varying skill sets can follow it easily. Please watch out for Part 2, where we shall cover the Kubernetes configuration and the SLCB setup. Stay tuned!
Nice Blog - don't miss the Blog - https://blogs.sap.com/2020/01/22/unified-data-integration-for-sap/
it contains several separate Articles around Jumphost and SLC Bridge as well.
Best Regards Roland
Thanks Roland Kramer ! I have 2 more parts coming up in this blog post series which shall delve into the SLCB and ingress controller. I have been following your blogs since a long time- thanks! 🙂