Leverage Ceph as persistent storage for SAP Data Hub
Since the architectural changes in SAP Data Hub 2.3, NFS is no longer supported because it has certain limitations that prevent the distributed log (dlog) from functioning properly, as per SAP Note 2712050 – SAP Data Hub 2.3 installation fails during validation step “vora-cluster”. Therefore, I am using Ceph instead, which is also the basis for SUSE Enterprise Storage. This is my Ceph Dashboard:
Ceph is easy enough to install, so in this blog I focus on how to leverage it with SAP Data Hub. To start with, I deviate from the default setup in two ways:
I install ceph-deploy on my monitor node and use only one additional OSD node. Also, I prefer installing ceph-deploy via python-pip, while ntp comes from the regular package repository:
sudo apt-add-repository universe
sudo apt-get update
sudo apt-get install python-pip ntp
sudo systemctl enable ntp
sudo systemctl start ntp
sudo pip install ceph-deploy
With my storage cluster up and running, I must create two Kubernetes secrets, one for each of the following Ceph keys:
sudo ceph auth get-key client.admin | base64
sudo ceph auth get-key client.rbd | base64
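As a minimal sketch of how these two keys could be turned into Kubernetes secrets: the helper below only prints a Secret manifest, so it can be inspected before it is applied. The function name render_rbd_secret and the namespace datahub are hypothetical; the secret names ceph-secret and ceph-user-secret are the ones referenced by the storage class in this post. The type kubernetes.io/rbd is what the in-tree RBD provisioner expects, and as far as I know the user secret has to live in the namespace of the claims unless userSecretNamespace is set.

```shell
#!/bin/sh
# Hypothetical helper: print a Secret manifest for the kubernetes.io/rbd provisioner.
#   $1 = secret name
#   $2 = namespace
#   $3 = base64-encoded Ceph key (the output of the commands above)
render_rbd_secret() {
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: $1
  namespace: $2
type: kubernetes.io/rbd
data:
  key: $3
EOF
}

# Usage, assuming ceph and kubectl are both available on this host
# and that SAP Data Hub is installed into a namespace called "datahub":
#   render_rbd_secret ceph-secret default \
#     "$(sudo ceph auth get-key client.admin | base64)" | kubectl apply -f -
#   render_rbd_secret ceph-user-secret datahub \
#     "$(sudo ceph auth get-key client.rbd | base64)" | kubectl apply -f -
```

Because the key goes into the data field, it has to stay base64 encoded, which is why the commands above pipe the key through base64.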
This allows me to create a default storage class that will be used by my SAP Data Hub:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: dynamic
  annotations:
    storageclass.beta.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/rbd
parameters:
  monitors: ceph-monitor:6789
  adminId: admin
  adminSecretName: ceph-secret
  adminSecretNamespace: default
  pool: rbd
  userId: rbd
  userSecretName: ceph-user-secret
reclaimPolicy: Retain
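Before starting the SAP Data Hub installation, it may be worth checking that the default storage class really provisions volumes. The following is a sketch under assumptions, not part of the original setup: the helper render_test_pvc and the claim name ceph-test-claim are hypothetical. The claim deliberately omits storageClassName so that it falls through to the default class.

```shell
#!/bin/sh
# Hypothetical helper: print a small PersistentVolumeClaim that relies on
# the default storage class (no storageClassName is set on purpose).
#   $1 = claim name
#   $2 = requested size, e.g. 1Gi
render_test_pvc() {
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: $1
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: $2
EOF
}

# Usage, assuming kubectl points at the target cluster:
#   render_test_pvc ceph-test-claim 1Gi | kubectl apply -f -
#   kubectl get pvc ceph-test-claim    # should report STATUS Bound after a moment
#   kubectl delete pvc ceph-test-claim
```

Note that with reclaimPolicy: Retain, deleting the test claim leaves the provisioned Persistent Volume and its RBD image behind, so they need to be cleaned up manually.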
During the installation, SAP Data Hub 2.4.1 creates the following Persistent Volumes (the details can be found in my answer to DB Host and Size and Requirements for SAP Data Hub 2.41 on AKS):
Since replication is handled by Ceph, SAP Data Hub should be installed with the following option:
./install.sh -e vora-cluster.components.dlog.replicationFactor="1" -e vora-cluster.components.dlog.standbyFactor="0"