Connect your SAP Data Hub to SAP Vora and Hadoop
With the SAP Data Hub enabled on my SAP HANA, express edition, and connected to BW and HANA, I want to connect it to SAP Vora and Hadoop next:
Originally I had intended to use the SAP Vora Developer Edition, but that is currently based on SAP Vora 1.4, so I go for SAP Vora 2.0 on Kubernetes with Minikube in GCP instead. The installation on the Google Cloud Platform is straightforward and well supported by a series of SAP HANA Academy YouTube videos:
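For reference, the core of my setup looked something like this (the instance name matches my host below, but the machine type and image are just my illustrative choices; the none driver runs Kubernetes directly on the VM):
gcloud compute instances create vora2 --machine-type n1-standard-8 --image-family ubuntu-1604-lts --image-project ubuntu-os-cloud
sudo minikube start --vm-driver=none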
As a result, I retrieve the Vora ports needed for the SAP Vora Spark Extensions installation on my Hadoop cluster later:
root@vora2:~/SAPVora-DistributedRuntime$ ./install.sh -s
############ Ports for external connectivity ############
vora-tx-coordinator/tc port: 31213
vora-tx-coordinator/hana-wire port: 31950
vora-catalog/catalog port: 32635
vora-tools/tools port: 31304
#########################################################
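To double-check that these NodePorts are actually exposed, the Kubernetes services can also be listed directly (standard kubectl, no Vora specifics assumed):
kubectl get services --all-namespaces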
Next, I install the Hortonworks Data Platform 2.6 on SUSE Linux Enterprise Server for SAP Applications, because unfortunately the respective VMware image does not work with the latest SAP HANA Data Provisioning Agent. Again, the installation with Ambari is straightforward, with two slight deviations from the installation manual:
- Set Up Password-less SSH
ssh-keygen
ssh-copy-id -i ~/.ssh/id_rsa.pub HOSTNAME_OR_IP
ssh HOSTNAME_OR_IP
- Edit the /etc/ambari-server/conf/ambari.properties file and add the following line to the end of the file:
security.server.disabled.ciphers=TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384|TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384|TLS_ECDH_ECDSA_WITH_AES_256_CBC_SHA384|TLS_ECDH_RSA_WITH_AES_256_CBC_SHA384|TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA|TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA|TLS_ECDH_ECDSA_WITH_AES_256_CBC_SHA|TLS_ECDH_RSA_WITH_AES_256_CBC_SHA|TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256|TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256|TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA256|TLS_ECDH_RSA_WITH_AES_128_CBC_SHA256|TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA|TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA|TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA|TLS_ECDH_RSA_WITH_AES_128_CBC_SHA|TLS_ECDHE_ECDSA_WITH_3DES_EDE_CBC_SHA|TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA|TLS_ECDH_ECDSA_WITH_3DES_EDE_CBC_SHA|TLS_ECDH_RSA_WITH_3DES_EDE_CBC_SHA|TLS_ECDH_anon_WITH_AES_256_CBC_SHA|TLS_ECDH_anon_WITH_AES_128_CBC_SHA|TLS_ECDH_anon_WITH_3DES_EDE_CBC_SHA|TLS_ECDHE_ECDSA_WITH_NULL_SHA|TLS_ECDHE_RSA_WITH_NULL_SHA|TLS_ECDH_ECDSA_WITH_NULL_SHA|TLS_ECDH_RSA_WITH_NULL_SHA|TLS_ECDH_anon_WITH_NULL_SHA
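After saving the file, the Ambari server has to be restarted to pick up the change:
ambari-server restart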
Subsequently the installation finishes smoothly with all the services that I need:
So, I continue with Installing SAP Vora on the Hadoop Cluster:
linux-p2i7:/home/frank/SAPVora-SparkIntegration # ./install.sh
SAP Vora 2.0 Spark Integration Installer: INFO SAP Vora 2.0 Spark Integration Installer
Cluster Manager (MapR, Cloudera, Ambari, Bare, FusionInsight) [ambari]:
Install support for Spark 1.6.x? [Y/n]
Install support for Spark 2.x? [Y/n]
SAP Vora 2.0 Spark Integration folder, must start with /opt/ [/opt/vora-spark]:
HDFS upload folder for SAP Vora 2.0 Spark Integration [/user/vora/lib]:
OS user for HDFS access [hdfs]:
Path to folder with datanucleus jars [/usr/hdp/current/spark-client/lib]:
SAP Vora 2.0 Spark Integration Installer: INFO Installing SAP Vora 2.0 Spark Integration
Attention: File Access
For the install process to run correctly, the files /tmp/vora-spark/lib/spark-sap-datasources-spark1.6.jar
and /tmp/vora-spark/lib/spark-sap-datasources-spark2.jar must be accessible to the hdfs user.
Please make sure that it can access the configuration.
Do you think the user has access? [y/N] y
Do you want specify the connection parameters for the SAP Vora Kubernetes cluster? [Y/n]
Transaction coordinator host: 35.189.104.251
Transaction coordinator port: 31213
Catalog host: 35.189.104.251
Catalog port: 32635
Catalog timeout in seconds [6]:
Do you want to configure authentication to the Vora cluster? [Y/n]
Vora authentication username: vora
Vora authentication password: Pr0file!
Path to folder where v2auth.conf is stored: /opt/vora-spark
Owner of the v2auth.conf file [vora]: root
Group of the v2auth.conf file [root]: root
Ambari User ID: admin
Ambari password:
Ambari cluster name: Sandbox
Ambari cluster address [http://localhost:8080]:
SAP Vora 2.0 Spark Integration Installer: INFO Running: /usr/bin/hdp-select versions
Hortonworks version [2.6.3.0-235]:
Path to host file [/home/frank/SAPVora-SparkIntegration/lib/../config/hosts.txt]:
SAP Vora 2.0 Spark Integration Installer: INFO Reading host file at /home/frank/SAPVora-SparkIntegration/lib/../config/hosts.txt
SAP Vora 2.0 Spark Integration Installer: INFO Parsed 1 hostnames from hostfile
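As a quick smoke test, a Spark shell can then be launched with the Vora data sources jar on the classpath. I am assuming here that the installer places the jar under the /opt/vora-spark folder chosen above; the jar name matches the files listed during the installation:
spark-shell --jars /opt/vora-spark/lib/spark-sap-datasources-spark2.jar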
Followed by Install SAP HANA Data Provisioning Agent:
./hdbinst --batch --path /usr/sap/dataprovagent --user_id=<dpagent>
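Note that the --user_id parameter expects an existing OS user that will own the agent installation. If it does not exist yet, it can be created beforehand with standard tooling (the user name below is just the placeholder from the command above):
sudo useradd -m dpagent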
And then Install SAP Data Hub Adapter:
./hdbinst --batch --path /usr/sap/dataprovagent/bdh --hadoopConfDir /etc/hadoop/conf --voraHome=/opt/vora-spark --sparkConfDir /usr/hdp/current/spark-client/conf
However, this adapter needs upgrading by replacing the respective jar file with the latest patch:
linux-p2i7:/home/frank # cp adapter-core/package/lib/adapters/com.sap.bdh.adapter.adapter-core-1.1.20.jar /usr/sap/dataprovagent/adapters/com.sap.bdh.adapter.adapter-core.jar
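A simple sanity check that the patch is actually in place is to compare the checksums of the source and target jars (standard tooling, paths taken from the copy command above):
md5sum adapter-core/package/lib/adapters/com.sap.bdh.adapter.adapter-core-1.1.20.jar /usr/sap/dataprovagent/adapters/com.sap.bdh.adapter.adapter-core.jar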
As a result, I can register my New System:
And two Connections, namely HDFS:
And VORA Catalog:
Finally, I discover the content of these connections:
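Independently of the Data Hub UI, the HDFS side of this can be verified by listing the upload folder from the Spark integration installation (path as chosen during the installation above):
sudo -u hdfs hdfs dfs -ls /user/vora/lib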
In my next blog, I will leverage this configuration to Define a Data Pipeline.
Nice blog!
It will surely help me finish the setup we are working on.
Hi Frank,
I am currently done with the minimal landscape setup, but I can't connect to any system. I got the following error: "Unable to create an agent with a zone. BDH Adapter xx.xxx.xxx.xxx:xxxx is not reachable." Do you have an idea what is going on there? Where should I get my agent host from? Maybe I used the wrong address?
Nice Blog by the way!
Hi Frank,
Great blog for Vora and Hadoop connectivity with Data Hub. I have a query: is it possible to consume the Hadoop services provided by Google Cloud Dataproc in place of Hortonworks Data Platform 2.6 in SAP Data Hub?
Thanks in advance!!!
Regards,
Aamrin Firdoush
Hello
I have not tried Cloud Dataproc yet, but I do not see why it should not work.
Best regards
Frank
Aamrin and Frank,
I recently had an SDH 1.4 trial deployed on GCP and connected it to a small 3-node Dataproc cluster that I created on the same network; it worked fine with SDH.
I had to use the "manual" HDFS connection configuration (not the default). To use the SDH-provided example HDFS graphs/pipelines, I had to swap out the HDFS operators for webHDFS operators: when trying the "regular" HDFS operator in the HDFS example pipeline, it couldn't mkdir a new directory in HDFS, so I just switched to webHDFS operators and voila, it worked.
When the SDH 2.3 trial is available soon, I plan to spin up a small Dataproc cluster for HDFS and add other connections we're interested in, like BW/4HANA, S3 and GCS, to prove out the SDH pipeline and governance/catalog/profiling capabilities. I noted in SDH 2.3 last week at TechEd that there's a new native GCP DataProc connection type that may make it even easier for SDH to connect to Dataproc.
Hope that helps!
Doug