Learning SAP Data Hub 2.x
Once you have had a chance to read my colleague Marc Hartz’s blog on SAP Data Hub 2.3, you can choose among several options for learning SAP Data Hub 2.x, starting with 2.3. This blog summarizes the ways in which you, as a customer, partner, or developer, can learn SAP Data Hub 2.3. The options are:
- Watching videos at SAP Data Hub Channel – Ideal for everyone to learn product features and functionality on their own schedule.
- Attending SAP Data Hub hands-on sessions at SAP TechEd – Ideal for consultants to learn the product through hands-on exercises supervised by instructors at events.
- Arranging SAP Data Hub Technical Academy session(s) by contacting your SAP account team – Ideal for customers to learn the product through hands-on exercises supervised by instructors at your own facility or at SAP locations.
- Starting a fully functional Data Hub 2.3 trial instance at Google Cloud Platform – Ideal for customers and partners to start a rapid prototype and explore all functionalities.
- Using Data Hub Developer Edition to build your pipeline – Ideal for developers to build operators and pipelines in a community environment.
- Registering for SAP Data Hub Webinars – Ideal for everyone to learn the product by attending webinars live or watching the recordings.
- Taking the OpenSAP course – Ideal for everyone to learn the product for free, with hands-on exercises guided by instructors, on their own schedule.
- Subscribing to The Stay Current Program – Ideal for everyone to learn the product on their own schedule, although a subscription is needed.
Starting with SAP Data Hub Channel is probably the easiest of all these options. Then, depending on your requirements, use one option or a combination of them to learn SAP Data Hub.
1. SAP Data Hub Channel
The key functionalities of SAP Data Hub are data governance, data pipelines, and data orchestration. We published videos at SAP Data Hub Channel that show these capabilities of SAP Data Hub 2.3, along with end-to-end customer stories. The three areas are demonstrated in a series of six videos, which are available here.
SAP Data Hub helps manage your data across different systems by using the Metadata Explorer. The Metadata Explorer gathers information about the location, attributes, quality, and sensitivity of data. With this information, you can make informed decisions about which datasets to publish and determine who has access to use or view information about the datasets. Data governance starts with this Metadata Explorer.
In general, Metadata Explorer is used to:
- preview data in the datasets
- create indexes about the dataset contents to aid in searching for datasets
- profile data to view information about the contents of different datasets
- publish datasets to allow others to view and search the data
- label the datasets with keywords, which also helps in searching for datasets
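To make the idea of profiling more concrete, here is a minimal sketch in plain Python. It is not how the Metadata Explorer is implemented; the function name `profile`, the sample records, and the chosen statistics are all assumptions made for illustration. It only shows the kind of per-column summary that a profiling step typically produces.

```python
from collections import Counter

def profile(rows):
    """Return simple per-column statistics for a list of dict records.

    Hypothetical helper for illustration only; not an SAP Data Hub API.
    """
    columns = sorted({key for row in rows for key in row})
    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        # Treat None and empty strings as missing values.
        present = [v for v in values if v not in (None, "")]
        report[col] = {
            "count": len(values),
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
            "most_common": Counter(present).most_common(1)[0][0] if present else None,
        }
    return report

# Made-up sample dataset.
sample = [
    {"id": 1, "country": "DE", "email": "a@example.com"},
    {"id": 2, "country": "DE", "email": ""},
    {"id": 3, "country": "US", "email": None},
]

result = profile(sample)
for column, stats in result.items():
    print(column, stats)
```

A summary like this is what lets you judge data quality (for example, two of the three `email` values are missing) before deciding whether to publish a dataset.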
The following demo highlights the capabilities of the Metadata Explorer as part of the SAP Data Hub 2.3 release. It illustrates how the Metadata Explorer enables interactive browsing through connected systems, exploring the contained data, creating a catalog and searching through the data. It also shows how this instance of Data Hub is deployed via the Google Kubernetes Engine.
The SAP Data Hub Modeler tool is based on the SAP Pipeline Engine that uses a flow-based programming paradigm to create data processing pipelines (graphs).
Big Data applications require advanced data ingestion and transformation capabilities.
Some common use cases are to:
- Ingest data from source systems. For example, database systems like SAP HANA, message queues like Apache Kafka, or data storage systems like HDFS or S3.
- Cleanse the data.
- Transform the data to a desired target schema.
- Store the data in target systems for consumption, archiving, or analysis.
Users can model data processing pipelines as a computation graph, which can help to achieve the required data ingestion and transformation capabilities. In this graph, nodes represent operations on the data, while edges represent the data flow.
The SAP Data Hub Modeler tool helps users to graphically model and execute a graph. The tool also provides a runtime component to execute graphs in a containerized environment that runs on Kubernetes.
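The graph model described above can be illustrated with a small sketch in plain Python. This is not the SAP Pipeline Engine or its API; the `Graph` class, its method names, and the three example operators are hypothetical and only show how nodes (operations) and edges (data flow) fit together in flow-based programming.

```python
from collections import defaultdict

class Graph:
    """Toy flow-based graph: operators are nodes, connections are edges."""

    def __init__(self):
        self.operators = {}             # node name -> function
        self.edges = defaultdict(list)  # node name -> downstream node names

    def add_operator(self, name, func):
        self.operators[name] = func

    def connect(self, source, target):
        self.edges[source].append(target)

    def send(self, name, message):
        """Deliver one message to an operator and forward its output downstream."""
        output = self.operators[name](message)
        if output is None:  # the operator dropped the message or is a sink
            return
        for target in self.edges[name]:
            self.send(target, output)

stored = []  # stands in for a target system

def cleanse(record):
    text = record["text"].strip()
    return {"text": text} if text else None  # drop empty records

def transform(record):
    # Map the record to an assumed target schema.
    return {"comment": record["text"], "length": len(record["text"])}

def store(record):
    stored.append(record)

graph = Graph()
graph.add_operator("cleanse", cleanse)
graph.add_operator("transform", transform)
graph.add_operator("store", store)
graph.connect("cleanse", "transform")
graph.connect("transform", "store")

# "Ingest" two raw records; the second one is empty after cleansing.
for raw in [{"text": "  great product "}, {"text": "   "}]:
    graph.send("cleanse", raw)

print(stored)
```

In the real Modeler, operators run in containers and exchange messages through ports, but the division of work is the same: each node does one thing, and the edges decide where its output goes next.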
The capability to develop multiple data pipelines with SAP Data Hub Modeler is demonstrated in the following four videos, after a business scenario is presented.
In the following video a Python 2 operator performs sentiment analysis.
The following video shows how sentiment analysis data can be persisted in SAP Vora with a pipeline.
After the pipelines are created, a workflow is created to orchestrate them, and the Vora Tool is used to visualize the processed data here.
Data Workflows are used to orchestrate multiple tasks and execute them in a given order. These tasks can be external or internal. External tasks include:
- Trigger execution of a process chain on a BW system
- Transfer data from a BW system into Vora tables (created on the fly)
- Execute remote data services jobs
- Trigger execution of a HANA flowgraph using SDI REST API (XSC)
- Submit Spark jobs, Hive queries, etc. to Hadoop cluster
Internal tasks include:
- Start a pipeline on a local or remote SAP Data Hub Pipeline engine
- Wait for the pipeline to complete (or, if so configured, continue immediately)
- Run relational transformations (join, union, filter, etc.) on structured data (tables, CSV, Parquet, etc.)
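The core idea of a workflow, running tasks in a fixed order and stopping when one fails, can be sketched in a few lines of plain Python. This is an illustration only, not the Data Workflows implementation: `run_workflow` and the three task functions are hypothetical stand-ins for the external and internal tasks listed above, and they do not call any real SAP system.

```python
def run_workflow(tasks):
    """Run (name, callable) tasks in order and stop at the first failure."""
    log = []
    for name, task in tasks:
        try:
            task()
        except Exception as error:
            log.append((name, "failed: " + str(error)))
            break  # later tasks are not started
        log.append((name, "completed"))
    return log

# Hypothetical stand-ins for real tasks.
def trigger_process_chain():
    pass  # e.g. would trigger a process chain on a BW system

def start_pipeline():
    raise RuntimeError("pipeline engine not reachable")

def run_transformation():
    pass  # e.g. would join or filter structured data

log = run_workflow([
    ("trigger_process_chain", trigger_process_chain),
    ("start_pipeline", start_pipeline),
    ("run_transformation", run_transformation),
])

for entry in log:
    print(entry)
```

Because the second task fails, the third one never starts. That ordering guarantee is the main reason to use a workflow instead of starting each task by hand.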
The following video demonstrates both the external and internal data orchestration capabilities of SAP Data Hub 2.3.
After you understand data governance, data pipelines, and data orchestration from the six videos above, you can continue learning about SAP Data Hub 2.3 capabilities by watching another series of videos, which were created from exercises offered at the 2018 SAP TechEd hands-on sessions. They are available here.
In this series, the last video shows how to troubleshoot a few graphs in SAP Data Hub 2.3.
2. SAP Data Hub Hands-on Sessions at SAP TechEd
The hands-on sessions offered at TechEd Las Vegas are also being offered in Barcelona and Bangalore. Please check out this blog for additional information. You can also meet SAP Data Hub experts at these hands-on sessions.
3. Arranging a SAP Data Hub Technical Academy session by contacting your account team
SAP offers a one-day SAP Data Hub Technical Academy workshop at various SAP locations, inviting local customers and partners. SAP often offers similar sessions at customer locations. Please contact your account team to arrange such a session, where your team can learn SAP Data Hub 2.3 from SAP experts.
4. Starting a fully functional Data Hub 2.3 Trial Instance at Google Cloud Platform
If you are a customer or a partner, you can launch a fully functional SAP Data Hub 2.3 trial instance at Google Cloud Platform, where you pay Google for the infrastructure but get access to the SAP Data Hub software for free. SAP Cloud Appliance Library offers SAP Data Hub 2.3 at Google Cloud Platform. You may download the Getting Started document.
Please follow the step-by-step process documented in this tutorial to set up your trial instance at Google Cloud Platform. Once your instance is up and running, you may continue with the following tutorials.
- Explore Navigation
- Learn Discovery Part1 & Part2
- Orchestrate Workflow Part1, Part2, Part3 and Part4
- Build Pipeline Part1, Part2, Part3, Part4 and Part5
5. Using Data Hub Developer Edition to build your pipeline
SAP Data Hub 2.3 Developer Edition can be downloaded to your laptop to develop data pipelines. Please check out this blog on how to start your SAP Data Hub Developer Edition journey.
6. Registering for SAP Data Hub Webinars
Please check out this blog for additional information on the upcoming webinars available to you and how to register.
7. Taking the OpenSAP course on SAP Data Hub
Please check out the OpenSAP course to learn SAP Data Hub for free.
8. Subscribing to The Stay Current Program
The Stay Current program "SAP Data Hub 2.3 for Technical Consultants – Stay current" has been available to customers and partners on the SAP Learning Hub since November 30, 2018. Please note that a subscription is required to access content from this program. You can also find a teaser in SAP Learning Hub: DISC_EKT_DATAHUB_SC_EN_23.
Thank you for reading this far, and happy learning!
Hello Swapan,
Thank you for the blog post. It's exciting to see that the trial version is available for hands-on experience. I am facing some challenges while creating an instance in the Google cloud. The screen I see is different from the one shown in the guide.
Where can I get support in this regard?
Thanks
Hi Asis - The videos shared here are based on our 2018 TechEd systems, which come with connected systems out of the box. If you launch a Data Hub at GCP using our trial system, it will not have any connected systems out of the box. Please contact me directly about how you can access the TechEd systems within the SAP network.
Thanks,
Swapan
OpenSAP content is now available, Asis Pattnaik.