Exploring the New Capabilities of SAP Datasphere for Unleashing the Power of Data
Context and Introduction
In today’s fast-paced business world, making sense of data can be a daunting task. But don’t worry, SAP has got your back with its latest innovation: SAP Datasphere. This technology is designed to make your life easier by seamlessly integrating, cataloging, modeling, and virtualizing data from various sources, including SAP and non-SAP systems.
In this article, we’ll explore three of its capabilities:
- Application and Data Integration
- Semantic Modeling
- Data Cataloging and Quality
… and explain how they can help businesses unleash the power of data.
1. Application and Data Integration
Integrating data can be a challenge, especially when it comes to bringing together SAP and non-SAP data. However, with SAP Datasphere, you can seamlessly integrate data from both sources.
SAP Datasphere can help you to reduce integration complexity and get an integrated view of information, regardless of where it’s stored or how it was designed. This means you can spend less time worrying about the technical aspects of integration and more time using your data to make informed decisions for your business.
So why not get started on our exciting journey by creating a Replication Flow today? The first step is to establish a connection to your source system. Once you’ve done that, you’ll be well on your way to simplifying your data replication process and saving time and effort.
1.1. Gathering SAP HANA Cloud Information and Initial Setup
In our example, we’ll work with an SAP HANA Cloud instance (HCD). And the best part? Getting everything you need is super easy! All you need to do is use “Copy SQL Endpoint”. This nifty feature gives you the host and port information required to connect to SAP HANA Cloud.
The next piece of the puzzle is your user and password. Simply add these details to your connection settings in SAP Datasphere, and you’re good to go!
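If you want to split the copied endpoint into the host and port fields of the connection dialog, here is a minimal sketch. It assumes the copied value has the form host:port; the function name `parse_sql_endpoint` and the endpoint string are placeholders made up for illustration, not real values.

```python
def parse_sql_endpoint(endpoint: str) -> tuple[str, int]:
    """Split a copied 'host:port' SQL endpoint into its two parts."""
    # rpartition splits on the LAST colon, so the port is always the tail.
    host, sep, port = endpoint.strip().rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"expected 'host:port', got {endpoint!r}")
    return host, int(port)


# Placeholder endpoint (assumption), not a real instance.
host, port = parse_sql_endpoint("my-instance-id.hana.example.com:443")
print(host)  # my-instance-id.hana.example.com
print(port)  # 443
```

The two values map to the host and port fields of the connection; user and password are entered separately.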
To start, click on “User Management” and create your user. If you already have one, you can use it, just like in my case. For this article, you only need access to your own schema, but you can manage the privileges as you desire.
Now, let’s look at what you can use in SAP HANA Cloud:
Our scope comprises three tables:
- ACTUAL_DEMAND: This table contains all historical demand units. This is the actual data I have.
- DEMAND_CHANNEL: This table contains the descriptions of all demand channels.
- DEMAND_PREDICTIONS: This table provides predictions for a horizon of 15 days. Note that it is used only in the replication part of this blog. It could feed further analytics developments, but the idea here is to show that the replication capabilities we’ll experiment with can support a long-term modeling strategy.
1.2. Creating SAP HANA Cloud Connection in SAP Datasphere
Before we proceed, it’s important to note that, at the time of writing, replication flows can copy only certain source objects. Please keep the following list in mind as you begin using this feature:
- CDS views (in ABAP-based SAP systems) that are enabled for extraction.
- Tables that have a unique key (primary key).
- Objects from ODP providers, such as extractors or SAP BW artifacts.
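To make the primary key requirement concrete, here is a small sketch that checks whether a table has a primary key. It uses Python’s built-in sqlite3 module purely as a local stand-in (it does not connect to SAP HANA Cloud), and the column names and the STAGING_NOTES table are invented for the example.

```python
import sqlite3


def has_primary_key(conn: sqlite3.Connection, table: str) -> bool:
    """Return True if at least one column of `table` belongs to a primary key."""
    # PRAGMA table_info returns one row per column; index 5 is the "pk" flag
    # (0 means the column is not part of the primary key).
    rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    return any(row[5] > 0 for row in rows)


conn = sqlite3.connect(":memory:")
# Hypothetical table layouts, only for this illustration.
conn.execute("CREATE TABLE DEMAND_CHANNEL (CHANNEL_ID INTEGER PRIMARY KEY, DESCRIPTION TEXT)")
conn.execute("CREATE TABLE STAGING_NOTES (NOTE TEXT)")

print(has_primary_key(conn, "DEMAND_CHANNEL"))  # True
print(has_primary_key(conn, "STAGING_NOTES"))   # False
```

In this toy setup, only the first table would qualify as a replication source under the rule above.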
Now, let’s move into SAP Datasphere and create a connection to SAP HANA Cloud. Simply use the information you gathered earlier (host, port, user, and password), and that’s it. Pretty simple, huh?
1.3. Creating a Replication Flow in SAP Datasphere
The next step is to create a Replication Flow for copying the data from those 3 tables into SAP Datasphere. To do this, go to the “Data Builder.”
Notice that I don’t have any artifacts in my environment yet! We’ll see new ones in there very soon. But for now, you may click on “Replication Flow” and the following screen will show up:
Note: If you cannot see the “Replication Flow,” check if you have the “SAP Datasphere Integrator” role.
To choose the SAP HANA Cloud connection, click on “Select Source Connection.”
When you select it, you’ll need to choose the Source Container.
Select your own schema “DEVELOPER,” and finally, by clicking on “Add Source Objects,” pick the tables you want.
In the next screen, select all the objects and “Add Selection.”
Next, configure the target (“Select Target Connection”). Also, note that you should choose “SAP Datasphere” as your target when you want to replicate data from SAP HANA Cloud into it.
We did it!
Let’s change the name of the replicated tables in SAP Datasphere.
You may choose the “Settings” to set the replication behavior for your tables:
The load type can be:
- Initial Only: Load all selected data once.
- Initial and Delta: After the initial load, the system checks for source data changes (delta) once every 60 minutes and copies the changes to the target.
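To get a feel for the difference between the two load types, here is a toy model in plain Python. It is only a conceptual sketch: the dictionaries stand in for source and target tables, the keys and values are made up, and the comparison-based delta is an assumption for illustration, not a description of how SAP Datasphere detects changes internally.

```python
def initial_load(source: dict, target: dict) -> None:
    """Copy every source row to the target once."""
    target.clear()
    target.update(source)


def apply_delta(source: dict, target: dict) -> int:
    """Bring the target in line with the source; return the number of changes."""
    changed = {k: v for k, v in source.items() if k not in target or target[k] != v}
    deleted = [k for k in target if k not in source]
    target.update(changed)
    for k in deleted:
        del target[k]
    return len(changed) + len(deleted)


source = {1: 120, 2: 80}  # hypothetical rows: key -> demand units
target = {}

initial_load(source, target)  # "Initial Only" stops here

source[2] = 95   # a row changes in the source
source[3] = 40   # a new row arrives
print(apply_delta(source, target))  # 2
print(target == source)             # True
```

With “Initial and Delta”, a step like `apply_delta` would repeat on the schedule described above, whereas “Initial Only” never picks up the later changes.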
First things first, let’s give a good name for this replication flow – let’s call it “Demand Data Replica”. Although I’ll be using “Initial and Delta” in this article, you can choose any option you prefer.
Once you select the row related to any table, the replication properties will show you a very good set of information about it. It’s important to note that the functionality is still pretty new, so it’s worth taking a closer look at it.
In my case, I noticed that it was also showing the fields I didn’t want to replicate (the weekday_*).
To avoid replicating these fields, we can use the “Add Projection” option. Just remember that it won’t show up if you don’t click on any row.
You can create filters based on the available fields and/or map the fields you want (which is what I’ll do in this case).
As you can see, it’s easy to deselect the fields you don’t want. Once you’ve made your selections, don’t forget to give your projection a name. In my case, I named it “No Weekday Features”.
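Conceptually, this projection just keeps a subset of the source columns. A minimal sketch of that idea follows; the column names are assumptions made up for the example (the article only tells us that the unwanted fields match weekday_*).

```python
def project_columns(columns: list[str], excluded_prefix: str = "weekday_") -> list[str]:
    """Keep every column whose name does not start with the excluded prefix."""
    prefix = excluded_prefix.lower()
    return [c for c in columns if not c.lower().startswith(prefix)]


# Hypothetical column list for ACTUAL_DEMAND.
source_columns = ["DEMAND_DATE", "CHANNEL", "UNITS", "WEEKDAY_MON", "WEEKDAY_TUE"]
print(project_columns(source_columns))  # ['DEMAND_DATE', 'CHANNEL', 'UNITS']
```

The comparison is case-insensitive so that WEEKDAY_MON and weekday_mon are treated the same way.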
Now that we’ve completed our replication flow, let’s save it and deploy it.
Let’s see what happened in our Data Builder home page.
When we created our replication flow, all the tables (physical ones) were automatically created in SAP Datasphere. So now you may be thinking, “Well, let’s see the replicated data!!”. Ok, let’s do this!
1.4. Running a Replication Flow in SAP Datasphere
In “Tools”, you can then select the monitor.
Looks like we have a lot of things in here, but don’t worry, it’ll only take a few seconds to figure it out. Once the loading is complete, the message will show up like this:
Now, let’s take a look at the tables and see if we can spot some data.
Great news, we got it!
You should be proud of yourself; you got all the source tables you need to do some Analytics stuff! But here comes the question: is there an easy way to make your data more accessible and insightful for consumption in SAP Analytics Cloud?
2. Semantic Modeling: Analytic Models
The answer is Analytic Models!!!
Analytic models in SAP Datasphere are the perfect solution for that! They are the foundation for making data ready for consumption in SAP Analytics Cloud. These models provide a multi-dimensional framework that enables you to answer different business questions and create meaningful insights.
To create an analytic model, you need to use an Analytical Dataset as a source. These datasets contain dimensions, texts, and hierarchies, and form the basis for building models. Measures based on fields with the semantic type “Amount with Currency” or “Quantity with Unit” can be added to the model and automatically include the currency or unit information.
To create an analytical dataset, we will use a Graphical View based on the tables we replicated. It’s important to note the differences between an analytic model and an analytical dataset, which you can find in our “Comparison of Analytic Model and Analytical Dataset” help page.
If you understand analytic models and analytical datasets, you can build a strong foundation for data analysis and make informed decisions.
Well, let’s create a graphical view then.
2.1. Creating a Graphical View
Let’s drag and drop the table with actual demand data to Add a Source.
Here you will be able to customize your view. You can rename columns, add semantics, and labels as needed. Choose the options that work best for your use case.
As it is analytical, at least one measure must be defined.
And for our purposes, using the “Change to Measure” option is enough.
To create an association between two data entities, go to the Associations section of the side panel in your table or view and click on the Create button. This will allow you to create a relationship between the current table or view and another table or view in your data layer.
You can create an association between any two tables or views, regardless of their level in the data layer. This means that you can define relationships between analytical datasets and dimensions, among other consumable views.
Let’s create an association together by selecting “Channel” as our association!
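One way to picture what this association gives you: each demand row can look up the description of its channel. The sketch below mimics that with a plain dictionary lookup. The channel codes, descriptions, and field names are invented for illustration and are not taken from the actual tables.

```python
# Stand-in for DEMAND_CHANNEL: channel code -> description (hypothetical values).
channel_texts = {"ONL": "Online store", "RTL": "Retail"}

# Stand-in for ACTUAL_DEMAND rows (hypothetical values).
actual_demand = [
    {"channel": "ONL", "units": 120},
    {"channel": "RTL", "units": 80},
    {"channel": "XXX", "units": 5},  # no matching channel description
]


def with_channel_text(rows: list[dict], texts: dict[str, str]) -> list[dict]:
    """Attach the channel description to each row; None if there is no match."""
    return [{**row, "channel_text": texts.get(row["channel"])} for row in rows]


for row in with_channel_text(actual_demand, channel_texts):
    print(row["channel"], row["units"], row["channel_text"])
```

Rows without a matching channel are kept and simply get no description, which is the behaviour this sketch assumes for unmatched keys.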
To make things more interactive for the user when running a view, we can add an input parameter that prompts them to enter a value. This value can then be used to filter or perform other calculations.
If the view is used in an SAP Analytics Cloud story, the user will be asked to enter a value for the input parameter in the “Set Variables” dialog.
Just for fun, let’s create an input parameter together!
Creating an input parameter is great, but if you don’t use it within your model, it won’t be very helpful. So, we need to make sure to utilize the input parameter somewhere in the graphical view. One way to do this is to create a filter that uses the input parameter as a value.
This will allow you to apply the filter based on the user’s input and make your view more dynamic and interactive. So, let’s get started and create a filter that makes use of the input parameter!
Channel should be equal to the Input Parameter in this case.
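In plain terms, the filter keeps only the rows whose channel matches the value the user enters. A minimal sketch of that behaviour, with made-up rows and a hypothetical parameter value:

```python
def filter_by_channel(rows: list[dict], channel_param: str) -> list[dict]:
    """Keep only the rows whose channel equals the input parameter value."""
    return [row for row in rows if row["channel"] == channel_param]


# Hypothetical demand rows.
rows = [
    {"channel": "ONL", "units": 120},
    {"channel": "RTL", "units": 80},
    {"channel": "ONL", "units": 30},
]

# "ONL" plays the role of the value typed into the prompt.
print(filter_by_channel(rows, "ONL"))
# [{'channel': 'ONL', 'units': 120}, {'channel': 'ONL', 'units': 30}]
```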
It seems like we’ve made all the necessary changes to our model, and it’s looking good so far. Now, we can go ahead and save and deploy the graphical view.
Before we do that, it’s always a good idea to double-check everything to ensure that we haven’t missed anything important. Once we’re confident that everything is in order, we can proceed with saving the model and deploying it.
Great job so far!
2.2. Checking Impact and Lineage for Graphical View
Additionally, let me call your attention to “Impact and Lineage Analysis”.
Have you heard about Impact and Lineage Analysis? It’s a really useful diagram that can help you understand the lineage (or data provenance) of a specific object, as well as its impacts – the objects that depend on it and that will be affected by any changes made to it.
When you’re in Dependency Analysis mode, you’ll also see objects that are connected to the analyzed object by associations and data access controls.
This feature can be really helpful in understanding the relationships between different objects and can be a valuable tool in ensuring the accuracy and consistency of your data. So, give it a try and see the benefits for yourself!
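If it helps to think about it as a graph, lineage walks upstream from an object and impact walks downstream. Here is a small sketch of both traversals. The graph is a simplified assumption based on the objects in this article; “Demand View” is a hypothetical name for the graphical view.

```python
from collections import deque

# Edges point from an object to the objects built directly on top of it.
DEPENDENTS = {
    "ACTUAL_DEMAND": ["Demand View"],
    "DEMAND_CHANNEL": ["Demand View"],
    "Demand View": ["Demand Model"],
    "Demand Model": [],
}


def reachable(graph: dict[str, list[str]], start: str) -> set[str]:
    """Breadth-first search: every node reachable from `start`, excluding it."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def impact(obj: str) -> set[str]:
    """Objects that depend on `obj` and are affected by changes to it."""
    return reachable(DEPENDENTS, obj)


def lineage(obj: str) -> set[str]:
    """Objects that `obj` is derived from (the reversed graph)."""
    sources: dict[str, list[str]] = {}
    for src, targets in DEPENDENTS.items():
        for target in targets:
            sources.setdefault(target, []).append(src)
    return reachable(sources, obj)


print(sorted(impact("ACTUAL_DEMAND")))  # ['Demand Model', 'Demand View']
print(sorted(lineage("Demand Model")))  # ['ACTUAL_DEMAND', 'DEMAND_CHANNEL', 'Demand View']
```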
2.3. Creating Analytic Model in SAP Datasphere
Now that we have created our analytical dataset, the next step is to create an Analytic model. Think of Analytic models as the foundation for making data ready for consumption in SAP Analytics Cloud. They enable you to create and define multi-dimensional models that provide data for analytical purposes, helping to answer different business questions.
Just drag and drop the view we just created. Note that when we set our view as an analytical dataset, an option to create the Analytic Model directly appeared. Here we are simply showing another way to create it.
On this screen, we can choose all the measures and attributes we want to include in our Analytic model, just as we can for associations. Remember, we had created an input parameter earlier, so we will have to map it later within our Analytic model.
Now that we have selected our measures and attributes, let’s continue building our Analytic model. And that’s what we’ve got – a powerful tool that will help us to make data-driven decisions and drive business success. You may save it, deploy it and preview it.
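As a rough mental model of what a multi-dimensional preview does, here is a sketch that totals one measure by one dimension. The rows and field names are made up, and this is only an analogy in plain Python, not how the Analytic Model is evaluated.

```python
from collections import defaultdict


def total_by(rows: list[dict], dimension: str, measure: str) -> dict:
    """Sum `measure` for each distinct value of `dimension`."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row[measure]
    return dict(totals)


# Hypothetical demand rows.
rows = [
    {"channel": "ONL", "units": 120},
    {"channel": "RTL", "units": 80},
    {"channel": "ONL", "units": 30},
]
print(total_by(rows, "channel", "units"))  # {'ONL': 150.0, 'RTL': 80.0}
```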
Now it’s up to you (be creative):
3. Data Cataloging and Quality
We’ve made amazing progress in the field of data analytics, and as a result, we can now accomplish a variety of powerful analytics tasks. But with SAP Datasphere, we can take things to the next level by leveraging its enterprise-wide data catalog to ensure proper governance over multiple systems.
This is particularly important when it comes to creating Analytic Models, as the catalog helps us manage and maintain metadata for the data assets used to build these models. By classifying, labeling, and documenting our data, we can ensure that our insights are accurate and trustworthy. This means that utilizing the SAP Datasphere catalog is not just a smart choice – it’s a necessary one. By providing important context for the assets we create, the catalog enables us to take full advantage of the insights that data analytics can provide.
So why wait? Let’s dive in and start exploring the full potential of SAP Datasphere Catalog!
The Catalog is a powerful tool that provides an effective data governance strategy by bringing together an organized inventory of business metadata and data assets. This enables both business and technical users to unlock the full potential of their enterprise data. As a central location to discover, classify, understand, and prepare all the data in your enterprise, the catalog provides a comprehensive solution to manage your data assets.
Using the catalog, you can find the single source of truth for your data domain and build reusable business models. With the ability to discover which stories are impacted by your changes, the catalog helps you to streamline your workflow and ensure data accuracy.
When you’re searching for data assets within the catalog, you’ll typically start by browsing the catalog home page. From there, you can open an individual asset to see more detailed information about it. The details page for a catalog asset is a rich source of information, presenting a range of metadata extracted from the underlying source system, as well as additional information added by catalog administrators. This may include descriptions, glossary terms, tags, key performance indicators (KPIs), and more.
Note that you’ll need the “Catalog Administrator” and “Catalog User” roles to perform the next steps, so make sure you have them beforehand. With both roles assigned, you should see something like this:
3.1. Synchronizing and Setting up Systems for the Catalog
The catalog automatically extracts new or updated metadata from your system and keeps it in sync. You only need to resynchronize manually under certain conditions:
- After adding a new system that has existing assets.
- After deleting and later restoring a system connection to capture object changes that occurred while the connection was deleted.
- If the error logs show missing metadata that failed to extract automatically.
When you synchronize “SAP Data Warehouse Cloud”, it may take some time to complete. You can continue working on other tasks while the synchronization is in progress, and you will be notified when it is finished. If you want to monitor a system, you will need to create a connection to it first (for the DWC source, it should be created automatically, but if it isn’t, go ahead and create it yourself).
And now you should be able to see all available assets within the system:
3.2. Meeting the Asset Overview Page
In our case, we’re interested in “Demand*” assets. Thus, we can just filter by “demand”. Simple, huh?
Choosing the “Demand Model” (the Analytic Model that we created before), you’ll see the overview page. It gives you a detailed overview of an asset and its glossary terms, tags, and KPIs. The tab has three sections: Overview; Asset Overview Details; and Asset Terms, Tags, and KPIs.
The Properties section shows read-only asset property information extracted from the data source, including the asset name, creation and change dates, source business name, semantic usage, and whether it’s exposed for consumption. The Descriptions section shows both the source description and the catalog description of the asset. The Asset Terms, Tags, and KPIs section displays the glossary terms, tags, and KPIs linked to the asset; only catalog administrators can create these links.
3.3. Creating Tag and Tag Hierarchies
To better organize your assets in the catalog, you can create tags and tag hierarchies. To do this, you need to have the Catalog Administrator role.
Once you have created the tag hierarchies, you can create and add tags to your assets, which will make it easier to search for them. However, if a hierarchy has the Multiselect option disabled, you can only add one tag from that hierarchy to an asset.
Select the check box, then click on “Save”, and that’s it.
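The Multiselect rule is easy to express as a small check. The sketch below is a hypothetical model of it: the function name, hierarchy names, and tags are all invented, and it is not the catalog’s own API.

```python
def can_add_tag(assigned: dict[str, set[str]], hierarchy: str, tag: str, multiselect: bool) -> bool:
    """Return True if `tag` from `hierarchy` may be added to an asset.

    `assigned` maps each hierarchy to the tags the asset already has from it.
    With Multiselect disabled, a hierarchy may contribute at most one tag.
    """
    other_tags = assigned.get(hierarchy, set()) - {tag}
    return multiselect or not other_tags


# Hypothetical asset that already carries one tag from "Business Area".
assigned = {"Business Area": {"Sales"}}

print(can_add_tag(assigned, "Business Area", "Finance", multiselect=False))  # False
print(can_add_tag(assigned, "Business Area", "Finance", multiselect=True))   # True
print(can_add_tag(assigned, "Region", "EMEA", multiselect=False))            # True
```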
3.4. Analyzing Lineage and Impact of your Assets
One particularly useful feature of the Catalog is the lineage diagram, which shows you how the asset is related to other assets within the catalog. By reviewing the lineage diagram, you can gain insight into how the asset fits into your larger data ecosystem, and which assets are impacted by changes to this asset.
You can also check the impact chain by drilling down into the assets that the Analytic Model depends on.
3.5. Creating Glossary and Terms
We still have many capabilities to see, but first, we will create a business glossary, which provides a central and shared repository for defining terms and describing how and where they are used in the business.
Now that we have a glossary, let’s create a term. Terms are contained in a business glossary and provide meaning to your data.
You’ll need to choose the glossary and, if you want, you may also create a Template to speed up your next developments in the Catalog.
It’s a straightforward tip! However, it’s essential to provide detailed information about your assets to help others find what they’re looking for quickly and easily. By doing so, you can optimize the search process and enhance user experience. So, remember it when you are using the Catalog.
The next step is to assign this term to the Asset by clicking on “Relationships” and “Manage Relationships”:
Select your asset and then click on “Add to Related Objects”.
3.6. Creating KPIs
Good! Let’s create a KPI. KPIs help the business measure progress towards a result such as a goal or objective. They allow for performance tracking and provide an analytical basis for decision-making as well.
To create a KPI that’s actually helpful, start by setting a goal that can be measured quantitatively or qualitatively. The interface is easy to understand, but please keep in mind that relating a KPI to other assets does not perform any calculations on those assets.
Here’s an example: let’s say you want to improve the accuracy of your demand forecasting. You can create a KPI for this and link it to all the source assets that contribute to providing this kind of information.
When defining the KPI, you can specify the calculation as “some formula”, and then create detailed documentation that includes the steps for finding the source systems and creating a workflow to gather the data needed for tracking each month. It’s important to note that you’ll need to perform these steps separately from the KPI, but the KPI will provide all the information you need to track your progress towards your goal.
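Since the calculation happens outside the catalog, here is one possible way to compute such a value. The formula is an assumption chosen for illustration (one minus the mean absolute percentage error, a common accuracy measure); the article does not prescribe any specific formula, and the numbers are made up.

```python
def forecast_accuracy(actual: list[float], predicted: list[float]) -> float:
    """Return 1 - MAPE, skipping periods where actual demand is zero."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("need at least one period with non-zero actual demand")
    mape = sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)
    return 1.0 - mape


# Hypothetical actuals vs. predictions for two periods.
print(round(forecast_accuracy([100, 200], [90, 220]), 3))  # 0.9
```

In the setup described here, the actuals would come from ACTUAL_DEMAND and the predictions from DEMAND_PREDICTIONS, and the result would be documented in the KPI rather than calculated by it.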
By following these steps, you’ll be able to create a KPI that’s truly helpful and effective in helping you work towards your objectives.
3.7. Determining Visibility for Catalog Artifacts
Finally, let’s talk about governance: as a catalog administrator, it’s your responsibility to determine which content is discoverable and visible to users who are searching in the catalog. Once you’ve enriched asset metadata, applied appropriate classifications, and ensured the overall quality of the asset, it’s time to publish it!
When you publish an asset, you’re making it available for everyone to discover. This includes associated terms and KPIs that help users understand the asset and its performance.
By publishing your assets, you’re opening the door (or “eyes”) for others to benefit from your hard work and insights. So go ahead and hit that publish button once you’ve ensured that your asset is of high quality and provides value to your organization!
Thanks for taking the time to read this article on SAP Datasphere. I am excited to have explored three of the innovative features that this technology offers – Application and Data Integration, Semantic Modeling, and Data Cataloging and Quality.
I believe that these features can be developed and used in a business context to unlock the true power of data. I also want to note that as SAP continues to innovate and improve their products, some of the functionalities we discussed may change or look different in the future.
To conclude, SAP Datasphere is an exciting innovation that can help businesses make sense of their data. With its ability to integrate, catalog, model, and virtualize data from various sources, SAP Datasphere promises to be a game-changer.
Hopefully, this article has given you a better understanding of these key features and how they can help businesses unleash the full potential of their data.
Thanks for reading!
Excellent!! Thanks for sharing the new features and capabilities. Waiting for the next ones.
Hi Carlos Basto
Thank you for sharing. Could you please tell me what is the technology behind the Replication Flows? Is it the DP-Server/DP-Agent or the DI/Data Flow Adapter or something completely new? Is this documentation still applicable or is there something new?
Hi Sebastian Gesiarz
Thank you for your question.
I would like to provide some additional information regarding the Replication Flows technology you mentioned. This cloud-based data replication tool is designed to simplify data integration processes by eliminating the need for additional on-premise components. This means that it does not rely on DP-Server/DP-Agent technology but instead uses the Data Intelligence Embedded environment and Data Intelligence Connectors to connect to remote sources.
The documentation you provided is up-to-date and applicable for Remote Tables that use SAP HANA Federation based on Smart Data Integration (SDI) or Fabric Virtual Tables. It is important to note that these adapters will continue to be available in SAP Datasphere, and there are currently no plans to remove them. However, it is always recommended to stay up-to-date with the latest SAP Datasphere releases and updates to ensure that you have access to the latest features and functionality.
I hope this information is helpful. Let me know if you have any further questions.
What a wonderful end-to-end blog. Thanks a lot, Carlos 😊
Is SAP planning to provide Datasphere in the BTP Trial rather than the Free Tier? The Free Tier is not available in most countries, and it's getting hard to learn.
Can you also share some blogs showing how I can use external APIs like Northwind and persist their data in Datasphere?
Hi Ahmed Ali Khan
Regarding your question, you can actually get the free trial of SAP Datasphere by visiting this link: SAP Datasphere trial.
In terms of integrating external APIs with SAP Datasphere, I'm not aware of any specific blog posts on that topic. However, as far as I know, SAP Datasphere does support:
If any colleagues have additional insights or resources on this topic, I encourage them to share their comments as well.
I hope that helps!
No DataSphere entitlements in Trial accounts and no such service available in Trial account.
Nice blog Carlos👍 and thanks for sharing the detailed information about SAP datasphere.
SAP Datasphere screens and options look similar to those of SAP DWC (Data Warehouse Cloud). Do you know what the difference is between these two offerings?
Hi Avinash Neeli,
SAP Datasphere is in fact the successor to DWC.