To get from facts to ideas and decisions, start with the data. The figure above shows the three-step process, in which a user connects to data, prepares or cleanses it, and finally explores it, analyzes it, or uses machine learning to create predictions. This process needs to be as frictionless as possible. To be fast and efficient, "users need maximum agility to go back and forth freely between these stages, particularly in a self-service scenario". (Source: SAP)
SAP has some limits in terms of "how flexible and how quick users can bring in data into" SAP Analytics Cloud and how quickly they can go from data to analysis. SAP has a "strong structure and model first approach" based on the financial planning background of the solution, which makes some of the BI and data prep workflows more rigid than necessary. SAP knows "that agility and flexibility are key ingredients for a self-service business user scenario" and is therefore working on those topics with high priority. (Source: SAP)
To prepare with agility, users want to be able to combine data from multiple sources, no matter whether it is an SAP source or a third-party one, whether it sits in the cloud or on-premise, and whether it is acquired or remote. A business user does not want to have to understand the underlying details of where the data comes from and what type of source it is. They also want to be able to discover, prepare and share that data in a self-service fashion with minimal or no involvement from IT. IT may have set up the connection, especially to some of the central or governed systems, but the business user then wants to work with that system freely, based on their privileges. The preparation itself needs to be easy and be done in a visually understandable way for business users and customers. And clearly, this needs to be a data-first approach: the shape of the data defines the semantic structure, and that structure needs to adjust flexibly to the data. (Source: SAP)
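To make the data-first idea concrete, here is a minimal sketch in plain Python. It does not use any SAP Analytics Cloud API; the function `infer_structure`, the sample rows, and the rule "numeric columns are measures, everything else is a dimension" are all assumptions made for illustration only.

```python
from numbers import Number


def infer_structure(rows):
    """Classify each column as a measure (all values numeric) or a dimension.

    Hypothetical helper: the structure is derived from the data itself
    rather than declared in a model up front.
    """
    columns = {}
    for row in rows:
        for name, value in row.items():
            is_numeric = isinstance(value, Number) and not isinstance(value, bool)
            # A column stays a measure only while every value seen is numeric.
            columns[name] = columns.get(name, True) and is_numeric
    return {
        "measures": sorted(n for n, numeric in columns.items() if numeric),
        "dimensions": sorted(n for n, numeric in columns.items() if not numeric),
    }


# Illustrative sample data, not taken from the article.
rows = [
    {"region": "EMEA", "product": "A", "revenue": 120.0, "units": 3},
    {"region": "APJ", "product": "B", "revenue": 80.5, "units": 2},
]
structure = infer_structure(rows)
print(structure)
```

If a new column appears in the source, the inferred structure changes with it, which is the kind of flexibility the paragraph above asks for.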
A dataset is a reusable entity that represents the source data and any subsequent transformation or combination of multiple sources. It can be enriched with some lightweight semantic information on top and allows immediate transition between data-prep workflows and visualizations, without the need to explicitly specify a model or transform the underlying data structure into a relational cube structure. And since it is a first-class object inside SAP Analytics Cloud (SAC), it can be reused and shared, and you can define security on top of it. (Source: SAP)
Decoupling the data itself from the semantics on top allows data-driven flexibility and reuse on the data side, but also a gradual enrichment of those datasets with increasingly relevant semantic information. By being able to combine, enrich and publish them to a larger audience, we can achieve a gradual transition from agile data preparation to governed semantic models. A dataset can be the starting point for a shared dimension, it can be used for currency or POI information, or one or more datasets can be the basis for a well-defined model, which will then be fed with more data from other sources and other datasets. At the same time, those datasets can be the foundation of multiple models and are not bound 1:1 to a single model. (Source: SAP)
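The decoupling described above can be sketched as a small data structure. This is an illustrative Python sketch under stated assumptions, not how SAP Analytics Cloud is implemented: the classes `Dataset` and `Model`, their fields, and the methods `enrich` and `add_source` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Hypothetical dataset: raw rows plus optional, lightweight semantics."""
    name: str
    rows: list
    semantics: dict = field(default_factory=dict)

    def enrich(self, column, role):
        # Semantic information is added gradually; the rows are not changed.
        self.semantics[column] = role
        return self


@dataclass
class Model:
    """Hypothetical governed model that references one or more datasets."""
    name: str
    sources: list = field(default_factory=list)

    def add_source(self, dataset):
        self.sources.append(dataset)
        return self


# Illustrative data, not taken from the article.
sales = Dataset("sales", [{"country": "DE", "amount": 100, "currency": "EUR"}])
sales.enrich("amount", "measure").enrich("currency", "currency")

# The same dataset feeds two models: it is not bound 1:1 to either.
planning = Model("planning").add_source(sales)
reporting = Model("reporting").add_source(sales)
```

Because the models only hold references to the dataset, enriching `sales` later is visible to both, while the underlying rows stay untouched.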
How is this dataset different than the dataset concept available in Smart Predict?
Is this like IDT except in the cloud?
Will this be like Trifacta, the data wrangling tool?