Next-Gen Data Wrangling and Agile Analytics
It's mid-afternoon and you've just gotten an email asking how X affects sales by region. The answer is needed by the end of the day, and your data source looks like:
This is a common problem that analysts and data scientists face. The data they need is there, but they can't use it right away: it needs to be munged and wrangled first, and under time pressure. I recently came across a Twitter thread where the participants were using all of the usual suspects among the analytics tools on the market, and this pain point was acute. There is a hole in the analytics content creators' task space.
We recognized the problem, and the SAP Analytics Cloud development team has been working on a solution. Next month, SAP Analytics Cloud customers will start seeing something new in their preview tenants, something which will also show up on quarterly release tenants in late summer. What they will see is a top-to-bottom overhaul of data wrangling and dataset management: a new experience for handling acquired data. Our goal is for data analysts to be able to act quickly, confidently, and iteratively.
Gone are the days of having to build a model before you can even get to the tasks of analyzing and telling a story. If you don't need the features of a full, structure-first model, you can skip creating one. To accomplish this, we've promoted datasets to first-class citizens alongside models. You might know datasets as a way to upload an immutable table, set in stone. That was easy to work with, but of limited value. Now, datasets can be modified and updated, so you can use them as a lightweight alternative to models.
Also gone are the days of having to leap before looking. You've probably wrangled some data and started building a story, then noticed that you needed to tweak your wrangling, only to find that you can't, and have to start over. We know this, because you've told us! With the new workflow, you'll be able to toggle back and forth between storytelling and wrangling, tweaking your wrangling as needed.
You've also told us that you need more power when it comes to wringing more out of the data that's already there. We've introduced a new expression editor for building complex, custom transformations. It includes operators for all of the typical string, number, date, and geo manipulation operations that you'd expect, so that you can do complex, programmatic transformations on your data.
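To make the idea concrete, here is a minimal sketch in plain Python of the *kinds* of string, date, and number transformations being described. This is not SAP Analytics Cloud's actual expression syntax, and the column names (`order_date`, `store`, `revenue`) are hypothetical; it only illustrates what "programmatic transformations" on a messy flat file look like.

```python
from datetime import datetime

# Toy rows standing in for an uploaded flat file (hypothetical column names).
rows = [
    {"order_date": "2020-06-15", "store": "DE-Berlin-007", "revenue": "1,250.50"},
    {"order_date": "2020-04-02", "store": "FR-Paris-012", "revenue": "980.00"},
]

def transform(row):
    # String manipulation: split a compound store key into country and city.
    country, city, _store_id = row["store"].split("-")
    # Date manipulation: parse the date and derive a quarter label.
    d = datetime.strptime(row["order_date"], "%Y-%m-%d")
    quarter = f"{d.year}-Q{(d.month - 1) // 3 + 1}"
    # Number manipulation: normalize the thousands-separated string to a float.
    revenue = float(row["revenue"].replace(",", ""))
    return {"country": country, "city": city, "quarter": quarter, "revenue": revenue}

wrangled = [transform(r) for r in rows]
print(wrangled[0])
# {'country': 'DE', 'city': 'Berlin', 'quarter': '2020-Q2', 'revenue': 1250.5}
```

In the product, the analogous logic would live in the expression editor rather than in a script; the point is that splitting keys, deriving date parts, and cleaning numbers are the everyday transformations such an editor has to cover.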
Some customers have already been using this workflow in a closed beta. We're excited to finally be able to release it to the full customer base! When it lands on your tenant, be it a preview tenant with wave 2020.12 or a production tenant with 2020.Q3, take it for a spin! We think you'll like what you see! And by all means, let us know what you think!
If it works – thank you!
and after some further elaboration from SAP – this new thing does not help one bit.
Datasets are not schedulable, so bringing new data in is not possible – so why bother using them?
And you are not able to append or merge with other data sources.
So back to the poorly built models – they're the only thing we've got, and they're not user friendly.
So please, SAP – models… open them up and do not destroy our work when we rebuild them.
I had the chance to get a sneak preview of the workflows – believe me, it's nice, and it also benefits Smart Predict greatly!
Workflows – I do not see any workflows. Just an upload of data and some basic transformations. And still no possibility to change the available dimensions (add new ones / remove dimensions) in the data refresh phase (and that is where you would want it).
What do you mean by workflow?
Excellent functionality.
Great for self-service analytics, and it will certainly help citizen data scientists.
Looking forward to the Q3 release!
David,
is this just for flat file uploads, or will it work with every connection available in SAC?
Nice stuff, by the way.
Thanks
It will work for every import connection.
Are we able to schedule a dataset, i.e. refresh the data as we know it from models?
As I understand it, this new wrangling is only for datasets – correct?
Are we able to append and merge with other data sources, as we can in the modeling phase? Or is this single-source?
What about the model? Is that the same as we know it now?
Good questions, Rene, and most of them are functionalities that we are planning for shortly after the initial release:
Thanks,
Paul
Hi thanks for the answers
then I can conclude that nothing has really changed for this release
unschedulable and un-appendable datasets do not help anything – they are just baby steps towards something we have all demanded for the last 3 years
so when will the models be ready for us to use? As the current setup is almost unacceptable to work with
and you know this – you have heard us
David Stocker, I find it difficult to understand what the new functionalities you described are… Are they: 1. The possibility to update and modify existing datasets. 2. Formula-based transformations for models. 3. Workflow (but… for what)? Did I get that correctly?
1 and 2 are correct.
Workflow -
If you create a story by starting with data, you do your initial wrangling and then work in the story. In the classic wrangler, if you find that you want to modify the wrangling, you can't; you have to start over. With the new wrangling engine, you can go back at any time, modify the wrangling graph, and return to continue editing your story.
You'll be able to make datasets public. Public datasets can be fed to models.
Hi David,
Can (multiple) Public datasets be fed to models in Q3 release?
Can they also be fed to dimensions in Q3 release?
Can we schedule dataset refreshes (from any source system) in Q3 release?
If all 3 are true, this will be helpful for (planning) modellers as well, right?
Kr,
Jef
Hi David
I just tried the 2020.12 version. If you – like many of us have with models – have forgotten a dimension, you HAVE to start all over in the "new setup" – yes? So no change.
So your ad hoc scenario should rather be named, in SAC terminology, one-off analysis.
If I had to do an ad hoc report, I would use Excel, or in my case PowerBI/Excel, for ETL and data cleansing. These tools are more flexible – you can add a dimension in these tools, and even remove one, without destroying the whole model and subsequently all reporting associated with it. And in PowerBI I can change a transformation step – any step – without having to delete everything back to the step I want to change.
What I use SAC for is its story capabilities – on top of a MODEL (which can be scheduled).
But if I have to change something in that model – a new dimension or measure – that is where I have to set aside the whole day, and the next. And this new concept does not help me whatsoever.
Yes there are some nice features – but it does not help us one bit for two reasons
So I do not understand this unnecessary step – why not go directly to the models – that is what we use?
and let the predictive guys use a model as a data source for the predictive stuff
This is just what has been needed for a long time! Great news.
Now all we need is a way of skipping the modelling step entirely for universe related queries and I'll be the happiest data-bunny ever!
Hello,
sounds great!
But will this modification have any impact on existing models which are already in use in SAC stories? We use BW Live Connection for most of our stories, but also some combined with acquired data.
Regards