David Stocker

Next gen Data Wrangling and Agile Analytics

It’s mid-afternoon and you’ve just gotten an email asking how X affects sales by region. The answer is needed by the end of the day, and your data source looks like:

 

This is a common problem that analysts and data scientists face. The data they need is there, but they can’t use it right away in the form they need. It has to be munged and wrangled first, and under time pressure. I recently came across a Twitter thread where the participants were using all of the usual suspects among the analytics tools on the market, and this pain point was acute. There is a hole in the analytics content creator’s task space.

 

We recognized the problem, and the SAP Analytics Cloud development team has been working on a solution. Next month, SAP Analytics Cloud customers will start seeing something new in their preview tenants, something which will also show up on quarterly release tenants in late summer. What they will see is a top-to-bottom overhaul of data wrangling and dataset management: a new experience for handling acquired data. Our goal is for data analysts to be able to act quickly, confidently and iteratively.

Gone are the days of having to build a model before you can even get to the tasks of analyzing and telling a story. If you don’t need the features of a full, structure-first model, then you can skip creating one. To accomplish this, we’ve promoted datasets to first-class citizens, alongside models. You might know datasets as something that lets you upload an immutable table, set in stone. They were easy to work with, but of limited value. Now, datasets can be modified and updated, so you can use them as a lightweight alternative to models.

Also gone are the days of having to leap before looking. You’ve probably wrangled some data and started building a story, then noticed that you needed to tweak your wrangling, only to find that you couldn’t change it and had to start over. We know this, because you’ve told us! With the new workflow, you’ll be able to toggle back and forth between storytelling and wrangling, tweaking your wrangling as needed.

You’ve also told us that you need more power when it comes to wringing more out of the data that is already there. We’ve introduced a new expression editor for building complex, custom transformations. It includes operators for all of the typical string, number, date, and geo manipulation operations that you’d expect, so that you can do complex, programmatic transformations on your data.
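
To make the idea of string, number, and date transformations concrete, here is a minimal sketch in plain Python. It is not the syntax of the SAP Analytics Cloud expression editor. The column names (`region`, `sales`, `order_date`), the sample rows, and the `wrangle` helper are all hypothetical and exist only to illustrate the kind of cleanup that has to happen before you can answer a "sales by region" question.

```python
from datetime import datetime

# Hypothetical raw rows, roughly as they might arrive from an uploaded file.
raw_rows = [
    {"region": " north ", "sales": "1,200.50", "order_date": "2020-06-01"},
    {"region": "SOUTH", "sales": "980", "order_date": "2020-06-15"},
]


def wrangle(row):
    """Apply one string, one number, and one date transformation to a row."""
    return {
        # String: trim whitespace and normalize capitalization.
        "region": row["region"].strip().title(),
        # Number: drop thousands separators and convert to a float.
        "sales": float(row["sales"].replace(",", "")),
        # Date: parse the ISO date and keep only year and month.
        "order_month": datetime.strptime(row["order_date"], "%Y-%m-%d").strftime("%Y-%m"),
    }


clean_rows = [wrangle(row) for row in raw_rows]

# Aggregate the cleaned sales figures by region.
sales_by_region = {}
for row in clean_rows:
    sales_by_region[row["region"]] = sales_by_region.get(row["region"], 0.0) + row["sales"]
```

Keeping each transformation as a small, re-runnable step is what makes it cheap to go back and tweak one of them later without starting over.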

Some customers have already been using this workflow in a closed beta. We’re excited to finally be able to release it to the full customer base! When it lands on your tenant, be it a preview tenant with wave 2020.12 or a production tenant with 2020.Q3, take it for a spin! We think you’ll like what you see! And by all means, let us know what you think!


      15 Comments
      Rene Malmberg

      If it works – thank you!

      and after some further elaboration from SAP – this new thing does not help a bit.

      datasets are not schedulable, so bringing new data in is not possible, so why bother using it?

      and you are not able to append or merge with other data sources

      so back to the poorly built models – it’s the only thing we got that is not user friendly

      so please SAP – models… open them up and do not destroy our work when we rebuild them

      Antoine CHABERT

      I had the chance to have a sneak preview of the workflows - believe me it's nice and also benefits Smart Predict greatly!

      Rene Malmberg

      workflows - I do not see any workflows. Just an upload of data - and some basic transformations. And still no possibility to change the available dimensions (add new ones / remove dimensions) in the data refresh phase (and that is where you would want it)

      What do you mean by workflow?

      Sainath Kumar

      Excellent functionality.
      For self service analytics and will certainly help citizen data scientists

       

      looking forward to Q3 release !

      Juan Carlos Lazaro

      David,

      is this just for flat files upload or will it work with every connection available in SAC?

      Nice stuff, by the way.

      Thanks

      David Stocker
      Blog Post Author

      It will work for every import connection.

      Rene Malmberg

      Are we able to schedule a dataset? So refresh the data as we know from models

      as I understand this new wrangling is only for datasets- correct?

      Are we able to append and merge with other data sources  as we can in the modeling phase? Or is this single source ?

      what about the model ? Is that the same as we know now?

      Paul Ekeland

      Good questions, Rene, and most of them are functionalities that we are planning for shortly after the initial release:

      • With QRC.Q3, users will be able to work on a single source, and we are already working on adding the possibility to create a Dataset from multiple sources.
      • Scheduling of Datasets is also part of the upcoming features we have.
      • In the meantime, model capabilities remain untouched for those more governed workflows.

      Thanks,

      Paul

      Rene Malmberg

      Hi thanks for the answers

      then I can conclude that nothing has really changed for this release

      unscheduled and un-appended datasets do not help anything - they are just baby steps towards something we all have demanded for the last 3 years

      so when are the models ready for us to use? As the current setup is almost unacceptable to work with

      and you know this - you heard us

      Mateusz Mikulski

      David Stocker, I find it difficult to understand what new functionalities you described… Is it: 1. Possibility to update and modify existing Datasets. 2. Formula-based transformations for models. 3. Workflow (but… for what)? Do I understand correctly?

      David Stocker
      Blog Post Author

      1 and 2 are correct.

       

      Workflow -

      If you create a story by starting with data, you have initial wrangling and then you work in the story.  In the classic wrangler, if you find that you want to modify the wrangling, you can't.  You have to start over.  With the new wrangling engine, you can go back at any time, modify the wrangling graph and return to continue editing your story.

       

      You'll be able to make datasets public.  Public datasets can be fed to models.

      Jef Baeyens

      Hi David,

      Can (multiple) Public datasets be fed to models in Q3 release?
      Can they also be fed to dimensions in Q3 release?
      Can we schedule dataset refreshes (from any source system) in Q3 release?

      If all 3 are true, this will be helpful for (planning) modellers as well, right?

      Kr,

      Jef

      Rene Malmberg

      Hi David

      I just tried the 2020.12 version. If you – like many of us have tried with models – have forgotten a dimension, you HAVE to start all over in the “new setup” – yes? So no change

      So your ad hoc scenario – should rather be named in SAC terminology.. One off analysis

      If I had to do an ad hoc report – I would use Excel or, as in my case, PowerBI/Excel for ETL and data cleansing. These tools are more flexible – yes, you can add a dimension in these tools and even remove one without destroying the whole model and subsequently all reporting associated with that model. And in PowerBI I can change a transformation step – any step, without having to delete everything back to the step I want to change.

      What I use SAC for is its story capabilities – on top of a MODEL (it can be scheduled)

      But if I have to change something in that model – a new dimension or measure – that is where I have to set aside the whole day and the next. And this new concept does not help me whatsoever

      Yes there are some nice features – but it does not help us one bit for two reasons

      1. Cannot be scheduled
      2. it’s not a model
        1. and now you cannot convert it into a model – that feature has disappeared
        2. The measure creator is not there – and that is where SAC is much better than PowerBI
          1. Do not have this in datasets
        3. Create custom properties of a dimension – create in Excel, paste in SAC – beautiful
          1. do not have this in datasets

      So I do not understand this unnecessary step – why not go directly to the models – that is what we use?

      and let the predictive guys use a model as a data source for the predictive stuff

       

      Keith Fisher

      this is just what has been needed for a long time! great news.

      Now all we need is a way of skipping the modelling step entirely for universe related queries and I'll be the happiest data-bunny ever!

      Lee Hsu

      Hello,

      sounds great!

      But will this modification have any impact on existing models which are already in use for SAC stories? We use BW Live Connection for most of our stories, but also some combined with acquired data.

      Regards