Introduction to Smart Data Wrangling
With the smart wrangling capabilities in SAP Analytics Cloud, you can get immersed in gaining insights from your data through stories and analysis, without worrying about getting the data prepared perfectly from the start.
In this blogpost I will show the new Custom Expression Editor and how it works. All features shown in this blog are available with the QRC.Q3. My dataset is based on sales data. They contain information about products, sales representatives, cities, date, customer satisfaction, etc. I would like to perform three modeling steps on the dataset:
- Change data in one column based on the value of a second column
- Split string and keep only a part
- Perform calculations across multiple columns and create a new column
But first I would like to talk about the new design of the Data Wrangler.
The Details panel on the right side has been restructured. Here you can now switch columns between measures and dimensions simply by dragging and dropping. A particular added value is offered here if you switch from the “Dataset Overview” to the “Details” view.
To start with our tasks, we open the Expression Editor.
The help panel on the right-hand side will also open immediately. Here, especially new users will find a good introduction to the functions.
All functions are categorized. In the five categories you will find a detailed description of the syntax and functionality of each function, as well as examples that you can use directly.
Our first task is to modify data in one column based on another column. In our example this means that we get a new employee for the city of Liverpool and have to enter “John Doe” as manager for the column. For this we select the if-function.
As you can see, the system directly gives the correct syntax and we can enter our modification.
The command is to be read as follows: We assign a new value to the “Manager” column using the If query. If the city equals Liverpool, we insert “John Doe”. If the city is not Liverpool, we keep the old value.
By confirming the command, the expression is applied to the “Manager” column and the new values are inserted.
In our second tasks we use the split function. In our example, the “Category” column contains product descriptions. However, some columns contain several descriptions separated by a comma. We only want to keep the first description of an enumeration.
For this we define the comma as separator and the number of elements taken over – one, the first.
Calculation of a new column
The third task is to calculate a bonus for our managers. This is calculated from the columns “Sale Revenue” and “Customer Statisfaction”. If the sales revenue is greater than 20000 and customer satisfaction is equal to or greater than 90%, a bonus of 5000 should be paid. This payment should be stored in the new column “Bonus”.
As you can see, you do not need to create new columns separately. They can be defined using the editor and are created when the statement is executed. Another special feature is that we use the and-connector in the if-statement. With linking connectors, calculations can be performed across multiple columns.
After executing the expression, you can see that the dataset is extended by a new column.
The Custom Expression Editor is just one of the features introduced with the Smart Wrangler features. Together with the other features, Smart Wrangling offers extensive modeling possibilities to flexibly respond to the circumstances of data analysis.
blog and this Guided Playlist. An overview of all features and enhancements of the QRCQ3 release can be found in the Product Updates.