Predictive Analytics in SAP Business Objects Cloud: Smart Discovery
SAP has provided a “Smart Discovery” functionality in Business Objects Cloud (BoBJ Cloud) that targets business users and helps them get insights from data in the form of easy-to-understand charts and statements. To derive these insights, BoBJ Cloud chooses, in the background, the predictive algorithm that best fits the model and data. Since the algorithm selection and execution happen in the background, any business user can derive these insights in a matter of minutes, without any knowledge of data science or complex predictive algorithms.
I decided to try “Smart Discovery” by building something in the system. I always find it better to build something and then play around with it to try out the various functionalities. It also helps me retain the concept better. So, as in my previous blogs, I will share the data and the steps that I used so that you can execute them yourself. You can download the data from https://drive.google.com/open?id=0B0ETLJHyKCEFbEZIVFhfa1Q3dEk
The dataset you downloaded has the sales figures of retail products whose sales have historically changed with weather conditions, such as umbrellas, raincoats, rain boots, sunglasses, thermal wear and flip flops. It covers eight days for a set of stores in different cities. We also have the weather data, i.e. temperature, air quality and humidity, for each of the cities for the eight days.
What insights are we looking for?
We want to investigate how changes in weather conditions impact the sales of these products. For example, how hot does it have to get for people to start buying sunglasses? How many sunglasses can I expect to sell, so that I can decide how many to stock?
If we tried to do this on our own by identifying an algorithm and then loading and modelling the data, it would take a considerable amount of time and require the involvement of data scientists, data engineers and so on. With BoBJ Cloud we can find these insights in under 5 minutes.
Let’s create the Model by Importing the Data
We will import the data and let the system create the most suitable model for us. Creating the model for this case is very straightforward. Use “Import a file from your computer”.
Click on “Select Source File” in blue and select the file from your desktop. BoBJ Cloud does the rest for you, such as identifying what kind of file it is.
Read this “Data Sample” for information and press “OK”.
After verifying that you got your dimensions, measures and date fields as expected, click on “Create Model”, and in about 30 seconds your model is ready.
Creating the story
Next, choose “Create story” –> Add a Canvas Page –> Table and select the model you just created as the data source.
In the story that opens up, select a few dimensions for the rows and save the story. Then click on “Data View”.
Now let’s do some “Smart Discovery”
On the next page select “New Smart Discovery”
On the next page, double-click on “Sales”. The field that you click first becomes the “Target Column”. Exclude Date, City and Shop, as we only want to see the impact of a change in weather on the sales of a product.
To better understand “Guided Machine Discovery” (GMD), let’s take only a single product, “P0001”.
Once you have executed the steps your page will look like this.
Click “Run”. After that it may take a few seconds for the output to come up.
The first thing you will see, right at the top of the output, is the metadata about the run.
An “insight quality” of 5/5 is excellent, though with a large dataset we may not be able to achieve 5/5. The good thing with GMD is that it will tell you if it thinks the data quality is not good enough, and you will then have the option to continue with the low insight quality or add more fields to improve it.
The picture above is for illustration purposes only. If you have followed the steps in the blog, you will not get this message.
The algorithm that was selected and executed in the background identified only two key influencers impacting the sales of P0001, i.e. Temperature and Air Quality, with Temperature being the main influencer.
On clicking any of the key influencers, you can see its relation to the target column in a horizontal bar chart. You can flip between “Averages” and “Count”. More importantly, you can see the derived insights to help you make better decisions.
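Since the algorithm runs in the background, the following is only a rough Python sketch of the idea of ranking influencers, assuming a plain Pearson correlation as the score. It is not the method BoBJ Cloud uses. The records, the field names and the `rank_influencers` helper are made up for illustration and are not taken from the downloaded dataset.

```python
from math import sqrt

# Made-up illustrative records, NOT the blog's dataset.
rows = [
    {"temperature": 70, "air_quality": 40, "sales": 5.0},
    {"temperature": 80, "air_quality": 55, "sales": 7.0},
    {"temperature": 90, "air_quality": 35, "sales": 9.5},
    {"temperature": 100, "air_quality": 60, "sales": 12.0},
    {"temperature": 110, "air_quality": 45, "sales": 14.5},
    {"temperature": 115, "air_quality": 50, "sales": 16.0},
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def rank_influencers(rows, target, candidates):
    """Sort candidate fields by absolute correlation with the target.

    Assumption: correlation strength stands in for "influence" here.
    """
    ys = [r[target] for r in rows]
    scores = {
        name: abs(pearson([r[name] for r in rows], ys)) for name in candidates
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = rank_influencers(rows, "sales", ["temperature", "air_quality"])
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

With these made-up numbers, temperature comes out well ahead of air quality, which mirrors the ordering the tool reported for P0001.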
The second insight, i.e. “100.00% of Records with Temperature (109.00-119.00) have above average sales”, tells me that most sales for this product happen when temperatures are high.
The bar chart, in which sales drop as the temperature falls, leads me to the same conclusion.
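To make a statement like this one concrete, here is a small Python sketch of how such a percentage could be computed by hand. The `share_above_average` helper and the (temperature, sales) pairs are hypothetical and only illustrate the arithmetic. How BoBJ Cloud chooses its temperature ranges is not something I can confirm.

```python
# Made-up (temperature, sales) pairs, NOT the blog's dataset.
records = [
    (70, 5.0),
    (80, 7.0),
    (90, 9.5),
    (100, 12.0),
    (110, 14.5),
    (115, 16.0),
]


def share_above_average(records, low, high):
    """Percentage of records with low <= temperature <= high whose sales
    exceed the average sales over ALL records. Returns None if no record
    falls into the range."""
    average = sum(sales for _, sales in records) / len(records)
    in_range = [sales for temp, sales in records if low <= temp <= high]
    if not in_range:
        return None
    above = sum(1 for sales in in_range if sales > average)
    return 100.0 * above / len(in_range)


hot_share = share_above_average(records, 109, 119)
cool_share = share_above_average(records, 70, 90)
print(f"{hot_share:.2f}% of hot records have above average sales")
print(f"{cool_share:.2f}% of cool records have above average sales")
```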
Now I want to look into how Temperature and Air Quality together impact sales. For this, select “Air Quality” in the first horizontal bar chart under “Key Influencers”. When you scroll down, you will find a heat map showing the relation between the three, with insights on the right.
The consistently dark areas at high temperatures in the heat map help us infer that Air Quality has little to no impact on the sales of this product.
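A heat map of this kind is essentially a grid of average sales per combination of temperature band and air quality band. The Python sketch below shows that grouping under my own assumptions: fixed-width bands of 20, hypothetical helper names, and made-up rows that are not from the dataset. The tool may well bin the values differently.

```python
from collections import defaultdict

# Made-up (temperature, air_quality, sales) rows, NOT the blog's dataset.
rows = [
    (75, 30, 5.0),
    (75, 70, 5.5),
    (95, 30, 10.0),
    (95, 70, 9.5),
    (115, 30, 16.0),
    (115, 35, 15.0),
    (115, 70, 16.5),
]


def band(value, width):
    """Lower edge of the fixed-width band that contains value."""
    return (value // width) * width


def heatmap_cells(rows, temp_width=20, aq_width=20):
    """Average sales per (temperature band, air quality band) cell."""
    totals = defaultdict(lambda: [0.0, 0])  # cell -> [sum of sales, count]
    for temperature, air_quality, sales in rows:
        cell = (band(temperature, temp_width), band(air_quality, aq_width))
        totals[cell][0] += sales
        totals[cell][1] += 1
    return {cell: total / count for cell, (total, count) in totals.items()}


cells = heatmap_cells(rows)
for (temp_band, aq_band), avg_sales in sorted(cells.items()):
    print(f"temperature {temp_band}+, air quality {aq_band}+: {avg_sales:.2f}")
```

In this toy grid the averages barely move across air quality bands within the same temperature band, but jump from one temperature band to the next. That is the same pattern the dark areas in the heat map suggest.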
As seen in the metadata, there are two unexpected values, and we would like to look into these records and analyze them. Click on the show unexpected values option.
The definition of unexpected values that I found in BoBJ Cloud is crisp and to the point, so I will just use that.
The table lists the two records and their details.
The line and bar charts below the table help us understand them better.
So if we look at the first record of the table, the algorithm was expecting sales of 14.55, but the actual sales were 13% higher at 16.77, so the system decided that this value was not normal, i.e. an outlier.
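If you want to check the percentage yourself, the short Python sketch below works through the two numbers from the first record. Note that the gap of 2.22 is about 13% when measured against the actual sales and about 15% when measured against the expected sales. Which base the tool uses is my assumption, based only on which result matches the 13% figure.

```python
expected = 14.55  # sales the algorithm expected for the first record
actual = 16.77    # sales actually recorded

difference = actual - expected

# Deviation relative to the expected value.
pct_of_expected = 100 * difference / expected

# Deviation relative to the actual value (the one that matches 13%).
pct_of_actual = 100 * difference / actual

print(f"difference: {difference:.2f}")
print(f"relative to expected: {pct_of_expected:.1f}%")
print(f"relative to actual: {pct_of_actual:.1f}%")
```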
Simulation, which you will find right at the bottom, is also the most interesting part.
As you change the temperature and/or air quality, you can see how much sales you can expect. If there had been dimensions in our analysis, you would get radio buttons or drop-downs for them as well.
In our case, as a test, if we increase air quality the sales should not increase by much, which we can see is true.
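Conceptually, a simulation like this plugs new input values into a fitted model and reads off the predicted sales. Here is a minimal Python sketch of that idea, assuming a simple least-squares line on temperature alone. The model BoBJ Cloud actually fits is not visible to the user and is very likely more sophisticated. The data points and the `fit_line` and `simulate_sales` helpers are made up for illustration.

```python
# Made-up observations, NOT the blog's dataset.
temperatures = [70, 80, 90, 100, 110]
sales = [5.0, 7.0, 9.0, 11.0, 13.0]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def simulate_sales(temperature, slope, intercept):
    """Predicted sales for a chosen temperature under the fitted line."""
    return intercept + slope * temperature


slope, intercept = fit_line(temperatures, sales)
for temperature in (95, 105, 120):
    predicted = simulate_sales(temperature, slope, intercept)
    print(f"temperature {temperature}: expected sales {predicted:.2f}")
```

Moving the temperature value up or down here plays the same role as dragging the slider in the simulation panel.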
You can select any chart, insight or table and copy it to your story. If you add more data and rerun the discovery, you will have to delete the old graph from the story and paste the new one. The old graphs do not update to reflect the new discovery.