Data Science & SAP Analytics Cloud
This blog is a continuation of my earlier blog on Data Science: https://blogs.sap.com/2020/07/07/saps4hanaanddatascience/.
SAP Analytics Cloud (SAC) is a Software as a Service (SaaS) platform that provides analytical capabilities to all users in one product. SAC combines analysis, planning, prediction and reporting in one place to save time and effort.
This blog explains how predictive analysis is embedded in SAC to simplify business understanding, and how a business can extract powerful information to support decisions about future planning.
The whole idea of this blog is to help Business Users understand the use of predictive analysis and how SAC can simplify their life. I do not intend to use mathematical analysis, the complex logic of inferential statistics or advanced hypothesis testing. I strongly believe in a Business-first policy, with IT as an enabler that makes Business life easy. So my blog focuses on the usage of statistics without getting into how the formulas work or are derived.
Source: SAP
Smart Predict
To help Business Users, SAP has integrated an automated predictive feature into SAC. Most businesses need three main predictive techniques, so SAP provides functionality that caters to the most popular business demands.
The predictive techniques are:
1. Classification: Simple definition is a data mining function that assigns each record to a target category or class.

 Credit card application: Which group of people may be found to be fraudulent?
 Energy and resources: Which area or age group will have demand for rooftop solar panels?
2. Regression: Simple definition is the relationship between certain variables, used to quantify a cause and effect relationship.

 Retail: What influences sales? Which factors have a cause and effect relationship with my revenue?
 Supply Chain: The price of raw material has a direct relationship with the oil price. Which other factors have a cause and effect relationship with logistics?
3. Time Series: Simple definition is a collection of data over a specific period, used to predict behaviour over a span of time.

 Plant or Factory: When can we expect a machinery breakdown in the future?
 Retail: The sale of gold will increase during a certain festive season. When is the right time for my business to enter the market?
A brief overview of data sources
We can leverage various data sources, some of them being:
 Excel or CSV files.
 Generic OData sources.
 SAP applications.
 SAP S/4HANA sources (CDS views).
 SQL databases.
In most cases, if you have S/4HANA, your ABAP or UI5 consultant can help you by providing the CDS views. For Excel and CSV files it is a simple upload.
Out of the three predictive techniques, I shall try to explain some interesting case studies on the Regression model. It is easy and simple, and can give you some of the best predictions to quantify the cause and effect relationship.
Simple Linear Regression:
Ice Cream Sale: I have just plotted a graph in Excel; you can create some values and a similar graph for your own understanding. SAC will be used for multiple regression, when we have more than one factor.
Temperature (°C)  No. of ice creams sold 
25  130 
26  135 
30  150 
37  200 
38  210 
32  155 
20  120 
So, if I want to predict the sales of ice cream when the temperature is 40 °C, I shall use the simple linear equation
y = mx + c = 5.1054 × 40 + 5.4384 = 209.65 ≈ 210 cups of ice cream.
$R^2$: For now, let us just understand that $R^2$ lies between 0 and 1, and the higher it is, the better the model fits the data. In our simple case it is about 0.92, which indicates a good fit.
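The fit above can be reproduced outside Excel. The following is a minimal sketch, assuming Python with NumPy is available; the variable names are my own, and the data is copied from the ice cream table. It is only meant to show where m, c and R² come from, not how Excel or SAC compute them internally.

```python
import numpy as np

# Temperature (°C) and number of ice creams sold, copied from the table above
temperature = np.array([25, 26, 30, 37, 38, 32, 20], dtype=float)
sold = np.array([130, 135, 150, 200, 210, 155, 120], dtype=float)

# Least-squares fit of a straight line: sold = m * temperature + c
m, c = np.polyfit(temperature, sold, 1)

# R^2 = 1 - (residual sum of squares / total sum of squares)
fitted = m * temperature + c
ss_residual = np.sum((sold - fitted) ** 2)
ss_total = np.sum((sold - sold.mean()) ** 2)
r_squared = 1 - ss_residual / ss_total

# Forecast for a temperature of 40 °C
forecast_40 = m * 40 + c

print(f"m = {m:.4f}, c = {c:.4f}")
print(f"R^2 = {r_squared:.3f}")
print(f"Forecast at 40 °C = {forecast_40:.2f} cups")
```

The slope and intercept printed here should agree with the values used in the equation above, up to rounding.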
Multiple Regression
Now suppose I want to consider some more factors which might be affecting my ice cream sales. This will give me a direction for my strategy of selling ice creams.
So now my formula would be: $y = c + m_1 x_1 + m_2 x_2 + m_3 x_3 + \dots + m_n x_n$.
To keep it simple, I am considering only the effect of rainfall on the sale of ice cream. Then we can see how to run a regression model in SAC.
Temperature  Rainfall (mm)  Revenue 
24.56688442  0  534.799 
26.00519115  0  625.1901 
27.79055388  0  660.6323 
20.59533505  0  487.707 
11.50349764  20  31.2402 
14.35251388  15  36.9407 
13.70777988  17  30.8945 
30.83398474  0  696.7166 
0.976869989  0  55.39034 
31.66946458  0  737.8008 
11.45525338  10  85.39 
3.664669577  15  71.16015 
18.81182403  0  467.4467 
13.62450892  0  289.5409 
39.53990899  0  905.4776 
18.48314099  0  469.909 
25.93537514  0  648.21 
42.51528041  0  921.5083 
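Before moving to SAC, the same multiple regression can be tried by hand. This is a minimal sketch, assuming Python with NumPy; the variable names are my own, and it uses plain ordinary least squares on all 18 rows of the table. SAC Smart Predict trains and validates its model automatically, so its figures will not necessarily match this sketch.

```python
import numpy as np

# Rows copied from the table above: temperature, rainfall (mm), revenue
data = np.array([
    [24.56688442, 0, 534.799],
    [26.00519115, 0, 625.1901],
    [27.79055388, 0, 660.6323],
    [20.59533505, 0, 487.707],
    [11.50349764, 20, 31.2402],
    [14.35251388, 15, 36.9407],
    [13.70777988, 17, 30.8945],
    [30.83398474, 0, 696.7166],
    [0.976869989, 0, 55.39034],
    [31.66946458, 0, 737.8008],
    [11.45525338, 10, 85.39],
    [3.664669577, 15, 71.16015],
    [18.81182403, 0, 467.4467],
    [13.62450892, 0, 289.5409],
    [39.53990899, 0, 905.4776],
    [18.48314099, 0, 469.909],
    [25.93537514, 0, 648.21],
    [42.51528041, 0, 921.5083],
])
temperature, rainfall, revenue = data[:, 0], data[:, 1], data[:, 2]

# Design matrix: a column of ones for the intercept c, then x1 and x2
X = np.column_stack([np.ones(len(data)), temperature, rainfall])

# Ordinary least squares: find c, m1, m2 that minimise the squared error
coef, _, _, _ = np.linalg.lstsq(X, revenue, rcond=None)
c, m_temp, m_rain = coef

# Goodness of fit on the same data the model was fitted on
predicted = X @ coef
r_squared = 1 - np.sum((revenue - predicted) ** 2) / np.sum((revenue - revenue.mean()) ** 2)

print(f"revenue = {c:.2f} + {m_temp:.2f} * temperature + {m_rain:.2f} * rainfall")
print(f"R^2 = {r_squared:.3f}")
```

A positive coefficient for temperature and a negative one for rainfall would match the intuition that warm, dry days sell more ice cream.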
Steps in SAC
1. We go to Menu.
2. Select Predictive Scenario.
3. Select Regression.
4. We create a predictive model for IceCream and train it on our source data. The observations in the training dataset are the foundation of our predictive model.
5. We get certain values, which are explained below in a simplified manner considering the scope of this blog.
Root Mean Square Error (RMSE): Measures the average difference between the values predicted by my model and the actual values.
Prediction Confidence: Measure of how reliably the predictive model keeps its accuracy when it is applied to new data. For reference, 95% or above is considered a very good score, and 85% to 95% is still considered good.
Descriptive Statistics
Mean: Average of the dataset.
Std. Deviation: Dispersion of the dataset around the mean.
6. The Influencers: the picture describes it all. It shows the relative importance of each variable used in the predictive model. In our case these are rain and temperature.
7. The most crucial information
Let’s keep this simple; please note that the whole idea behind what SAP is doing here is to make it easy for all.
 Validation - Actual: the actual target value as a function of the prediction ($y = c + m_1 x_1 + m_2 x_2 + m_3 x_3 + \dots + m_n x_n$).
 Perfect Model: All predictions are equal to the actual values.
 Validation Error Min & Max: the range of deviation of my current predictive model.
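The indicators from steps 5 and 7 can be illustrated with a few lines of code. This is a minimal sketch, assuming Python with NumPy and reusing the simple ice cream table and the linear equation from earlier in this blog; the variable names are my own. SAC calculates these indicators on its own validation data, so this sketch only shows what each indicator means, not SAC's exact calculation.

```python
import numpy as np

# Simple ice cream example from earlier in this blog
temperature = np.array([25, 26, 30, 37, 38, 32, 20], dtype=float)
actual = np.array([130, 135, 150, 200, 210, 155, 120], dtype=float)

# Predictions from the simple linear equation y = 5.1054 * x + 5.4384
predicted = 5.1054 * temperature + 5.4384

# Error of each prediction (positive = model predicted too much)
errors = predicted - actual

# RMSE: square the errors, average them, take the square root
rmse = np.sqrt(np.mean(errors ** 2))

# Descriptive statistics of the actual target values
mean_actual = actual.mean()
std_actual = actual.std(ddof=1)  # sample standard deviation

# Smallest and largest deviation of the predictions from the actual values
error_min, error_max = errors.min(), errors.max()

# A perfect model would predict every actual value exactly
is_perfect = np.allclose(predicted, actual)

print(f"RMSE = {rmse:.2f}")
print(f"Mean = {mean_actual:.2f}, Std. Deviation = {std_actual:.2f}")
print(f"Error min = {error_min:.2f}, error max = {error_max:.2f}")
print(f"Perfect model: {is_perfect}")
```

The closer the RMSE is to zero, and the narrower the band between the minimum and maximum error, the closer the model is to the perfect model.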
Conclusion
 Validation & Perfect Model: The two curves match, hence the predictive model is accurate.
Hence, we can conclude that Rainfall and Temperature are two strong influencers in predicting the sales of ice cream. Although temperature would generally seem to be the bigger contributor, rain has the higher influence value, and that is the beauty of this model.
Other simple case studies which could be considered for multiple regression are:
 Trend of employee performance and the multiple factors which influence the trend.
 Key factors to control the cost of production.
 Trend of sales of a product with certain influencing factors like location, price, promotion etc.
My next blog will be on how we can use Inferential Statistics in a very simple and easy way using R in SAC.