Santanu Ray

Data Science & SAP Analytics Cloud


This blog is a continuation of my earlier blog on Data Science.

SAP Analytics Cloud (SAC) is a Software as a Service (SaaS) platform that provides analytical capabilities to all users in one product. SAC covers analysing, planning, predicting and reporting in one place to reduce time and save effort.

This blog explains how predictive analysis is embedded in SAC to simplify business understanding, and how a business can extract powerful information to support decisions on future planning.

The whole idea of this blog is to help Business Users understand the use of predictive analysis and how SAC can simplify their life. I do not intend to use mathematical analysis, the complex logic of inferential statistics or advanced hypothesis testing. I strongly believe in a Business-first policy, with IT as an enabler that makes Business life easy. So my blog focuses on the usage of statistics without getting into how the formulas work or are derived.

Source: SAP

Smart Predict

To help Business Users, SAP has integrated an automated predictive feature into SAC. Most businesses need three main predictive techniques, so SAP provides functionality that caters to the most popular business demands.

The predictive techniques are:

  1. Classification: Put simply, data mining functions that assign each record to a target category or class.

    1. Credit card application: Which group of people may turn out to be fraudulent?
    2. Energy and resources: Which area or age group will have demand for rooftop solar panels?

  2. Regression: Put simply, the relationship between certain variables, used to quantify a cause-and-effect relationship.

    1. Retail: What influences sales? Which factors have a cause-and-effect relationship with my revenue?
    2. Supply Chain: The price of raw material has a direct relationship with the oil price. Which other factors have a cause-and-effect relationship with logistics?

  3. Time Series: Put simply, data collected over a specific period and used to predict behaviour over a span of time.

    1. Plant or Factory: When can we expect a machinery breakdown in the future?
    2. Retail: Sales of gold increase during a certain festive season. When is the right time for my business to enter the market?

A brief overview of data sources

We can leverage various data sources, some of them being:

  1. Excel or CSV files.
  2. Generic OData sources.
  3. SAP applications.
  4. SAP S/4HANA sources (CDS views).
  5. SQL databases.

In most cases your ABAP or UI5 consultant can help by providing the CDS views if you have S/4HANA. For Excel and CSV it is a simple upload.

Out of the three predictive techniques, I shall explain some interesting case studies on the Regression model. It is easy and simple, and can give you some of the best predictions to quantify the cause-and-effect relationship.

Simple Linear Regression:

Ice Cream Sale: I have just plotted a graph in Excel; you can create some values and a similar graph for your own understanding. SAC will be used for multiple regression, when we have more than one factor.

Temperature (°C)    No. of ice creams sold
25                  130
26                  135
30                  150
37                  200
38                  210
32                  155
20                  120


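So, if I want to predict the sales of ice cream when the temperature is 40°C, I shall use the simple linear equation, where m and c are the slope and intercept of the trend line fitted to the data above:

Y = mx + c = 5.1054 × 40 + 5.4384 = 209.65 ≈ 210 cups of ice cream.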

R²: For now, let us just understand that the higher the R² (the closer to 1), the better the model fits. In our simple case it is high, about 0.92 for the data above.
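For readers who would like to reproduce the trend line outside Excel, here is a minimal sketch in plain Python. It assumes an ordinary least-squares fit, which is what an Excel linear trend line uses; the function name fit_line is my own and purely illustrative, and nothing here is SAC functionality.

```python
# Ice cream data from the table above: temperature (°C) and cups sold.
temperature = [25, 26, 30, 37, 38, 32, 20]
cups_sold = [130, 135, 150, 200, 210, 155, 120]


def fit_line(x, y):
    """Least-squares fit of y = m*x + c; returns (m, c, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    m = sxy / sxx                      # slope
    c = mean_y - m * mean_x            # intercept
    r_squared = (sxy * sxy) / (sxx * syy)
    return m, c, r_squared


m, c, r_squared = fit_line(temperature, cups_sold)
predicted_at_40 = m * 40 + c

print(f"m = {m:.4f}, c = {c:.4f}, R2 = {r_squared:.2f}")
print(f"Predicted cups at 40 C: {predicted_at_40:.2f}")
```

The unrounded coefficients give a prediction a fraction of a cup away from the hand calculation with rounded values; both round to 210 cups.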

Multiple Regression

Now suppose I want to consider some more factors which might be affecting my ice cream sales. This will give me a direction for my strategy of selling ice creams.

So now my formula would be: y = c + m1x1 + m2x2 + m3x3 + … + mnxn.

To keep it simple, I am considering only the effect of rainfall on the sale of ice cream. Then we will see how to define and run a regression model in SAC.


Temperature Rainfall (mm) Revenue
24.56688442 0 534.799
26.00519115 0 625.1901
27.79055388 0 660.6323
20.59533505 0 487.707
11.50349764 20 31.2402
14.35251388 15 36.9407
13.70777988 17 30.8945
30.83398474 0 696.7166
0.976869989 0 55.39034
31.66946458 0 737.8008
11.45525338 10 85.39
3.664669577 15 71.16015
18.81182403 0 467.4467
13.62450892 0 289.5409
39.53990899 0 905.4776
18.48314099 0 469.909
25.93537514 0 648.21
42.51528041 0 921.5083
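Before moving to SAC, the multiple regression formula can be tried by hand on this table. The sketch below is only an illustration in plain Python, assuming an ordinary least-squares fit solved through the normal equations. The helper names solve and fit_multiple are hypothetical, and SAC's Smart Predict uses its own automated algorithm, so its numbers need not match.

```python
# Rows from the table above: (temperature, rainfall in mm, revenue).
rows = [
    (24.56688442, 0, 534.799), (26.00519115, 0, 625.1901),
    (27.79055388, 0, 660.6323), (20.59533505, 0, 487.707),
    (11.50349764, 20, 31.2402), (14.35251388, 15, 36.9407),
    (13.70777988, 17, 30.8945), (30.83398474, 0, 696.7166),
    (0.976869989, 0, 55.39034), (31.66946458, 0, 737.8008),
    (11.45525338, 10, 85.39), (3.664669577, 15, 71.16015),
    (18.81182403, 0, 467.4467), (13.62450892, 0, 289.5409),
    (39.53990899, 0, 905.4776), (18.48314099, 0, 469.909),
    (25.93537514, 0, 648.21), (42.51528041, 0, 921.5083),
]


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(n)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                  # back substitution
        tail = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_multiple(data):
    """Least-squares fit of y = c + m1*x1 + m2*x2 via the normal equations."""
    design = [(1.0, x1, x2) for x1, x2, _ in data]  # leading 1.0 gives the intercept c
    target = [y for _, _, y in data]
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(design, target)) for i in range(3)]
    return solve(xtx, xty)


c, m_temp, m_rain = fit_multiple(rows)
residuals = [y - (c + m_temp * t + m_rain * r) for t, r, y in rows]

print(f"c = {c:.2f}, m_temp = {m_temp:.2f}, m_rain = {m_rain:.2f}")
print(f"Predicted revenue at 30 degrees, no rain: {c + m_temp * 30:.2f}")
```

The sign of each coefficient already tells part of the story: revenue rises with temperature and falls with rainfall.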

Steps in SAC

  1. We go to the Menu.
  2. Select Predictive Scenario.
  3. Select Regression.
  4. We create a predictive model for Ice-Cream and train it on our source data. The observations in the training dataset are the foundation of our predictive model.
  5. We get certain values, which are explained below in a simplified manner considering the scope of this blog.


Root Mean Square Error (RMSE): Measures the average difference between the values predicted by my model and the actual values (the square root of the mean of the squared differences).

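Prediction Confidence: Measures how well the predictive model keeps the same accuracy when it is applied to new data with the same characteristics as the training data. For reference, 95% or above is considered a very good score and 85-95% is still considered good.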

Descriptive Statistics

Mean: Average of the dataset.

Std. Deviation: Dispersion of the dataset around the mean.
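To make these indicators concrete, here is a minimal sketch in plain Python that computes them for the simple ice cream example, using the linear equation from earlier in this blog. It is an illustration only: the function name rmse is my own, I am assuming the sample standard deviation, and SAC calculates its own indicators on its own validation data, so the values shown in SAC will differ.

```python
import math
import statistics

# Simple ice cream example: temperature (°C) and cups actually sold.
temperature = [25, 26, 30, 37, 38, 32, 20]
actual = [130, 135, 150, 200, 210, 155, 120]

# Predictions from the simple linear equation Y = 5.1054x + 5.4384.
predicted = [5.1054 * t + 5.4384 for t in temperature]


def rmse(actual_values, predicted_values):
    """Square root of the mean squared difference between actual and predicted."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual_values, predicted_values)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


model_rmse = rmse(actual, predicted)
mean_sold = statistics.mean(actual)      # average of the dataset
std_sold = statistics.stdev(actual)      # sample standard deviation (assumption)

print(f"RMSE: {model_rmse:.2f} cups")
print(f"Mean: {mean_sold:.2f} cups, Std. Deviation: {std_sold:.2f} cups")
```

A useful way to read the result is to compare RMSE with the standard deviation: the smaller RMSE is relative to the spread of the data, the more the model adds over simply guessing the mean.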

  6. The Influencer: the picture describes it all. It is the relative importance of each variable used in the predictive model. In our case these are rain and temperature.

  7. The most crucial information

Let’s keep this simple, and please note that the whole point SAP is trying to make is to make it easy for all.

  • Validation-Actual: the actual target value as a function of the prediction (y = c + m1x1 + m2x2 + m3x3 + … + mnxn).
  • Perfect Model: all predictions are equal to the actual values.
  • Validation Error min & max: the deviation range of my current predictive model.
  • Validation & Perfect Model: the two curves match, hence the predictive model is accurate.
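The idea behind these curves can be sketched outside SAC as well. The example below is a simplified illustration in plain Python, not SAC's actual calculation. It assumes a hypothetical split of the simple ice cream data (the first five rows for training, the last two for validation) and defines the error as predicted minus actual. A perfect model would have every error equal to zero.

```python
def fit_line(x, y):
    """Least-squares fit of y = m*x + c; returns (m, c)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    m = sxy / sxx
    return m, mean_y - m * mean_x


temperature = [25, 26, 30, 37, 38, 32, 20]
cups_sold = [130, 135, 150, 200, 210, 155, 120]

# Hypothetical split: first five rows train the model, last two validate it.
train_x, valid_x = temperature[:5], temperature[5:]
train_y, valid_y = cups_sold[:5], cups_sold[5:]

m, c = fit_line(train_x, train_y)
predicted = [m * x + c for x in valid_x]

# Error per validation row, and the band between the smallest and largest error.
errors = [p - a for p, a in zip(predicted, valid_y)]
error_min, error_max = min(errors), max(errors)
is_perfect = all(abs(e) < 1e-9 for e in errors)

for p, a in zip(predicted, valid_y):
    print(f"predicted {p:.1f}  actual {a}")
print(f"error min {error_min:.1f}, error max {error_max:.1f}, perfect: {is_perfect}")
```

With only seven rows the band is wide; the closer the validation points sit to the perfect model, the narrower the band between error min and error max.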

Hence, we can conclude that Rainfall and Temperature are two strong influencers in predicting the sales of ice cream. Although temperature generally seems to be the bigger contributor, Rain has a higher value here, and that is the beauty of this model.

Other simple case studies which could be considered for multiple regression are:

  • Trend of Employee Performance and multiple factors which influence the trend.
  • Key factors to control the cost of production.
  • Trend of sale of a product with certain influencing factors like location, price, promotion etc.

My next blog will be on how we can use Inferential Statistics in a very simple and easy way using R in SAC.

      Anirban Dutta

      Excellent blog

      Santanu Ray
      Blog Post Author

      Thanks Anirban.

      Sudoti Roy

      This is an excellent blog that i have come across after a long time. Thanks for putting this together so nicely.

      Santanu Ray
      Blog Post Author

      Thank You Sudoti.

      Tony Shea

      Great insight to traditional predictive analysis and how it is represented in SAC

      Santanu Ray
      Blog Post Author

      Thank You Sir. Grateful

      upendra adarkar


      very well explained in easy to understand way.

      Santanu Ray
      Blog Post Author

      Thanks a lot Upendra Ji.

      Rituparna Chowdhury

      Well documented approach for data science. Good read.