Predictive analysis is one of the most important and fastest-growing topics in today’s ICT world. Management's demand for analytics is reaching new dimensions in every part of the world. Meeting it depends not only on current data but also on historical data, which can give companies in-depth insights with the help of statistical prediction techniques and algorithms.
According to SAP, predictive analysis used to be out of reach for organizations mainly due to:
- Lack of quality data.
- Processing resources that were not powerful enough.
- Algorithms that were too generic and not capable of handling huge volumes of data.
- Lack of skills in developing predictive analysis and understanding the results.
The shift of the SAP database from a traditional row store to a column store is one of the key developments that will cater to this future business demand for data that is not only descriptive but also predictive and prescriptive.
I have tried to bring out some very basic uses of predictive analysis in different fields, as shown below.
|What is the probability that a customer will buy new socks if he buys shoes? (A very simple view that does not consider market basket analysis.)|
|P is probability|
|p(Shoes): customer buying shoes|0.55|
|p(Socks): customer also buying socks|0.25|
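The shoes-and-socks figures above can be turned into a conditional probability with a few lines of Python. This is only a sketch: it assumes that 0.25 is the joint probability of a customer buying both shoes and socks (the table does not say so explicitly), and the function name `conditional_probability` is made up for illustration.

```python
def conditional_probability(p_joint: float, p_given: float) -> float:
    """Return P(A | B) = P(A and B) / P(B)."""
    if not 0 < p_given <= 1:
        raise ValueError("P(B) must be in (0, 1]")
    if not 0 <= p_joint <= p_given:
        raise ValueError("P(A and B) must be between 0 and P(B)")
    return p_joint / p_given


p_shoes = 0.55            # P(Shoes), taken from the table
p_shoes_and_socks = 0.25  # ASSUMPTION: read as P(Shoes and Socks)

p_socks_given_shoes = conditional_probability(p_shoes_and_socks, p_shoes)
print(f"P(Socks | Shoes) = {p_socks_given_shoes:.4f}")
```

Under that assumption, roughly 45% of the customers who buy shoes would also buy socks. If 0.25 is instead the overall probability of buying socks, the joint probability is needed before this formula can be applied.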
|What is the probability that a candidate will join the company even though he rejected the offer last time, during a giant recruitment session?|
|Number of candidates who did not join in the last recruitment|80|
|Number of candidates who joined during this recruitment|20|
|Conditional probability (candidate joining, given not joining last time)|= 20/80 = 0.25|
|If we offer a discount to 100 customers, what is the probability that exactly 25 will accept the offer, given a 17% acceptance rate in historical data?|
|Binomial distribution formula in Excel|0.0119351|
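For readers who want to check the discount example outside Excel, here is a minimal Python sketch of the binomial probability mass function. It assumes the Excel figure is the probability of exactly 25 acceptances (as `BINOM.DIST` with the cumulative flag set to FALSE would give), and the helper name `binomial_pmf` is just illustrative.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and n")
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Figures from the example: 100 customers, 25 acceptances,
# 17% acceptance rate in the historical data.
prob = binomial_pmf(25, 100, 0.17)
print(f"P(exactly 25 of 100 accept) = {prob:.7f}")
```

The printed value should come out close to the 0.0119351 shown in the table, so the chance of exactly 25 acceptances is only a little over 1%.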
Since this is my first blog on data science, I have tried to keep it simple. With time, I shall come up with more test cases and case studies on complex scenarios.
Some other key areas where predictive analysis can be applied are:
- Healthcare: Clinical trials in response to a pandemic.
- Retail: Identifying locations for opening a retail store.
- Telecom: How customer demand will evolve and how it is changing with the arrival of 5G.
- Energy and Resources: Analysis of customer demand for rooftop solar energy.
- Banking: Avoiding fraudulent customers.
I shall try to briefly explain how we, as SAP HANA consultants, can cater to this demand for data mining.
SAP HANA has a large library of built-in algorithms that are ready to use. It is also possible to import more algorithms from the public R library and to develop our own custom algorithms.
We can use these algorithms for various kinds of analysis, such as:
- Regression: Relationship between variables (y = mx + c). Example: If the temperature increases, sales of ice cream will increase.
- Time series: Analysing data based on time. Example: Sales of stylish apparel increase during Christmas.
- Associations: Relationship between activities or behaviours. Example: If I buy new shoes, I may buy a pair of socks too.
- Classification: Prediction of a target class. Example: A bank might use this to identify a fraudulent customer before accepting a credit card application.
- Cluster: Grouping of items with similar behaviour or type. Example: An intelligence department might use clustering to find similar crimes happening across areas for further investigation.
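To make the regression case (y = mx + c) a little more concrete, here is a small ordinary least squares sketch in plain Python. It is not the SAP HANA library implementation; the function name `fit_line` and the temperature and sales figures are hypothetical and chosen only to illustrate the idea.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = m*x + c; returns (m, c)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    c = mean_y - m * mean_x
    return m, c


# HYPOTHETICAL data: daily temperature (deg C) and ice-cream units sold.
temperature = [20, 24, 28, 32, 36]
units_sold = [110, 130, 150, 170, 190]

m, c = fit_line(temperature, units_sold)
forecast = m * 30 + c  # predicted units sold at 30 deg C
print(f"m = {m:.2f}, c = {c:.2f}, forecast at 30 deg C = {forecast:.1f}")
```

With this made-up, perfectly linear data the fit gives m = 5 and c = 10, so the forecast at 30 deg C is 160 units. Real sales data would be noisy, and in SAP HANA the same kind of fit would normally be done with one of the built-in algorithms rather than hand-written code.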
How do we perform data mining in SAP HANA to provide analytics and predictive analysis to the business?
I shall keep this blog simple and limit it to SAP HANA studio. SAP HANA studio is a multipurpose interface used by a variety of SAP HANA experts. It provides different screen layouts for different audiences: the screen is organized with different content and tools for each type of SAP HANA persona. We call these layouts perspectives.
In this case we shall use the Modeler perspective.
In the Modeler perspective we can access two types of content:
- Catalog objects: tables, views, synonyms, etc.
- Content objects: calculation views, procedures, decision tables, etc.
Image source: SAP AG
Application Function Library (AFL): Although AFL is not part of the HANA appliance, it can be installed by the administrator. It is a mandatory requirement for predictive data modelling.
Inside AFL we have the Predictive Analysis Library (PAL) and the Business Function Library (BFL).
The SAP HANA Application Function Modeler (AFM) in SAP HANA studio supports functions from BFL in a flowgraph model. We can add a BFL function to the flowgraph by using AFM and generate the procedure without writing any SQLScript.
Once this is done, we can create a view and consume it with CDS to provide a tile in Fiori to the user. This will enable users to run data analytics on the data of their choice in the specific field they are working in.
In my next blog I shall try to give a more detailed, step-by-step procedure on how we can implement a statistical model on our data set and provide it to the user through a Fiori tile.