SAP Predictive Analytics
SAP Predictive Analysis 1.x:
SAP’s advanced analytics solution, which lets advanced business analysts and data scientists analyze and visualize their data using powerful predictive algorithms, the open-source “R” statistical analysis language, and in-memory data mining capabilities. “SAP PA 1.x” is built on the SAP Lumira codebase, which also gives it excellent visualization and data discovery capabilities.
SAP InfiniteInsight 7.x:
SAP’s automated data preparation, predictive modeling, and scoring solution that allows business users to easily and quickly find meaning in their data without requiring the skills of a data scientist. “SAP II 7.x” is at the forefront of automated predictive analysis and includes the product set from SAP’s acquisition of KXEN in 2013.
SAP Predictive Analysis 1.x enabled users to analyze and visualize their data using pre-built algorithms from the open-source “R” library and graphically “chain” these modules together to perform complex analysis without a technically challenging and tedious manual modelling process.
Then in 2013, SAP acquired KXEN – which made a product called InfiniteInsight that enabled business users to automatically analyze their data without manual modelling or even requiring the skills of a data scientist or statistician. SAP InfiniteInsight 7.x contains its own intelligent and self-tuning algorithms that encapsulate much of the manual preparation and modelling work a data scientist would typically do so business users can focus on answering their business problems instead of deciding which algorithm to use and when.
SAP Predictive Analytics 2.0 brings these two products together into a single installable solution that contains the functionality and experiences of both. But just to make things interesting, we have also changed the product name slightly: the unified solution is now called “SAP Predictive Analytics” and not “SAP Predictive Analysis”.
You can see a screenshot of the old version. There you could load your sample Excel file straight away and start learning the tool very easily.
The first screen of the new SAP Predictive Analytics looks like the one below, and I struggled for a while to work out how to run an initial analysis on Excel data.
Data Manager has no option to upload an Excel file.
I did the analysis through AA (Automated Analytics). You can start analyzing your Excel data through Modeler.
In Data Manager you can see various database connection options, as below. Loading Excel data is not as straightforward as it was in the earlier SAP Predictive Analysis tool.
Go to AA and then Modeler, and you will see the option below to select a file.
I did some initial analysis and learning of the tool with country controllers' vendor withholding tax data, as explained and shown in the screenshots below.
You can select either a text file or a CSV file. I prepared a text file for the initial analysis. You can preview the data before continuing.
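If your data starts out in a spreadsheet, one way to get it into a form the file picker accepts is to export it as delimited text. Here is a minimal Python sketch of that idea, done outside the tool. The column names and rows are hypothetical (they are not my actual withholding tax data), and the tab delimiter is just an assumption about what your text file uses.

```python
import csv
import io

# Hypothetical columns and made-up rows, for illustration only.
HEADER = ["Country", "Vendor", "TaxCode", "WithholdingAmount"]
ROWS = [
    ["IN", "V1001", "W1", "1200.50"],
    ["DE", "V1002", "W2", "830.00"],
    ["US", "V1003", "W1", "455.25"],
]


def write_delimited(handle, header, rows, delimiter="\t"):
    """Write a header line plus data rows as delimited text."""
    writer = csv.writer(handle, delimiter=delimiter, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)


def preview(text, n=2, delimiter="\t"):
    """Return the header and the first n data rows, like a quick data preview."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    return rows[0], rows[1 : n + 1]


# Write into an in-memory buffer so the sketch runs without touching the disk.
buffer = io.StringIO()
write_delimited(buffer, HEADER, ROWS)
header, first_rows = preview(buffer.getvalue())
print(header)
print(first_rows)
```

To produce a real file, open it with `open(path, "w", newline="")` and pass that handle in place of `buffer`.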
The first lesson, for beginners like me, is to select fewer variables for the analysis.
Although I have shown many variables below, I would suggest starting with only 4-5 variables.
You have the option to add filters on the existing dataset.
You can train the model with a small set of data.
Once the trained model is ready, you can apply it to bigger datasets, where you can also test its accuracy.
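The train-small, apply-big idea can be sketched outside the tool in a few lines of Python. This is only an illustration under my own assumptions: the data is synthetic, the model is a plain one-variable least-squares line (not the algorithm Automated Analytics uses internally), and mean absolute error stands in for whatever accuracy indicator you prefer.

```python
import random


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x for a single input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def mean_abs_error(model, xs, ys):
    """Average absolute gap between predictions and known values."""
    a, b = model
    return sum(abs((a + b * x) - y) for x, y in zip(xs, ys)) / len(xs)


# Synthetic data: y = 1 + 2x plus a small deterministic wobble.
xs = [float(i) for i in range(100)]
ys = [1.0 + 2.0 * x + ((i * 7) % 5 - 2) * 0.1 for i, x in enumerate(xs)]

# Shuffle row indices with a fixed seed, then train on only 20 rows.
indices = list(range(len(xs)))
random.Random(0).shuffle(indices)
train_idx, apply_idx = indices[:20], indices[20:]

model = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])

# Apply the trained model to the remaining 80 rows and measure the error.
error = mean_abs_error(model, [xs[i] for i in apply_idx], [ys[i] for i in apply_idx])
print(model, error)
```

Keeping the apply rows out of training is what makes the measured error a fair estimate of how the model behaves on data it has not seen.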
In the Display option you can check variable contributions, statistical reports, and so on.
Once the model looks as expected in Display, it is ready to run, as shown below.
To apply the model, you need to select the new set of data to which the trained model will be applied.
You have options to check the data, statistics, and graph, as shown below.
In the example below we use a dataset containing information used to estimate undergraduate enrollment at the University of New Mexico. To predict undergraduate enrollment we will use three predictors:
- Unemployment rate (UNEM)
- Number of high school graduates (HGRAD)
- Per Capita Income (INC)
Variables: the target variable, weight variable, excluded variables, and explanatory variables (the input variables) for the current model are shown below.
Target variable: ROLL
Input variables: UNEM, HGRAD, INC
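To make the target/input setup concrete, here is a small Python sketch that fits ROLL from UNEM, HGRAD, and INC with ordinary least squares. Please read it as an illustration only: Automated Analytics uses its own self-tuning algorithms, not this code, and the rows and the "true" coefficients here are synthetic values I made up, not the real University of New Mexico data.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b]; rows are copied so the inputs stay untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, targets):
    """Ordinary least squares with an intercept, via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs, row):
    """Intercept plus the weighted sum of the inputs."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


# Synthetic (UNEM, HGRAD, INC) rows; NOT the real enrollment dataset.
rows = [
    (5.0 + ((i * 3) % 7) * 0.7, 9000.0 + i * 600.0, 1900.0 + ((i * 5) % 11) * 120.0)
    for i in range(12)
]
# Made-up relationship, used only to generate ROLL for this sketch.
TRUE_COEFS = [-9000.0, 450.0, 0.4, 4.0]
roll = [predict(TRUE_COEFS, r) for r in rows]

coefs = fit_ols(rows, roll)
print([round(c, 3) for c in coefs])
```

Because ROLL is generated without noise, the fit should recover the made-up coefficients almost exactly, which is a handy sanity check before trying real data.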
You can see the data description as follows.
You can analyze the model's deviations as shown below.
You also have the option to simulate the model, as shown below.
You can see the percentage contribution of each input variable to the current analytic model, as shown below.
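I do not know exactly how the tool computes its contribution figures, so the following Python sketch is only one common way to get a comparable picture: weight each input by the absolute value of its coefficient times its standard deviation, then normalize to percentages. The coefficients and columns are hypothetical values for illustration.

```python
from statistics import pstdev


def contribution_shares(coefs, columns):
    """Percentage share of each input, from |coefficient * standard deviation|."""
    weights = {
        name: abs(coefs[name] * pstdev(values)) for name, values in columns.items()
    }
    total = sum(weights.values())
    return {name: 100.0 * w / total for name, w in weights.items()}


# Hypothetical coefficients and made-up input columns, for illustration only.
coefs = {"UNEM": 450.0, "HGRAD": 0.4, "INC": 4.0}
columns = {
    "UNEM": [5.0, 7.0, 9.0],
    "HGRAD": [9000.0, 12000.0, 15000.0],
    "INC": [1900.0, 2500.0, 3100.0],
}

shares = contribution_shares(coefs, columns)
for name, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {pct:.1f}%")
```

Scaling by the standard deviation matters because the raw coefficients are in different units: a small coefficient on a variable with a wide range can still move the prediction more than a large coefficient on a narrow one.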
You can see the trained model's performance as shown below.
Once you apply the model to the main data, you get the predicted values of the target variable, as shown below.
You can simulate the model by running it for various values of the input variables, and check the consistency of the model as well.
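The same what-if idea can be sketched in Python: hold the other inputs at a baseline, sweep one input over several values, and check that the predictions move in the direction you expect. The linear model and its coefficients below are hypothetical stand-ins, not the model the tool produced.

```python
def predict_roll(unem, hgrad, inc):
    """Hypothetical linear model; the coefficients are made up for this sketch."""
    return -9000.0 + 450.0 * unem + 0.4 * hgrad + 4.0 * inc


def simulate(model, base, name, values):
    """Vary one input over `values` while holding the others at `base`."""
    results = []
    for value in values:
        inputs = dict(base, **{name: value})
        results.append((value, model(**inputs)))
    return results


# Baseline scenario (made-up values) and a sweep over HGRAD.
base = {"unem": 7.0, "hgrad": 12000.0, "inc": 2500.0}
sweep = simulate(predict_roll, base, "hgrad", [10000.0, 12000.0, 14000.0, 16000.0])
for value, roll in sweep:
    print(f"HGRAD={value:.0f} -> predicted ROLL={roll:.0f}")

# Consistency check: with a positive HGRAD coefficient, predictions should rise.
predictions = [roll for _, roll in sweep]
is_monotonic = all(a < b for a, b in zip(predictions, predictions[1:]))
print("consistent:", is_monotonic)
```

If a sweep like this produced predictions that jump around or move against common sense, that would be a signal to revisit the variables chosen for the model.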
What’s new in SAP PA 2.2
Instead of just listing the new features/functions, let’s take a look at how SAP PA 2.2 moves us forward in pursuing some of our core goals (note some features address multiple goals, but I’m keeping it simple here):
AA = Automated Analytics
EA = Expert Analytics
HANA = Native on SAP HANA
A better, smoother experience for data scientists AND business users:
- (AA) Support for very wide datasets (up to 15K columns): Automatically handle very wide datasets to improve both the efficiency and effectiveness of your predictive models.
- (EA) Ability to share custom R and PAL components: Enable other users to use your algorithms with ease.
Making data scientists more agile and efficient:
- (EA) New Model Performance Comparison: Compare the performance of two or more algorithms and get a recommendation and detailed explanation for which one is the best to use.
- (EA) New Model Statistics: Calculate performance statistics on datasets generated by classification and regression algorithms.
- (EA) Support for R 3.1.2: Makes it possible to use the latest libraries.
- (EA) Support for multiple charts: Use more than one chart in your offline custom R components
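To give a feel for what comparing model performance involves, here is a rough Python sketch that scores two sets of predictions against known values and picks the one with the lower error. It is not the Expert Analytics component itself: the statistics chosen here (RMSE and R²), the model names, and the numbers are my own assumptions for illustration.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error; lower is better."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r_squared(actual, predicted):
    """Share of the variance in `actual` explained by the predictions."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def compare_models(actual, predictions_by_model):
    """Compute statistics per model and recommend the one with the lowest RMSE."""
    stats = {
        name: {"rmse": rmse(actual, preds), "r2": r_squared(actual, preds)}
        for name, preds in predictions_by_model.items()
    }
    best = min(stats, key=lambda name: stats[name]["rmse"])
    return best, stats


# Made-up actual values and predictions from two hypothetical models.
actual = [10.0, 12.0, 14.0, 16.0]
predictions_by_model = {
    "model_a": [10.5, 11.5, 14.5, 15.5],
    "model_b": [9.0, 13.0, 13.0, 17.0],
}

best, stats = compare_models(actual, predictions_by_model)
print("recommended:", best)
for name, s in stats.items():
    print(f"{name}: RMSE={s['rmse']:.3f}, R2={s['r2']:.3f}")
```

Ranking by a single statistic keeps the sketch short; for classification models you would swap in measures such as accuracy or AUC instead.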
Enabling customers to better leverage their existing data and investments:
- (AA) Support for SAP HANA Views: Connect directly to SAP HANA Analytic and Calculation Views.
- (AA) Support for SAP BW on HANA: Use BW on HANA systems as a data source.
- (EA) Improved BW acquisition: Easier and faster variable selection and handling of hierarchies.
- (HANA) Updated Automated Predictive Library (APL): Now includes automated recommendation.