This component extends the capabilities of SAP Predictive Analysis and calculates certain measures of location:

– Mean

– Confidence Interval of the Mean

– Standard Deviation

– Minimum

– Maximum

– Total Record Count

– Record Count of non-null Values

d2.JPG

Prerequisites

The dataset must contain at least one numerical column, which will be described with the measures of location.

Usage

These parameters can be set by the user.

Parameter | Description
Group by Column | Name of the categorical column for level-based statistics. The measures of location are calculated for each subgroup. If the column “Country” is selected, for instance, individual statistics are calculated for each country, e.g. Switzerland, China and Brazil.
Measure(s) to Describe | Names of one or more numerical columns that are to be described, e.g. Revenue, Duration, etc.
Calculate Overall Statistics | Controls whether the overall statistics of the unfiltered dataset are also calculated.
Confidence Level for Mean Interval | Confidence level used to calculate the lower and upper limits of the confidence interval of the mean.
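
For orientation, a confidence interval of the mean is commonly built from the mean, the standard deviation and the number of non-null values. This post does not state which formula the component uses, so the Student's t form below is an assumption, and the symbols are introduced here only for illustration:

```latex
\bar{x} \;\pm\; t_{1-\alpha/2,\;n-1}\,\frac{s}{\sqrt{n}}
```

Here \bar{x} is the mean, s the sample standard deviation, n the count of non-null values, and 1 - \alpha the chosen confidence level (for example 0.95). The minus sign gives the lower limit and the plus sign the upper limit. For large n the t quantile is close to the corresponding normal quantile.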

Output Columns

Column | Description
GroupByColumn | Level of the group-by column that the row describes, e.g. Switzerland, China or Brazil. The statistics of the unfiltered dataset (if selected) are labelled “OVERALL”.
Measure | Name of the measure that the row describes.
Mean | Mean.
SD | Standard deviation.
Min | Minimum.
Max | Maximum.
Count | Row count.
CountNotNA | Row count of non-null values.
MeanConfidenceLevel | Confidence level for the mean interval as entered by the user.
MeanCILL | Lower limit of the mean’s confidence interval.
MeanCIUL | Upper limit of the mean’s confidence interval.
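
To make the output layout concrete, here is a minimal Python sketch that produces rows with the same column names. It is not the component's own code: the function names `describe` and `measures_of_location` are hypothetical, the sample records are made up for illustration, and the interval uses a normal approximation (an assumption chosen to stay within the standard library; a Student's t quantile is the more usual choice for small groups).

```python
import math
from statistics import NormalDist, mean, stdev


def describe(values, confidence_level=0.95):
    """Describe one measure; None stands for a null value."""
    present = [v for v in values if v is not None]
    n = len(present)
    row = {
        "Mean": None, "SD": None, "Min": None, "Max": None,
        "Count": len(values), "CountNotNA": n,
        "MeanConfidenceLevel": confidence_level,
        "MeanCILL": None, "MeanCIUL": None,
    }
    if n == 0:
        return row
    row["Mean"] = mean(present)
    row["Min"] = min(present)
    row["Max"] = max(present)
    if n > 1:
        row["SD"] = stdev(present)  # sample standard deviation (divides by n - 1)
        # Assumption: normal-approximation interval, not necessarily
        # the formula used by the component.
        z = NormalDist().inv_cdf(0.5 + confidence_level / 2)
        half_width = z * row["SD"] / math.sqrt(n)
        row["MeanCILL"] = row["Mean"] - half_width
        row["MeanCIUL"] = row["Mean"] + half_width
    return row


def measures_of_location(records, group_by, measures,
                         overall=True, confidence_level=0.95):
    """Return one row per (group level, measure), plus optional OVERALL rows."""
    groups = {}
    for rec in records:
        groups.setdefault(rec[group_by], []).append(rec)
    if overall:
        groups["OVERALL"] = list(records)  # unfiltered dataset
    rows = []
    for level, recs in groups.items():
        for m in measures:
            stats = describe([r.get(m) for r in recs], confidence_level)
            rows.append({"GroupByColumn": level, "Measure": m, **stats})
    return rows


# Illustrative records only; the values are invented.
records = [
    {"Country": "Switzerland", "Revenue": 10.0},
    {"Country": "Switzerland", "Revenue": 14.0},
    {"Country": "China", "Revenue": 20.0},
    {"Country": "China", "Revenue": None},
]
for row in measures_of_location(records, "Country", ["Revenue"]):
    print(row["GroupByColumn"], row["Measure"], row["Count"],
          row["CountNotNA"], row["Mean"])
```

Note that Count includes null values while CountNotNA does not, and that in this sketch a group with fewer than two non-null values gets no SD or confidence limits.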

How to Implement

The component can be downloaded as a .spar file from GitHub and deployed as described here. To deploy it, import the file through the option “Import/Model Component”, which you will find by clicking the plus sign at the bottom of the list of available algorithms.

Example

Let’s try this component on the responses of a customer survey carried out by the airport of San Francisco. Download the dataset from the year 2011, load the data with SAP Predictive Analysis and add the “Measures of Location” component to the dataflow.

02.JPG

Configure the component to

– analyse the satisfaction with SFO Airport as a whole (column “Q8N”).

– calculate the measures of location by the respondent’s country of residence (column “Q18COUNTRY”).

– also calculate the overall measures of location for the unfiltered dataset.

03.JPG

Run the component and see the output.

04.JPG

You can further process the results, for instance by graphically analysing them with SAP Predictive Analysis.

05.JPG

Disclaimer

Please note that this component is provided as-is without any guarantee or support.
