
Data Mining – An Introduction

How often have we wondered how companies understand and analyze the requirements of the customers or clients they serve? Gone are the days when data was searched manually to spot an emerging pattern or a trend shift towards a particular product, service or business offering. Technology today has grown by leaps and bounds, providing complex tools and techniques that sift through mammoth amounts of data to unearth meaningful information, which is then converted into usable, executable and profitable knowledge.

Ever wondered how these tools work? What are the backend processes that drive these data search engines? I have tried to explain that in this weblog.

Data Mining refers to extracting or “mining” knowledge from large amounts of data. It is a process in which intelligent methods are applied in order to extract data patterns. These patterns are sometimes so subtle that they go unnoticed, yet they offer tremendous information to companies about user needs and choices. The methods available for Data Mining use some complex algorithms, which are a hot topic among researchers. Our goal is not to discuss those algorithms, but to sketch the methods that SAP delivers in SAP BI to support the Data Mining process.

SAP has delivered Data Mining methods and Business Content to help organizations identify potentially significant patterns, associations, and trends that would otherwise be too time-consuming to uncover through conventional human analysis, or that would be missed because analysts tend to see the information they expect to discover and may overlook valid information that lies outside their expectation zone. Many a business venture has failed in the past because of precisely this component of the task going wrong.

The following methods are provided by SAP for using the Data Mining functionality:

  1. Classification
    • Decision Tree
    • ABC Classification
    • Scoring
      • Weighted Score tables
      • Regression Analysis
        • Linear Regression
        • Non-linear Regression
  2. Clustering
  3. Association Analysis


Classification is the process of finding a set of models that describe and distinguish data classes or concepts, so that the model can be used to predict the class of objects whose class label is unknown.
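To make the idea concrete, here is a minimal sketch of classification in Python. It trains a one-level decision tree (a "decision stump") on a single characteristic. The data, the labels and the function names are made up for illustration, and this is not how the SAP BI Decision Tree method is implemented.

```python
from collections import Counter

# Toy, made-up training records: (orders per month, known class label).
training = [(1, "low"), (2, "low"), (3, "low"), (7, "high"), (8, "high"), (9, "high")]

def majority(labels):
    """Return the most common label in a list."""
    return Counter(labels).most_common(1)[0][0]

def train_stump(records):
    """Pick the threshold on the single feature that misclassifies the fewest records."""
    values = sorted({x for x, _ in records})
    best = None
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2  # candidate split halfway between neighbours
        left = [label for x, label in records if x <= threshold]
        right = [label for x, label in records if x > threshold]
        left_label, right_label = majority(left), majority(right)
        errors = sum(l != left_label for l in left) + sum(l != right_label for l in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return threshold, left_label, right_label

def predict(model, x):
    """Predict the class of a record whose label is unknown."""
    threshold, left_label, right_label = model
    return left_label if x <= threshold else right_label

model = train_stump(training)
print(model)                                   # (5.0, 'low', 'high')
print(predict(model, 2.5), predict(model, 10)) # low high
```

A real decision tree repeats this kind of split on several characteristics and several levels; the stump only shows the principle of learning a rule from labelled records and applying it to unlabelled ones.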

Clustering groups data into segments based on associations between different characteristics in the data. It divides a set of data so that records with similar content are in the same group while records with dissimilar content are in different groups. The fundamental idea is to identify similarities in properties or content and to separate out the dissimilar records.
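As a rough illustration of grouping similar records, the sketch below runs k-means, one common clustering technique, on a single numeric characteristic. The choice of k-means, the data and the starting centers are assumptions made for this example; the text above does not say which algorithm SAP BI uses internally.

```python
def kmeans_1d(values, centers, iterations=10):
    """Plain k-means on one numeric characteristic, starting from the given centers."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: each value joins the cluster with the nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        new_centers = [sum(c) / len(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:
            break  # nothing moved, so the grouping is stable
        centers = new_centers
    return centers, clusters

# Toy, made-up yearly spend per customer.
spend = [100, 120, 110, 900, 950, 1000]
centers, clusters = kmeans_1d(spend, centers=[100, 1000])
print(centers)   # [110.0, 950.0]
print(clusters)  # [[100, 120, 110], [900, 950, 1000]]
```

Unlike classification, no class labels are supplied here: the two segments emerge purely from how close the records are to each other.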

Association Analysis is used to develop rules for cross-selling opportunities. In my weblog Analyzing Customers buying trends using Analysis Process Designer, I covered a good example of Association Analysis.
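A cross-selling rule such as "customers who buy jam also buy bread" is usually judged by its support (how often the items occur together) and its confidence (how often the rule holds when its left side occurs). The sketch below computes both for made-up shopping baskets; the data, thresholds and function names are illustrative assumptions, not SAP BI settings.

```python
from itertools import permutations

# Toy, made-up shopping baskets.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread", "jam"},
    {"butter", "milk"},
]

def support(itemset):
    """Fraction of baskets that contain every item in itemset."""
    return sum(itemset <= b for b in baskets) / len(baskets)

def confidence(lhs, rhs):
    """Of the baskets containing lhs, the fraction that also contain rhs."""
    return support(lhs | rhs) / support(lhs)

def rules(min_support, min_confidence):
    """List single-item rules 'a -> b' that pass both thresholds."""
    items = sorted(set().union(*baskets))
    found = []
    for a, b in permutations(items, 2):
        s = support({a, b})
        c = confidence({a}, {b})
        if s >= min_support and c >= min_confidence:
            found.append((a, b, s, c))
    return found

print(rules(min_support=0.5, min_confidence=0.9))
# [('jam', 'bread', 0.5, 1.0)]
```

Here "jam -> bread" qualifies because every basket with jam also contains bread, while "bread -> jam" does not, since only two of the three bread baskets contain jam.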

The following steps are involved in the Data Mining methods:

  1. A Data Mining Model is created.
  2. The Model is trained.
  3. The Model is evaluated.
  4. Predictions are made.
  5. The results are stored or forwarded.

Creating a Data Mining Model requires some configuration settings and parameters, which depend on the method being used.

Training of the model is done on a small set of data. This process is generally used where prediction of data is required. After training the model, we can also evaluate it against a small set of historical records.

After training and evaluation of the model, prediction should be made on yet another source of historic data, kept separate from what was used in training and evaluation. Predicted values may be uploaded to SAP BW or forwarded to other systems (for example, cross-selling association rules to operational CRM).
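The five steps listed earlier can be sketched end to end with a simple linear regression, keeping the training, evaluation and prediction data apart as described. Everything in this sketch (the settings dictionary, the data and the function names) is hypothetical and only mirrors the sequence of steps; it is not the SAP BI interface.

```python
def train(records):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(records)
    mean_x = sum(x for x, _ in records) / n
    mean_y = sum(y for _, y in records) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in records)
    sxx = sum((x - mean_x) ** 2 for x, _ in records)
    slope = sxy / sxx
    return {"slope": slope, "intercept": mean_y - slope * mean_x}

def predict(model, x):
    """Apply the fitted line to a new input value."""
    return model["intercept"] + model["slope"] * x

def evaluate(model, records):
    """Mean absolute error on records that were not used for training."""
    return sum(abs(predict(model, x) - y) for x, y in records) / len(records)

# 1. Create the model: the method and its parameters (hypothetical settings).
settings = {"method": "linear_regression"}

# Three separate, made-up data sets, one per stage.
training_data = [(1, 3), (2, 5), (3, 7), (4, 9)]
evaluation_data = [(5, 11), (6, 13)]
prediction_input = [7, 8]

# 2. Train the model.
model = train(training_data)
# 3. Evaluate it on records it has not seen.
error = evaluate(model, evaluation_data)
# 4. Make predictions on yet another source.
predictions = {x: predict(model, x) for x in prediction_input}
# 5. Store or forward the results (here: just collect rows for upload).
stored = [{"input": x, "predicted": y} for x, y in predictions.items()]

print(model, error)   # {'slope': 2.0, 'intercept': 1.0} 0.0
print(stored)         # [{'input': 7, 'predicted': 15.0}, {'input': 8, 'predicted': 17.0}]
```

Keeping the three data sets apart is the point of the exercise: an error measured on the training records themselves would say little about how the model behaves on new data.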

In this weblog I gave you an overview of the Data Mining methods supported by the SAP NetWeaver Data Mining tool. Each method suits different market scenarios, such as customer segmentation analysis, campaign promotion effectiveness analysis, vendor performance analysis or budgetary analysis. My subsequent weblogs will cover all these methods and the market situations and scenarios in which they fit.
