Dean Farrow

Back to Basics with Predictive Modelling – Getting the data right.

Many organisations today want predictive analytics and predictive models that extract more insight and value from their data.  Using predictive modelling tools to predict outcomes, such as which customers to target, which customers are more likely to buy other products and how likely customers are to leave, is more achievable than ever before.  However, when we ask a predictive model to answer a business question, we must not forget the importance of getting the data right.


Clearly, if you put rubbish data into a predictive model, it doesn’t take a genius to work out that a rubbish model will come out.  Predictive models are only as good as the data going into them, so making sure that your source data is properly managed and organised is key.


Extraction, transformation and loading (ETL) tools have been on the market for many years, yet many organisations still use out-of-date tools simply to “lift and shift” data from one environment into another.  The ETL process is a key stage at which problems in your data can be identified and rectified before they end up in the models.  Ideally, problems should be resolved in the source systems, but this is not always possible.

Here are some key areas where data typically needs help:

  • Removing duplicates
  • Integrity checks
  • Name and address checking
  • Text analysis to pull out sentiment in the data
  • Putting in place rules and analytics to check the quality of the data
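
As a rough illustration of three of the checks listed above (removing duplicates, an integrity check and rule-based quality checks), here is a minimal Python sketch. The sample records, field names and helper functions (deduplicate, orphaned, rule_failures) are hypothetical and are not part of any SAP tool; in practice this work would usually be done inside a dedicated ETL or data quality tool.

```python
# Hypothetical sample data; field names are illustrative only.
customers = [
    {"id": 1, "name": "Ann Lee", "email": "ann@example.com"},
    {"id": 2, "name": "ann lee ", "email": "ANN@example.com"},  # same person, messy entry
    {"id": 3, "name": "Bob Ray", "email": ""},
]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 7},  # refers to a customer that does not exist
]


def deduplicate(rows, key_fields):
    """Keep the first row for each normalised key (trimmed, lower-cased)."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(str(row[f]).strip().lower() for f in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def orphaned(child_rows, fk, parent_rows, pk):
    """Integrity check: return child rows whose foreign key has no parent row."""
    parent_keys = {p[pk] for p in parent_rows}
    return [c for c in child_rows if c[fk] not in parent_keys]


# Quality rules held as data, so changing a rule means editing one entry.
RULES = {
    "email present": lambda r: bool(r["email"].strip()),
    "name present": lambda r: bool(r["name"].strip()),
}


def rule_failures(rows, rules):
    """Return a mapping of rule name to the ids of rows that fail it."""
    return {
        name: [r["id"] for r in rows if not check(r)]
        for name, check in rules.items()
    }


clean = deduplicate(customers, ["name", "email"])
bad_orders = orphaned(orders, "customer_id", clean, "id")
failures = rule_failures(clean, RULES)

print([r["id"] for r in clean])
print(bad_orders)
print(failures)
```

With this sample data, the second customer record is dropped as a duplicate of the first, the order for customer 7 is flagged as orphaned, and customer 3 fails the "email present" rule.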


Hand-coding these checks in SQL is rarely a scalable or sustainable option: imagine having to trawl through code just to change one business rule.


SAP have a complete range of tools that provide functionality to profile data, add business rules and move data without using code.  Data stewards can create repeatable jobs that are quick and easy to maintain, all through graphical interfaces.  For more information about these tools click here.
