
This article briefly describes the three cornerstones of high Data Quality in CRM: Address Validation, Duplicate Check and Data Cleansing. It also describes options for automating these tasks.

In the past I have already posted some content in this community about merging redundant Account Records in different scenarios (the related links are listed at the end of this blog).

Taking care of Data Quality in CRM is a very important topic, and I would like to add some of my thoughts here again …

A few days ago I watched the famous movie “Matrix Revolutions”. I guess most of you have seen it already. In “Matrix Reloaded” Agent Smith started to create duplicate records of himself, and in the next episode (“Matrix Revolutions”) this was a threat not only to our heroes Neo, Morpheus and Trinity, but to the whole system.

In CRM the situation is very similar. If data quality is not ensured, it is not only individual users who face serious problems; in the end the whole system will no longer work efficiently.

The Sales Rep is not able to identify the right Accounts to focus on. A Marketing Campaign becomes very expensive and fails to reach the right potential customers. Analytics work with wrong data, which leads to wrong conclusions and wrong decisions.

For sure it is easier to get rid of duplicate Account Records in CRM than to get rid of Agent Smith in “Matrix Revolutions”. However, it is even easier to avoid duplicates in the first place, and I should no longer talk about science fiction …

This whole topic is especially important in CRM, as this is the main area where new Account Records get entered even if only parts of the address and communication data are known. Data Quality in CRM is also critical because it is very often a neglected topic. If a server crashes or some functionality doesn’t work, a company immediately takes action. Bad Data Quality is rather a creeping disease.

The first important aspect is that every user is aware of his responsibility for good data. He is not the only one working with the data, and it is not sufficient if he just maintains a phone number because everything else is unimportant for his current tasks.

In this case it is very helpful to have a clear assignment of responsibilities. However, awareness alone will never be sufficient, and simply forcing users to enter accurate data usually fails.

Every customer therefore has to make sure that the CRM solution also includes functions that support the user in entering correct data, prevent the entry of duplicate records and remove redundant records.

Online Address Validation

A first important option to support the user is to integrate a powerful Address Validation, which automatically validates the address against reference data sources and formats the address according to the norms of the applicable country.

So if the user enters an address, the system automatically suggests a corrected and completed address that the user can select. Depending on the logic of the Address Validation solution, other address-related information such as the Tax Jurisdiction Code or geocoding data can be added. In addition, some information can be derived from the combination of address and name to update fields automatically. For example, the first name “Andrea” could indicate the gender of the person, but this depends on the country. In Germany “Andrea” is most likely a female name, but in Italy it is rather a masculine name.

In Korea “Kim” is typically a last name, but in other countries it is more often a first name.
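The country dependency described above can be illustrated with a minimal Python sketch. It is not how any particular Address Validation product works; the lookup table, its two entries and the function name `infer_gender` are assumptions made purely for illustration.

```python
# Hypothetical lookup table: (ISO country code, normalized first name) -> gender.
# A real solution would rely on much larger, curated reference data.
GENDER_BY_COUNTRY_AND_NAME = {
    ("DE", "andrea"): "female",
    ("IT", "andrea"): "male",
}


def infer_gender(first_name, country):
    """Return a suggested gender, or None if the name is unknown for that country."""
    key = (country.upper(), first_name.casefold().strip())
    return GENDER_BY_COUNTRY_AND_NAME.get(key)


print(infer_gender("Andrea", "DE"))  # female
print(infer_gender("Andrea", "IT"))  # male
print(infer_gender("Andrea", "US"))  # None: leave the field for the user to maintain
```

Returning `None` for unknown combinations keeps the derivation a suggestion only, so the field is never filled with a guess.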

There are several solutions available on the market that can be integrated for this purpose. In principle all these solutions are based on the Business Address Services interface (BC-BAS-PV, Business Address Service Postal Validation).
Several 3rd-party solutions are certified for this interface.
You can find them in the Partner Information Center if you search for solutions based on BC-BAS-PV (Business Address Service Postal Validation): http://www.sap.com/partners/directories/SearchSolution.epx

Here you can also integrate “BusinessObjects™ Data Quality Management for SAP® Solutions”, which works for SAP ERP and SAP CRM.

You can find further information about “BusinessObjects™ Data Quality Management for SAP® Solutions” here:

http://www.sap.com/solutions/sapbusinessobjects/large/information-management/data-quality-management/dqm/dataqualityforsap/index.epx

Online Duplicate Check

Based on complete and accurate address data, the next level of ensuring good data quality is the Online Duplicate Check. It runs whenever a new Account is created or relevant fields are edited. Whenever there are similar Account Records (above a defined threshold of similarity), the user can check whether the current record already exists in the system and decide whether to continue with an existing record, still create a new record, or merge duplicate records.
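To make the idea of a similarity threshold more concrete, here is a minimal Python sketch. It is not the algorithm used by TREX or by any 3rd-party product; it simply compares name and city with the standard library's `difflib.SequenceMatcher`. The field names, the sample records and the threshold of 0.85 are assumptions.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a similarity ratio between 0 and 1 for two strings, ignoring case."""
    return SequenceMatcher(None, a.casefold().strip(), b.casefold().strip()).ratio()


def find_potential_duplicates(new_account, existing_accounts, threshold=0.85):
    """Return (account, score) pairs at or above the threshold, best match first."""
    new_key = f"{new_account['name']} {new_account['city']}"
    hits = []
    for account in existing_accounts:
        key = f"{account['name']} {account['city']}"
        score = similarity(new_key, key)
        if score >= threshold:
            hits.append((account, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical sample data
existing = [
    {"id": 1, "name": "Acme GmbH", "city": "Walldorf"},
    {"id": 2, "name": "Zenith Ltd", "city": "London"},
]
new = {"name": "ACME GmbH", "city": "Walldorf"}

for account, score in find_potential_duplicates(new, existing):
    print(account["id"], round(score, 2))
```

The list of hits corresponds to what the user would see in the duplicate popup before deciding whether to continue with an existing record, create a new one or merge.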

In CRM you can use the lean TREX-and-BAS-based duplicate prevention functionality (BC-EIM-IQM-IC), which is part of SAP NetWeaver. However, it is also possible to use more sophisticated 3rd-party solutions. Here again you can find some 3rd-party solutions in the “Partner Information Center” if you search for solutions based on BC-BAS-DES (Business Address Services – Duplicate Check): http://www.sap.com/partners/directories/SearchSolution.epx

Sophisticated Duplicate Check capabilities are also integrated in “BusinessObjects™ Data Quality Management for SAP® Solutions”
(http://www.sap.com/solutions/sapbusinessobjects/large/information-management/data-quality-management/dqm/dataqualityforsap/index.epx)

A lean duplicate check only considers the address fields and the name of the Account. A more intelligent Duplicate Check also applies some fuzziness and logic to identify potential duplicates. For example, “Beth Meyer” and “Elizabeth Mayer” could be duplicate records for the same person, and “General Electric” and “GE” could be duplicate organizations.

In addition other attributes not belonging to the Address are also relevant to identify duplicates (e. g. Birth date, Identification-Numbers, e-Mail, Phone-Number etc.).
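The following Python sketch shows one way such fuzzy logic could look. It is an illustration under stated assumptions, not the logic of any SAP or partner product: the nickname table, the field names, the equal weighting of first and last name and the threshold of 0.85 are all hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical nickname table; real products ship far larger dictionaries.
NICKNAMES = {"beth": "elizabeth", "liz": "elizabeth", "bill": "william"}


def canonical_first_name(name):
    """Map a nickname to its canonical form; unknown names are only normalized."""
    key = name.casefold().strip()
    return NICKNAMES.get(key, key)


def name_score(a, b):
    """Score 0..1: half for an equal canonical first name, half for last-name similarity."""
    same_first = canonical_first_name(a["first_name"]) == canonical_first_name(b["first_name"])
    last = SequenceMatcher(None, a["last_name"].casefold(), b["last_name"].casefold()).ratio()
    return (0.5 if same_first else 0.0) + 0.5 * last


def is_potential_duplicate(a, b, threshold=0.85):
    """Treat an identical e-mail or phone number as a strong hint, else use the name score."""
    for field in ("email", "phone"):
        if a.get(field) and b.get(field) and a[field].casefold() == b[field].casefold():
            return True
    return name_score(a, b) >= threshold


beth = {"first_name": "Beth", "last_name": "Meyer"}
elizabeth = {"first_name": "Elizabeth", "last_name": "Mayer"}
print(is_potential_duplicate(beth, elizabeth))  # True
```

An organization alias such as “GE” for “General Electric” could be handled in the same way with a second lookup table.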

Data Cleansing

Usually the Duplicate Check can prevent the creation of many duplicate Account Records … but not all. There are always some duplicates in the system, and their number usually grows. Therefore it is necessary to create Cleansing Cases in order to merge redundant Account Records. This functionality is available in CRM and is in principle independent of the Duplicate Check as such. The option to create Cleansing Cases is available in all Account search result lists and in addition (if a Duplicate Check has been activated) in the Duplicate Check popup.

The creation and the processing of Cleansing Cases are decoupled. This means the user can create a Cleansing Case and merge the Accounts directly, or he or another user can merge the Accounts later.

At the end there should be one remaining consolidated and enriched Account Record, and all non-master Accounts are flagged for archiving.
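The outcome of such a merge can be sketched in a few lines of Python. This is only a simplified illustration, not the CRM Cleansing Case implementation: the record layout, the `archive_flag` field and the rule "the master value wins, empty master fields are filled from the other records" are assumptions.

```python
def merge_accounts(master, non_masters):
    """Return the enriched master record and copies of the non-masters flagged for archiving."""
    merged = dict(master)
    flagged = []
    for account in non_masters:
        for field, value in account.items():
            if field == "id":
                continue  # the master keeps its own key
            if value and not merged.get(field):
                merged[field] = value  # enrich only where the master has no value
        flagged.append({**account, "archive_flag": True})
    return merged, flagged


# Hypothetical sample records
master = {"id": 1, "name": "Acme GmbH", "phone": "", "city": "Walldorf"}
duplicate = {"id": 2, "name": "ACME", "phone": "+49 6227 000000",
             "city": "Walldorf", "email": "info@acme.example"}

merged, flagged = merge_accounts(master, [duplicate])
print(merged)
print(flagged[0]["archive_flag"])
```

In a real Cleansing Case the user decides field by field which value survives; the fixed rule here only stands in for that decision.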

Data Quality Administration

For CRM users (especially Sales Professionals), time has never been more precious than today. Therefore there also has to be an option to automate and outsource tasks around Data Quality. The CRM Data Quality Administration Framework is one solution for this, as it offers options to automatically de-duplicate Target Groups containing thousands of Accounts. It also offers the option to export data files and to re-import validated and enriched data from an external agency.


Additional Information

Overall:

… related to “BusinessObjects™

…  related to Data Cleansing:

… related to Data Quality Management:


13 Comments


  1. Chetan Bhatnager
    Thanks for the nice blog. It is good to know that SAP has finally integrated DQ with SAP CRM 7.0. We are on CRM 5.0, and integrating DQXI with CRM 5.0 was a great struggle and very complex: we had to do a lot of custom development in the ELM list import process to check for duplicates with all possible combinations of organization as well as person data.
    Thanks. Hopefully with the upgrade to CRM 7.0 we can leverage this and remove the complex custom code.
    1. Arno Meyer Post author
      Hello Chetan!
      Thank you for your comment!
      Merging Accounts and the Duplicate Check are nothing new, but in the CRM Web UI we have improved the logic and the UI.
      For sure the “Data Quality Administration Framework” offers additional new options to automate DQ tasks and also to run them as regular jobs.
      Please note that “BusinessObjects™ Data Quality Management for SAP® Solutions” has also been tested within the CRM Web UI (other 3rd-party solutions as well), but of course it has its own pricing.

      You have also mentioned ELM. ELM is a special scenario. Besides what I’ve mentioned in this blog, I know that a lot of effort has also gone into improving the upload, including the Duplicate Check, for External List Management. However, I don’t know what is already available in CRM 7.0 and what will be available in EhP1.
      Best regards
      Arno

  2. Gabor Somogyvari
    Hello Arno!

    I have set up the nodes in BUSWU01 and BUSWU02 as you and the relevant OSS Notes described.
    But I have got two missing objects.
    1. Business roles – BUP200. I have maintained the class name CL_BUPA_CLEAR_ROLES and marked it as relevant for data cleansing. Is that OK?
    2. Which node is necessary for the Sales area data?

    Thanks in advance,
    Gábor Somogyvári

  3. Harriet Hoogstede Loef
    Dear Arno,

    We have been working on data quality for some time. One of the steps is implementing the duplicate check. However, so far we have NOT been successful in implementing the duplicate check in CRM 7.0.

    We have used the TREX search engine and the BAS address service for the BP duplicate check. This works fine in the Win GUI, which displays the duplicates found. However, when using the Web UI, it does not show any duplicates.

    We are not getting any error message. Somehow, when using the Web UI, the system does not recognize any duplicates, and therefore the ‘show duplicates’ button remains grayed out. In the Win GUI we get the message that duplicates exist, and a window opens showing the possible duplicates. As you can imagine, duplicate records can easily be created as a result.

    From your experience can you perhaps tell us whether specific Web UI settings are required to get the duplicate check working in the Web UI?

    Thanks.
    Harriët

    1. Arno Meyer Post author
      Hello Harriet!
      I’m sorry, I can’t answer this question. I would assume that the BADI implementation (ADDRESS_CHANGE) is not correct.
      Do you never get the Duplicate-Popup and is the “Show Duplicates” Button never active?
      Is there any difference between Create Business Partner and Change Business Partner?
      I guess the best option is to create an OSS message and have someone investigate this issue in detail.
      It would be great if you would let me know what the reason was, once the issue has been solved.
      Best regards,
      Arno
      1. Frederic Hadjri

        Hi Arno,

        Is it possible to add new attributes to the BP duplicate check on TREX? I want to add the e-mail field. I followed note 1692370 and added the e-mail attribute SMTP_ADDR from table ADR6 to tables SIC_BAS_FIELDS and TSAD10. Then in transaction IQM_CM_CONFIG, under Maintain Service Profiles, I edited the service SEARCH and changed the formula to consider the SMTP_ADDR field.

        When duplicate check is executed, the e-mail field is not considered. The e-mail field does not seem to get retrieved by the search.

        I would appreciate an answer because it is very urgent.

        Best regards,

        Frederic

        1. Jutta Weber

          Hello Frederic,

          Even though you can add the e-mail attribute to the interface, it is not handled by the duplicate check on TREX. No duplicate check is done based on the e-mail attribute.

          The reason is that this has not been implemented on the TREX side.

          There is a new HANA-based version available, which is the future direction:

          http://scn.sap.com/community/crm/master-data-and-middleware/blog/2013/10/04/duplicate-check-in-crm-on-hana

          Hope this helps.

          Regards,

          Jutta

  4. Raimundo de Oro-Pulido Albo

    Hello,

    Great post. What about Organizational Structure maintenance? Is it good practice to clean up the structure and keep it as simple as possible? Is there any risk in deleting old agents / positions in the structure?

    Thank you!

    1. Jutta Weber

      Hello Raimundo,

      Unfortunately I cannot help you with this question. In general it is always good to keep a structure as clean and simple as possible.
      But I don’t know if there are any specifics that need to be considered when it comes to an organizational structure. Perhaps you cannot delete old agents / positions immediately because of legal regulations that could be country-specific.

      Regards,

      Jutta

      1. Raimundo de Oro-Pulido Albo

        Hi,

        Thank you for your reply. I don’t think that there is any legal requirement here, as all the information related to the documents is still there (at least in Spain).

        I’m just thinking about performance. As the commercial structure is growing and changing every week, deleting old structures (instead of just maintaining the valid-to date) should improve performance.

        But I’m not 100% sure whether there are any technical dependencies behind this (I think not, but…..)

        Thank you again,

        Mundo

        1. Jutta Weber

          Hi Mundo,

          I don’t know the technical details about organizational structure. Unfortunately I cannot help you here.

          Regards,

          Jutta

  5. Bernard Le Tourneur

    Hi

    Just wondering whether you would consider Information Steward as a candidate here, given its great profiling and rule-building capabilities. In addition, the trend of data quality over time is depicted graphically and the details can be drilled into.

    Have you ever used Information Steward?

    Best

    Bernard

