
How to migrate data to Commerce applications

Introduction

Data migration is the activity of moving data between systems. Commerce (B2B/B2C) implementations require data migration from various source systems into the commerce system, and application stability depends on data accuracy. This post explains the data migration process and how to validate data accuracy along the way. The focus is on migrating large-scale data sets from one system to another (asynchronous data transfer).

What is Data?

Data in commerce implementations is categorized as follows:

  1. Product Master Data – Source is ERP or PIM systems
    • Products / Materials
    • Price
    • Promotions
    • Product references
    • Product Enrichment
  2. Enrichment – Source is Legacy or Content Management systems
    • Media files
    • Product description
  3. Customer Master – Source is ERP or Legacy systems
    • Customer
    • Profile information
    • Payments
    • Addresses
  4. Payment – Source is ERP, Legacy, or Payment gateway systems
    • Payment tokens
  5. Order Master – Source is ERP or Legacy systems
    • Orders
    • Returns
    • Order Status
  6. Any other custom objects – Source is ERP or Legacy systems

Commerce Data Migration Process

Data migration is an incremental process that moves through four phases:

  1. Scope (Initiation)
    • Scope definition
    • Strategy and approach
    • Commerce data model
  2. Definition (Data Requirements / Profiling)
    • Source-to-target field mapping
    • Profiling rules based on commerce logic
    • Generate the profiling report from the source system
  3. Build (Data Migration Build)
    • Develop the data migration pipeline
    • Cleanse data anomalies in the source system
  4. Load (Data Extraction, Transformation, Cleansing / Enrichment, and Loads)
    • Extract all source entities and fields required for the target data migration
    • Filter records that are in migration scope
    • Tailor the source data to accurately fit target structures
    • Enrich data that is not available in source systems
    • Load all converted and validated data

Initiation: Data Migration Scope

This process defines the objects and attributes in the migration scope. The below information is captured in a scope definition:

  • Objects / Types
  • Data rules to migrate the objects – (Status, duration, migration criteria)
  • Exclusion rules
  • Profiling Requirements from commerce systems

Sample scope definition sheet
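A minimal sketch of such a sheet, expressed as data; the object names, rules, and profiling requirements below are hypothetical examples, not from an actual project.

```python
# Hypothetical scope definition entries -- objects, rules, and
# profiling requirements are illustrative only.
scope = [
    {
        "object": "Customer",
        "migration_rule": "status == 'ACTIVE' and last order within 24 months",
        "exclusion_rule": "test accounts (email ends with '@example.com')",
        "profiling_requirements": ["unique email", "valid postal code"],
    },
    {
        "object": "Order",
        "migration_rule": "created within 13 months",
        "exclusion_rule": "cancelled orders",
        "profiling_requirements": ["order total > 0", "existing customer reference"],
    },
]

# A quick summary of what is in scope:
for entry in scope:
    print(entry["object"], "->", entry["migration_rule"])
```

Capturing the scope in a structured form like this makes it easy to drive extraction filters and profiling rules from the same definition later.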

 

Initiation: Data Migration Strategy

This activity defines the strategy for data loads, including how the data is migrated – one-time or phased.

When an existing system is replaced with commerce, customer data needs to be migrated from the legacy system to the new one. Based on the criticality and complexity of the migration, applications follow different customer deployment options.

The below criteria help the teams define the customer deployment strategy:

  • Volume
  • Complexity
  • Scope of retiring old system

The below details are captured as part of strategy sessions:

  • Data migration strategy – phased migration or one-time load
  • Strategy for the initial load
  • Strategy for delta loads
  • Support for multiple systems in case of a phased migration approach
  • Strategy for handling in-flight data
  • Strategy for handling data failures
  • Mock data load strategy
  • Test data load strategy
  • Production data load strategy
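The initial-versus-delta split above can be sketched as a timestamp-based record selection; the `modified` field name is an assumption for illustration.

```python
from datetime import datetime

def select_records(records, last_load_time=None):
    """Initial load: take everything in scope. Delta load: only records
    modified since the previous load. 'modified' is an assumed field name."""
    if last_load_time is None:  # initial load
        return list(records)
    return [r for r in records if r["modified"] > last_load_time]

records = [
    {"id": 1, "modified": datetime(2020, 1, 10)},
    {"id": 2, "modified": datetime(2020, 3, 5)},
]

initial = select_records(records)                                   # full load
delta = select_records(records, last_load_time=datetime(2020, 2, 1))  # changes only
print(len(initial), len(delta))  # 2 1
```

The same selection function can back both the initial and the delta load, which keeps the two strategies consistent.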

Data Migration Build: Environment Readiness

Define the environment strategy for testing, staging, and production loads. In many cases, the source data is staged in a staging area to run profiling rules and enrich it before the data is imported into the commerce system.

A typical data migration follows: Source System (Data Extract) -> Staging Database (Cleanse and Enrich) -> Target System (Load and Profile)

The environment structure should support the below requirements:

  • Staging Data
  • Generate Load Scripts
  • Generate Mock Loads
  • Test Data Loads
  • Source System
  • Target System
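The extract -> stage -> load flow described above can be sketched as three small steps; the field names and the cleansing rule are illustrative assumptions.

```python
def extract(source_rows):
    """Extract step: pull only the in-scope fields from the source extract."""
    return [{"id": r["id"], "email": r.get("email", "")} for r in source_rows]

def cleanse_and_enrich(staged_rows):
    """Staging step: normalize values and drop rows that fail cleansing."""
    for row in staged_rows:
        row["email"] = row["email"].strip().lower()
    return [r for r in staged_rows if r["email"]]

def load(target, rows):
    """Load step: write validated rows into the (in-memory stand-in) target."""
    target.extend(rows)
    return len(rows)

source = [{"id": 1, "email": " User@Example.COM "}, {"id": 2, "email": ""}]
target = []
loaded = load(target, cleanse_and_enrich(extract(source)))
print(loaded)  # 1
```

Keeping the cleansing work in the staging step is what shields both the source and the target systems from intermediate data states.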

Data Migration Build: Data Mapping

Define the data mapping between the source and commerce systems based on the defined scope. This information is critical to building extract scripts from the source system. For each entity, collect the below information.

Create the source-to-destination system mapping:

  • Table
  • Attribute
  • Type
  • Length
  • Description
  • Required / Optional
  • Default Value
  • Possible Values
  • Restrictions
  • Hardcoded values
  • Impex header – static header
  • Comments
  • Transformation Rules
  • Criteria to accept the data (e.g., password criteria)
  • Inclusion and exclusion rules
  • Cross-reference tables
    • Table name
    • Column name
    • Possible values
  • Sequencing of data and elements

Example:

  • Table: Cart
  • Attribute: id, name, product, quantity
  • Type: String, Number, Date
  • Length: 10
  • Description:
  • Required / Optional: Required
  • Default Value: 10
  • Possible Values: P1, P2
  • Restrictions:
  • Hardcoded values:
  • Impex header – static header:
  • Comments:
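A mapping entry like the Cart example above can be captured as a rule set and applied during transformation; the structure, field names, and values here are hypothetical.

```python
# Hypothetical mapping entry for the Cart example; fields are illustrative.
mapping = {
    "table": "Cart",
    "attributes": {
        "quantity": {"type": "Number", "required": True, "default": 10},
        "product": {"type": "String", "required": True,
                    "possible_values": ["P1", "P2"]},
    },
}

def apply_mapping(record, mapping):
    """Apply defaults and validate a source record against the mapping rules."""
    out = {}
    for name, rule in mapping["attributes"].items():
        value = record.get(name, rule.get("default"))
        if rule["required"] and value is None:
            raise ValueError(f"missing required attribute: {name}")
        allowed = rule.get("possible_values")
        if allowed and value not in allowed:
            raise ValueError(f"invalid value for {name}: {value}")
        out[name] = value
    return out

print(apply_mapping({"product": "P1"}, mapping))  # {'quantity': 10, 'product': 'P1'}
```

Driving the transformation from the mapping document keeps the extract scripts and the load rules in sync with what was agreed in the mapping workshops.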

Workshop agenda and deliverables:

  • Map Legacy attributes from Source to Commerce
  • Map cross-reference legacy attribute values with Hybris values
  • Identify and document transformation/conversion rules to be applied during data migration

Data Migration Build: Data Profiling

Profiling defines a ruleset to measure data accuracy against the target system's needs.

Generate the profiling reports based on the rules defined in the mapping document. The report explains the data quality against commerce business requirements and technical limitations (these are data conditions). Issues are categorized with different priorities based on their criticality. Based on the profiling report, the data team performs the below activities:

  • Correct / Cleanse the data in the source system
  • If the data cannot be corrected, define explicit handling in the commerce system to address the data cleansing issues.

Perform the below steps and collect the information in profiling sessions:

  • Define profiling rules
    • Check the data accuracy for each data element based on the mapping document
    • Uniqueness and duplicate records
    • Invalid data – invalid email address, SSN, Phone number
    • Invalid address or postal code
    • Validate data based on mapping rules
      • Missing default values
      • Missing reference objects (billing and shipping profiles)
      • Missing payment information
      • Inclusion rule requirements
      • Exclusion rule requirements
      • Attribute rules
      • Type definition
      • Null or blank check
      • Criteria to accept the data
      • Restrictions
      • Special characters
      • Boundary conditions of the data
    • Extra records beyond the expected count
    • Duplicate unique identifiers
    • Correct login format
    • Invalid payment information
    • Validate the data based on rules
      • Not meeting a boundary condition – e.g., only one cart should be associated with a customer
    • Elements with missing required fields
    • Data to support a business rule – e.g., the date should be Jan 01
    • Requirements from the destination system
  • Define a profiling report and identify the issues
  • Define a cleansing plan and fix the issues
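A few of the rules above (invalid email, duplicate identifiers, missing values) can be sketched as a profiling pass that buckets issues by severity; the specific rules and severity labels are illustrative assumptions.

```python
import re

def profile(records):
    """Run sample profiling rules and return issues tagged by severity.
    Rules and severities here are illustrative, not a fixed standard."""
    issues = []
    seen_emails = set()
    for r in records:
        email = r.get("email") or ""
        if not re.match(r"^[^@\s]+@[^@\s]+\.[^@\s]+$", email):
            issues.append(("HIGH", r["id"], "invalid email"))
        elif email in seen_emails:
            issues.append(("HIGH", r["id"], "duplicate email"))
        else:
            seen_emails.add(email)
        if not r.get("postal_code"):
            issues.append(("MEDIUM", r["id"], "missing postal code"))
    return issues

records = [
    {"id": 1, "email": "a@b.com", "postal_code": "75001"},
    {"id": 2, "email": "bad-email", "postal_code": ""},
    {"id": 3, "email": "a@b.com", "postal_code": "10001"},
]
report = profile(records)  # one list entry per detected issue
```

The output of such a pass is exactly what feeds the profiling report and, from there, the cleansing plan.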

Data Cleansing

Data cleansing is a data correction activity in the source system based on the profiling report. Define a clear action plan to correct data in the source system (production). The data must be corrected prior to user acceptance testing so that UAT happens with correct data. Perform cleansing multiple times before the data is loaded into commerce to avoid last-minute data changes in the source system.

Data Extraction and Transformation: Data Loads

  • Define a data load plan for each element
  • Develop load scripts based on the data migration scope and mapping requirements
  • Make sure to have profiling scripts in the commerce system (FlexibleSearch queries). These scripts help the team validate data accuracy after loads into the commerce system.
  • Mock loads: test loads in test environments. These can be incremental based on the development plan; not all objects need to be loaded at once in the development phase. The team should plan multiple mock loads (initial and delta) in test environments with a full set of data, which helps the team test all scenarios.
    • Generate the mock load files from the source system
    • Mock loads can be generated using the load scripts to validate the functionality of the scripts
    • Validate the mock loads and test the functionality
    • Load live data in test environments
  • Production loads:
    • Generate the actual load files from the source system
    • Validate the live loads and test the functionality
    • Production data loads
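A simple post-load validation, useful after both mock and production loads, is a per-object count reconciliation between source and target; the object names and counts below are illustrative.

```python
def reconcile(source_counts, target_counts):
    """Compare record counts per object after a load.
    Any mismatch flags the object for investigation before sign-off."""
    mismatches = {}
    for obj, expected in source_counts.items():
        actual = target_counts.get(obj, 0)
        if actual != expected:
            mismatches[obj] = {"expected": expected, "loaded": actual}
    return mismatches

source_counts = {"Customer": 1000, "Order": 5000}
target_counts = {"Customer": 1000, "Order": 4990}
print(reconcile(source_counts, target_counts))
# {'Order': {'expected': 5000, 'loaded': 4990}}
```

Count checks are only a first gate; the profiling queries in the commerce system then verify field-level accuracy.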

Conclusion

Data migration into commerce systems is a critical part of an implementation. Make sure to have a clear plan for scope definition, profiling, cleansing, test loads, and production loads. A few things to consider:

  • Define the data migration scope
  • Create mapping documents between source and target
  • Perform mock loads in test environments multiple times
  • Generate the profiling report from production
  • Define a clear action plan to correct data issues in the source system (cleansing)
  • Practice data migration (onboarding) multiple times in testing phases
  • Define a clean delta load process
  • Use the staging DB so that the production system is not impacted