I in this blog will be discussing the importance of Delta import in MDM . This feature is handled through default import actions of the IMPORT MANAGER and on effective use can enhance, filter and control over the delta Load.
DELTA LOAD: This is a more acquainted term for people who belong to ETL world . During data load the data is generally loaded in batches and once the first batch is loaded, Then any new incoming data or changed data in the existing will be termed as delta load and will be loaded in consecutive batches.
Here I am considering a Scenario through which I will be displaying this challenge handled in MDM. Just have a look at the source created below. There is no such field which can be claimed to be primary key of the table. In this case the Import action can be leveraged to strengthen the DELTA Load.
The fig. below presents the scenario before the first import . It’s clearly visible that the “active” grid shows 11 records to be importing which is the total no of records buffered or read.
Note: But records will be actually imported as per the uniqueness of the matching field.The staus of which can be viewed in the “Import Status” tab
Note : During the first import there’s no use of keeping the included fields separate coz they doesn’t have any effect on the import but they do have an impact on the second import. On importing such, the records will get imported as per the uniqueness of the leading field in the “Matching Fields “list.
Upon combining the fields , at least the maximum of the combined uniqueness would be imported irrespective of the order in which they are included and combined.
Now Let’s have a look at the source which is to be used for the second import . This is actually some new records added to the initial table as well as some changed existing records.
During Second Import..
Match Type : Match Type may be of four different kinds namely 1)none 2)exact 3)partial 4) Conflict . It shows the nature of match of the individual records on the source side with respect to the destination side with the field/fields in the “matching fields“ as reference .
Note : The Match type feature is not case sensitive . It compares only the text not the case of the text. None : If the matching field/fields of the source record does not match with the destination record .
Exact : If the matching field of the source record completely matches with the destination record.
Partial : It comes to effect only if more than one field is taken for matching field as individual (not combined) and it signifies that out of the (n) no. of matching fields taken into account , at least one field is empty(null) in the source side and other field values on the source side matches with the corresponding destination side .
Conflict : It’s same as partial but here at least one matching field value in the source side does not match with the destination side and other field values on the source side matches with the corresponding destination side .
The previous snapshot shows different responses under grid “Match Class” & “Match Type “ upon selecting “vehicle_name” as matching field on the second import. Now the significance of these grids will be explained.
Match Class : Match class represents the status of the match of individual records on the source side with respect to the destination side with the field/fields in the “matching fields“ as reference .
In this case compare the source withrespect to the data imported in to the data manager it’s evident that when u compare the matching field (vehicle_name ) of the incoming records with the records of the data manger there are 5 records whose vehicle_name already occurs in the data manager and occurs once . They falls under the category “single” and match type “Exact” . The no .of records which match from the source side are displayed in the “active” grid .
There are 3 records which as per the matching field match with more than I record of the destination side. They fall under the category “Multiple” .
There are again 2 kinds of options provided in the “Matching Fields” tab in the Import Manager . The user may include more than one field in the “Matching Fields” tab and can have them as “individual” or “as combination” both of which in fact have different impact and response.
This options separately decides the fate of delta import.
In the fig below the “Vehicle_name” and “Onroad_price” are added individually.
Thus now the system will compare this two fields separately. It can be seen in the “Record matching” that the record which is conflicting have the value (Vehname) “pulsar 125” which is new when compared to the destination values but again the value(showroomprice)”53000” was already present in the data manger due to the first import.
The comparison of the values of the source against the destination side can also be viewed in the import manager in the “matching destination records” grid . As explained earlier since single means the record has one match against it ,thus that record from the destination side is displayed in the grid below and the value marked in red is the conflicting value and the value marked in green is the matching value as per the the fields provided in the “Matching fields list” .
Conflicting files must be imported because they might be just the update of the exisiting record . They can be parked for later imports as suggested by the Data stewards and can be updated by choosing the option “update mapped fields only”.
Now let’s consider the second case where the fields are combined and the change will be evident. Now the response can very well be compared. The files which were initially under “Single-Conflict” and “Multiple-Exact” have come under “None-None” because now the system is treating the two fields as a single record so even partial match will not do .Either Match Type status will be “None” or “Exact”.
The MDM consultant and the Data Steward have to design the Matching fields skillfully and strategically to take a decision on the DELTA during the consecutive loads.