I created a proof of concept (POC) to demonstrate the scenario described below: matching on a concatenated address using the Match transform. The details follow.

The address columns were concatenated (Address1 + Address2 + City + Region + PostCode) and passed to the Match transform. The fuzzy-logic settings (Remove punctuation, Convert to uppercase, Check for transposed letters) were configured for the comparison.
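To make the three settings concrete, here is a minimal Python sketch of what such preprocessing could look like. It is only an illustration and not the Match transform's internal logic: the function names `normalize` and `is_adjacent_transposition` are hypothetical, and deleting punctuation outright (rather than replacing it with a space) is an assumption.

```python
import string

# Translation table that deletes every ASCII punctuation character.
# Assumption: punctuation is removed outright, not replaced by a space.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def normalize(text: str) -> str:
    """Uppercase the text, delete punctuation, and collapse repeated whitespace."""
    stripped = text.upper().translate(_PUNCT_TABLE)
    return " ".join(stripped.split())


def is_adjacent_transposition(a: str, b: str) -> bool:
    """Return True if b equals a with exactly one pair of adjacent characters swapped."""
    if len(a) != len(b) or a == b:
        return False
    diffs = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    return (
        len(diffs) == 2
        and diffs[1] == diffs[0] + 1
        and a[diffs[0]] == b[diffs[1]]
        and a[diffs[1]] == b[diffs[0]]
    )


if __name__ == "__main__":
    print(normalize("250 2-ga Taepyong-ro Jung-gu  Seoul  100-742"))
    print(is_adjacent_transposition("SEOUL", "SEUOL"))
```

Normalizing both sides before comparison means that differences in case or punctuation alone do not lower the similarity between two addresses.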

Sample Data:

| FirmName | Address1 | Address2 | City | Region | PostCode | Country |
|---|---|---|---|---|---|---|
| SAMSUNG | 250 2-GA TAEPYONG-RO JUNG-GU | | SEOUL | | 100-742 | KR |
| SAMSUNG Corp | 250 2-GA TAEPYONG-RO JUNG-GU | | SEOUL | | 100-742 | KR |
| SAMSUNG India | 250 2-GA TAEPYONG-RO JUNG-GU | | SEOUL | | 1000-742 | KR |
| SAMSUNG India Pvt Ltd | 250 2-GA TAEPYONG-RO dummy | | s | | 100-742 | US |
| SAMSUNG Corporation | 250 2-GA TAEPYONG-RO test | | SOUL | | 100-742 | UK |

Concatenated Address:

(DQ_Firm_Modified.Address1 || ' ' || DQ_Firm_Modified.Address2 || ' ' || DQ_Firm_Modified.City || ' ' || DQ_Firm_Modified.Region || ' ' || DQ_Firm_Modified.PostCode)

Concatenated Address

250 2-GA TAEPYONG-RO JUNG-GU  SEOUL  1000-742

250 2-GA TAEPYONG-RO JUNG-GU  SEOUL  100-742

250 2-GA TAEPYONG-RO dummy  s  100-742

250 2-GA TAEPYONG-RO test  SOUL  100-742

250 2-GA TAEPYONG-RO JUNG-GU  SEOUL  100-742
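The double spaces in the values above are what you get when Address2 and Region are empty and every field is still joined with a single space. The following Python sketch reproduces that behaviour outside Data Services. The helper name `concat_address` and the dictionary layout are hypothetical, and it assumes that missing fields are treated as empty strings.

```python
ADDRESS_FIELDS = ("Address1", "Address2", "City", "Region", "PostCode")


def concat_address(record: dict) -> str:
    """Join the address fields with single spaces, treating missing values as empty."""
    return " ".join(record.get(field) or "" for field in ADDRESS_FIELDS)


if __name__ == "__main__":
    sample = {
        "FirmName": "SAMSUNG",
        "Address1": "250 2-GA TAEPYONG-RO JUNG-GU",
        "Address2": "",   # empty in the sample data (assumption)
        "City": "SEOUL",
        "Region": None,   # missing values are handled like empty strings
        "PostCode": "100-742",
        "Country": "KR",
    }
    print(repr(concat_address(sample)))
```

If the extra spaces are unwanted, the empty fields could be skipped before joining. The sketch keeps them so that its output lines up with the values listed above.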

Match Criteria:

[Screenshot: Match_Transform.png]

Match Transform Results:

| ADDRESS_PRIMARY_NAME | ADDR_ADDRESS_GROUP_NUMBER | ADDR_ADDRESS_MATCH_STATUS | ADDR_ADDRESS_MATCH_LEVEL | ADDR_ADDRESS_MATCH_CRITERION | ADDR_ADDRESS_MATCH_SCORE | ADDR_ADDRESS_MATCH_TYPE | ADDR_ADDRESS_ADDRGROUPSTATS_GROUP_COUNT | ADDR_ADDRESS_ADDRGROUPSTATS_GROUP_ORDER | ADDR_ADDRESS_ADDRGROUPSTATS_GROUP_RANK |
|---|---|---|---|---|---|---|---|---|---|
| 250 2-GA TAEPYONG-RO JUNG-GU  SEOUL  1000-742 | 1 | D | Address | | | D | 5 | 1 | M |
| 250 2-GA TAEPYONG-RO JUNG-GU  SEOUL  100-742 | 1 | P | Address | | 100 | W | 5 | 2 | S |
| 250 2-GA TAEPYONG-RO dummy  s  100-742 | 1 | P | Address | | 100 | W | 5 | 3 | S |
| 250 2-GA TAEPYONG-RO test  SOUL  100-742 | 1 | P | Address | | 100 | W | 5 | 4 | S |
| 250 2-GA TAEPYONG-RO JUNG-GU  SEOUL  100-742 | 1 | P | Address | | 100 | W | 5 | 5 | S |

The match scores and the match group number show that the fuzzy logic works well in this scenario: all five records are assigned to group 1, and every matched record has a score of 100. Note that Firm Name is not used in any comparison criterion.

However, when the data is heavily corrupted, as in the sample below, the transform is unable to find the matches. Be cautious about this: only cleansed data should be given as input to this transform.

| FirmName | Address1 | Address2 | City | Region | PostCode | Country |
|---|---|---|---|---|---|---|
| SAMSUNG Corp | 2510 2-GA TAEdddPYONG-RO JUNGrr-GU | | SEooOUL | | 1010-7420 | KRA |
| SAMSUNG India | 2550 2-11GA TAEaaaPYONG-RO JUsNGssd-G2U | | SEOUuuLa | | 10100-742 | KR |
| SAMSUNG India Pvt Ltd | 2501 2-GA TAEPYONG-RO dummy | | s | | 10001-742 | US |
| SAMSUNG Corporation | 2505 2-GA TAEPYONG-RO test | | SOUL | | 100-742 | UK |
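To get a rough feel for how much noise a string comparison tolerates, the sketch below scores a clean address against a lightly and a heavily corrupted variant using `difflib.SequenceMatcher` from the Python standard library. This is not the algorithm the Match transform uses, and its scores are not comparable to the match scores reported above. It only illustrates the general point that similarity drops as noise accumulates. The heavily corrupted string is built by applying the same concatenation rule to the first corrupted record, which is an assumption.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two strings, ignoring case."""
    return SequenceMatcher(None, a.upper(), b.upper()).ratio()


# Clean concatenated address from the first sample.
REFERENCE = "250 2-GA TAEPYONG-RO JUNG-GU  SEOUL  100-742"
# One extra digit in the postcode (first sample).
LIGHT_NOISE = "250 2-GA TAEPYONG-RO JUNG-GU  SEOUL  1000-742"
# First corrupted record, concatenated with the same rule (assumption).
HEAVY_NOISE = "2510 2-GA TAEdddPYONG-RO JUNGrr-GU  SEooOUL  1010-7420"

if __name__ == "__main__":
    for label, candidate in (("light", LIGHT_NOISE), ("heavy", HEAVY_NOISE)):
        print(label, round(similarity(REFERENCE, candidate), 3))
```

Whether a given pair still counts as a match depends on the thresholds configured in the match criteria, so a score from this sketch says nothing about where the transform itself draws the line.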
