Skip to Content
Technical Articles
Author's profile photo Kamal Kumar

How to make Data Consistent in Data Service 4.X using UDT

I am explaining with inconsistent  Mobile Number data but you can use it for inconsistent Zip code also.


As  we already discussed how to use User_Defined Transform in

Please follow the same steps and write the following code in Python Editor


Input:-  Mobile_No to get Field value in Transform.

Output:- STD_Mobile_No to Return the updated Value.

Local Variable:- Mobile , STD_Mobile



import re

Mobile = record.GetField(u’Mobile_No’)

STD_Mobile = re.sub(“[^0-9]”,””,Mobile)

STD_Mobile = STD_Mobile[-10:]


del Mobile

del STD_Mobile

Importing Package Regular Expression

#getting Mobile Number into local variable

It will Replace Non Digit Char with BLANK””

It will take 10 digit from the right side

#update the output fields STANDARD Mobile Number

Initializing the variables

QUERY Transform:-

BASE_USER_DEFINED Transform  will give us consistent and clean data, now we have to convert it into XXXX-XXX-XXX

PATTERN using Query Transform.

Mapping Code:-

substr( Base_UserDefined.STD_Mobile_No,0,4)||’-‘||       #will give XXXX-

substr( Base_UserDefined.STD_Mobile_No,5,3)||’-‘||       #will give XXX-  

substr( Base_UserDefined.STD_Mobile_No,8,3)            #will give XXX 


Complete Process:-



We can use the combination of User_defined Transform and Custom function for different country’s ZIP CODE .

If possible then please Give the Rating and Feedback so that i can continue……

Thanks & Regards,


Utopia India Pvt. Ltd.


Assigned Tags

      You must be Logged on to comment or reply to a post.
      Author's profile photo Former Member
      Former Member

      Can you explain what the advantage to doing this in a Python User Defined transform is? Why wouldn't you do the same thing in a DS script function? Is speed significantly different? Memory usage?

      Author's profile photo kamal kumar
      kamal kumar
      Blog Post Author

      Hi Ernie,

      User defined transform provide us more functionality of Python which we can't use in script function. In my example i imported RE package and used re.sub(search replace using Regular expression) but in script function we can't import this package and it's functions also.



      Author's profile photo Former Member
      Former Member

      That doesn't explain what you can do different. I can import database functions to do trims inside DS as well, but that function exists inside DS, so why would I want to? Just because you CAN do something does not make it the best solution. What is the advantage that Python is giving here? Or if this is just meant as a show case, what other functions could we perform that are NOT already available in DS?

      Author's profile photo kamal kumar
      kamal kumar
      Blog Post Author


      Can you please provide me the function name or syntax ,which you will prefer in this scenario.

      I will be very thankful to get better solution... 🙂


      Author's profile photo Former Member
      Former Member

      I don't have a preference. I am trying to figure out the justification to tell someone to write something in Python. If there are good reasons, lets work them out and support them with good arguments. I am NOT against Python. I am against learning a new tool or putting code into another location when the existing tool or location is functionally the same. That is the better solution for me.

      Since you asked for an alternative, the below function in DS script accomplishes the same purpose. It runs in 5 seconds. I don't think this set is a realistic test for either tool, but it's what I had handy.

      $L_Len = length($Mobile_In);

      $L_Loop = 1;

      $L_Mobile_Out = '';

      while ($L_Loop <= $L_Len)


            $L_One_Char = substr($Mobile_in, $L_Loop, 1);

            if ($L_One_Char >= '0' and $L_One_Char <= '9')

               $L_Mobile_Out = $L_Mobile_Out || $L_One_Char;

            $L_Loop = $L_Loop + 1;


      $L_Len = length($L_Mobile_Out);

      $L_Mobile_Out = substr($L_Mobile_Out, $L_Len-9, 10);

      Return substr($L_Mobile_Out,1,4) || '-' || substr($L_Mobile_Out,5,3) || '-' || substr($L_Mobile_Out,8,3);

      Author's profile photo kamal kumar
      kamal kumar
      Blog Post Author


      I am completely agree with your view.

      I was also thinking the same before using UDT.

      I am also looking for more scenario where we can use UDT instead of script and as i'll get, i'll share .

      But lets see current example....

      For removing all character except Digit we have to write 10-15 line code with substring and Script..

      and think 1). if we want to keep Alphachar also so we have to add additional condition

      if (($L_One_Char >= '0' and $L_One_Char <= '9')or ($L_One_Char >= 'a' and $L_One_Char <= 'z')or ($L_One_Char >= 'A' and $L_One_Char <= 'Z'))

      And one More 2).if we have four to five fields in 1st field we want only Alpha ,in 2nd we want to keep digit and in third we want to keep Alpha numeric then we have to use complex loops and if statements or you have to build different functions. 

      it will be little complex for End user.

      But in UDT you can use 1-2 line for this

      eg:-For digit only re.sub("[^0-9]","",Mobile)

      For Alpha Numeric re.sub("[^0-9a-zA-Z]","",EmpIDl)

      And as I mentioned in the topic itself An Easy way ,I didn't mention the Best Way.... 🙂

      But Ernie ,truly i am also looking for more scenario where we can use UDT and sure i'll contact Wernie also for this.


      Author's profile photo Former Member
      Former Member

      For current example, you used 7 lines of code. I used 12. I would need no additional lines of code to include alpha, just additional conditions in the IF. It is no more complicated for someone else to use, they still call my function once, just as they do yours. For mine, I did not have to learn or know Python, just DS script which I already had to know.

      I really would be interested to see cases where having Python is going to save me large amounts of processing time or coding. But since I wrote that script in less than an hour I don't really see that in this case. If you run across some, please post a link here and I will happily investigate.

      - E