Technical Articles
SAP DS – Split a Field at Separator into Multiple Rows
Salut mes amis!
Prelude :
I recently started my adventure in SAP DS ( 4.2) and found some great things in the tool . The most powerful being able to add Python code to do any kind of transformation . I read from many SAP DS experts that one should avoid using Python UDT transformation but in my use case I wanted to keep it very generic and Python came as a big rescue to make the code more flexible . One of the use case I will be discussing here , not the exact use case but a portion of it where I used Python Code . Hope this helps someone who is trying to achieve the same .
There are other options as you will see in the link below , but again I found python more flexible and without any dependency on xsd etc as in below post https://wiki.scn.sap.com/wiki/display/EIM/How+to+split+a+comma+separated+String+into+multiple+rows+via+XML+strings
Use Case :
Input looked like below :
Row_Id PartNo Qty Unit RefDes
1 Part1 3 EA Ref11,Ref12,Ref13
2 Part2 1 EA . <NULL>
3 Part3 5 EA Ref31,Ref32,Ref33,Ref34,Ref35
Expected Output :
Row_Id PartNo Qty Unit RefDes
1 Part1 3 EA Ref11
1 Part1 3 EA Ref12
1 Part1 3 EA Ref13
2 Part2 1 EA <NULL>
3 Part3 5 EA Ref31
3 Part3 5 EA Ref32
3 Part3 5 EA Ref33
3 Part3 5 EA Ref34
3 Part3 5 EA Ref35
Implementation :
In order to achieve above, implemented a UDT transform in SAP DS . In the Query transform feeding the UDT kept only 2 fields – Row_Id and RefDes . In the UDT Output , kept again 2 outputs Row_Id and RefDes , which will now have split Ref Des values . Finally joined the input data with output from this UDT with a left outer join on Row_Id
Heres a snapshot to get a better idea
UDT Definittion :
- Select Mode – Per Collection – As we ll be adding new rows to the input collection
- Launch Python Editor
- Added new field in I/O fields – For RefDes as the input Ref Des could be longer than 255 e.g. 8000 in my case and output will be 255(Restriction in UDT – cannot be more than 255)
Here are a few snapshots again to get a better idea:
Algorithm :
- Get the Whole collection or total number of records in the Input
- Loop through the number of records to do below
- Get Field value for Ref Des
- If the Ref Des value has separator in my case ‘,’ , split and add each split value as a new entry to the collection with Row Id and split Ref Des
- If the Ref Des value has no separator in my case ‘,’ , no need to split , so add as is as a new entry to the collection with Row Id and Ref Des value
- Free memory using DatManager.DeleteDataRecord and del
Code :
print(‘Start’)
newRecord = DataManager.NewDataRecord()
#gets the number of records
numRecords = Collection.Size()
num_rows = numRecords
#iterate over the collection
for recordNum in range(1, numRecords + 1):
#get a record
Collection.GetRecord(newRecord, recordNum)
#set a field on the record
rowID=newRecord.GetField(u’Row_Id’)
refDes=newRecord.GetField(u’RefDesignator’)
refDesSplitArr = refDes.split(“,”)
if len(refDesSplitArr) > 1:
for i in range(len(refDesSplitArr)):
num_rows=num_rows+i
newRecord2 = DataManager.NewDataRecord(num_rows)
newRecord2.SetField(u’Row_Id’, unicode(rowID))
newRecord2.SetField(u’Ref_Des’, unicode(refDesSplitArr[i]))
Collection.AddRecord(newRecord2)
else:
num_rows=num_rows+1
newRecord2 = DataManager.NewDataRecord(num_rows)
newRecord2.SetField(u’Row_Id’, unicode(rowID))
newRecord2.SetField(u’Ref_Des’, unicode(refDes))
Collection.AddRecord(newRecord2)
DataManager.DeleteDataRecord(newRecord)
del newRecord
print(‘End’)