GNU Tools for checking input files: using awk to c...
But maybe you don't want to check the complete lines, only whether certain field combinations (e.g. key fields!) appear more than once.
Let's say you have a file like this:
ABC;XYZ;MATNR;DBBD;LGORT;SOMETHIG_ELSE
12121;13213;MAT12;dfhsf;1000;sdfsdjhf
1sad21;13213;MAT12;dfhsf;1000;sdfsdjsadhf
12121;13213;MAT12;;1200;sdfsdjhf
121;13213;MAT45;;1200;sdfsdjhf
-> each line is clearly unique, however, if MATNR and LGORT are key fields, then we have a problem.
We can find out with the help of awk (I'm using gawk):
cat [filename] | gawk -F';' '{ print $3, $5 }'
-> it reads the file, interprets ";" as the field separator (-F';'), then prints the 3rd and 5th fields ($3, $5). The comma in the print statement inserts the "output field separator" (OFS) between them, which by default is a single space.
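As a small aside (not from the original post): if you would rather keep the semicolon between the key fields in the output, you can set OFS yourself in a BEGIN block. A minimal sketch, using plain awk (gawk behaves the same here):

```shell
# The comma in print emits OFS between the fields;
# setting OFS=";" keeps the original delimiter in the output.
printf 'a;b;c;d;e\n' | awk -F';' 'BEGIN { OFS = ";" } { print $3, $5 }'
# -> c;e
```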
So the output in the example is (plus a first line "MATNR LGORT" coming from the header row, which you can skip with NR > 1 if it gets in the way):
MAT12 1000
MAT12 1000
MAT12 1200
MAT45 1200
-> as we now have only the key values we want to compare, we can easily pipe the output through sort and then into the uniq -d we already know, to see whether there are any duplicates (uniq only spots adjacent duplicates, so the sort is needed).
(And as this might be a lot of lines, we can just count them with wc -l.)
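Putting the whole pipeline together on the sample file above (saved here as data.csv, a filename of my choosing; the post uses gawk, but any POSIX awk works the same):

```shell
# Recreate the sample file from the article
cat > data.csv <<'EOF'
ABC;XYZ;MATNR;DBBD;LGORT;SOMETHIG_ELSE
12121;13213;MAT12;dfhsf;1000;sdfsdjhf
1sad21;13213;MAT12;dfhsf;1000;sdfsdjsadhf
12121;13213;MAT12;;1200;sdfsdjhf
121;13213;MAT45;;1200;sdfsdjhf
EOF

# NR > 1 skips the header line; sort is needed because
# uniq -d only reports *adjacent* duplicate lines.
awk -F';' 'NR > 1 { print $3, $5 }' data.csv | sort | uniq -d
# -> MAT12 1000

# If there could be many duplicates, just count them:
awk -F';' 'NR > 1 { print $3, $5 }' data.csv | sort | uniq -d | wc -l
```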