Skip to Content

How to avoid wrong splitting of .CSV file into internal table using MATCH OFFSET, FIND

While working on the .CSV (comma separated file) on application server.

As the file is a comma separated file, we split the file at every comma in to our internal table.

The file can contain numbers as well as some text/description field e.g.

000000,00000,00000,”abcdefg”,”hijklm”,”kljudyhs”,….etc.

But sometimes text or description contain ‘,’ in it as part of description and it could lead wrong splitting of file E.g.

000000,00000,00000,”abc,defg”,”hij,klm”,”kljudyhs”,….etc.

In such cases we can use MATCH OFFSET, FIND to replace the comma from description field with some other character so that it will not lead to wrong splitting of file.

  1. E.g.

Consider our .CSV file on application server contains below data and we have to split the file into internal table at every comma.

000000,00000,00000,”abc,defg”,”hij,klm”,”kljudyhs”,….etc.

TYPES : BEGIN OF ly_file,

          line TYPE string,
        
END OF ly_file.

     DATA : lv_file TYPE text200,  “Will contain file name
           lw_file
TYPE ly_file,  “ work area to get records from file

     lv_line TYPE string,   “string which will use to get the record
     lt_result_tab
TYPE match_result_tab, ”contain occurrences of “
     lv_off1
TYPE i,        “ will store offset
     lv_index
TYPE sy-index,
     lv_len
TYPE i.
 
FIELD-SYMBOLS <match> LIKE LINE OF lt_result_tab.
 
FIELD-SYMBOLS <match1> LIKE LINE OF lt_result_tab.

*Get the file location and name in lv_file.

Lv_file = location of file to read

*Read the file from Application serve

OPEN DATASET lv_file FOR INPUT IN TEXT MODE ENCODING DEFAULT.

IF sy-subrc = 0.      DO.
       
READ DATASET lv_file INTO lw_file-line.
       
IF sy-subrc = 0.

* Take the file record in a string lv_line.

                          lv_line = lw_file-line.

* Find the text/description fields i.e search for double quotes which contains text/description field- ”abc,defg”.


         
FIND ALL OCCURRENCES OF ‘’ IN lv_line RESULTS lt_result_tab.
         
IF lt_result_tab IS NOT INITIAL.

* Result tab gives offset or location of every “ available in the

*string .In our case 000000,00000,00000,”abc,defg”,”hij,klm”,”kljudyhs”,….

*The first “ is at index 17 and the second is at *index 26, which forms *the first text field. The third “ starts at *index 27 and fourth at *index 35, which forms the second text/description field.

*Now we have to find the first and second offset of “ to get the first *text field. Then third and fourth for second text filed and so on

       Clear lv_index.

           LOOP AT lt_result_tab ASSIGNING <match>.
                CLEAR: lv_off1, lv_len.

     * This check will avoid reading the same offset again.             

               IF lv_index NE sy-tabix.

    * Get the first offset of “
                lv_off1 = <match>-offset.

    * Increase index to read the next offset
                lv_index = sy-tabix +
1.    * Read next offset using lv_index
           
READ TABLE lt_result_tab ASSIGNING <match1> INDEX lv_index.                IF sy-subrc = 0.* Get the second offset and then calculate the length of the text using    * both the offsets. The second offset will be always greater value so *by subtracting the first offset from second offset, we will get the *length of text.

                 lv_len = <match1>-offset – lv_off1.
               
ENDIF.* Replace occurrences comma ‘,’ with dot ‘.’ in the string from the *first offset to the length of string
         
REPLACE ALL OCCURRENCES OF ‘,’
         
IN SECTION OFFSET lv_off1 LENGTH lv_len OF lv_line WITH ‘.’.
         
ENDIF.
       ENDLOOP.
      ENDIF.

* Now split the lv_line into your internal table at every comma

SPLIT lv_line AT ‘,’

INTO your work area

______________________________Thank You___________________________________

To report this post you need to login first.

Be the first to leave a comment

You must be Logged on to comment or reply to a post.

Leave a Reply