In many ABAP developments, we use CSV files and sometimes there is confusion regarding CSV itself. Is it just a text file with values separated by commas ?

Let’s look at semantics of the various components involved so we have a vocabulary to work with.

Separator: Demarcates between two fields – so it will be a comma ‘,’ for CSVs.

Delimiter: It signifies limits of things, where it begins and ends. e.g. “Test String” has two delimiters, both double quote characters. Many CSVs can have double quotes as delimiters when comma values are to be placed as text.

Terminator : Indicates end of sequence. For CSV, we can think of newline as the terminator.

So if we have confusion about whether commas or double quotes are allowed inside data, looking at the CSV specification:

The de facto standard for CSV is here in case you want to read the full standard.

http://tools.ietf.org/html/rfc4180

Definition of the CSV Format ( points 1 – 7 are from CSV standard )

  1.  Each record is located on a separate line, delimited by a line break.

  2. The last record in the file may or may not have an ending line break.

  3.  There maybe an optional header line appearing as the first line of the file with the same format as normal record lines.

  4.  Within the header and each record, there may be one or more fields, separated by commas.  Each line should contain the same

      number of fields throughout the file.  Spaces are considered part of a field and should not be ignored.  The last field in the

      record must not be followed by a comma.

 

  5.  Each field may or may not be enclosed in double quotes (however some programs, such as Microsoft Excel, do not use double quotes

      at all).  If fields are not enclosed with double quotes, then double quotes may not appear inside the fields.

 

  6.  Fields containing line breaks (CRLF), double quotes, and commas should be enclosed in double-quotes.

  7. If double-quotes are used to enclose fields, then a double-quote appearing inside a field must be escaped by preceding it with

       another double quote.

In my experience, point 7 is where we get tripped the most. CSV stands as comma separated values leading to the impression that commas are the separator and given that excel doesn’t put commas, it can start to get confusing.


So looking at some examples

Basic Example:

10, Vikas , Sydney

Data with separator / delimiter inside them.

“11”, “Vikas”, “Sydney, AU”      <– Data containing comma

“12”, “Vikas”, “Sydney, “NSW” AU”   <– Data containing comma and quotes in data

Handling in ABAP:

I’m focusing on reading the files as that’s where we face issues. The file can be uploaded from user’s desktop or read from the application server.

1)  Write your own code:

This can be easiest to start with but can start to get complicated with time.

Get data as a string, split at comma.

   split lv_data at ‘,’ into lw_struct-test1 lw_struct-test2 lw_struct-test3.

Drawbacks:

a) This won’t work if we have data with separator, terminator or delimiter ( so no commas, double-quotes or newline within data ).

b)  The code will need to be updated if the file format changes – say we need to add another field test4. The code then changes to :

   split lv_data at ‘,’ into lw_struct-test1 lw_struct-test2 lw_struct-test3 lw_struct-test4.

2) Read the file using KCD_CSV_FILE_TO_INTERN_CONVERT

CALL FUNCTION ‘KCD_CSV_FILE_TO_INTERN_CONVERT’

  EXPORTING

    i_filename      = ‘C:\Temp\Upload.csv’

    i_separator     = ‘,’

  TABLES

    e_intern        = gt_intern

  EXCEPTIONS

    upload_csv      = 1

    upload_filetype = 2.

Drawbacks

a) The file can be read only from presentation server/ desktop.

b) If a CSV file exists with double quotes, the last field is left with double quotes.

c) In case the file is to be read from application server, we need to read the code inside this FM and write some custom logic.

3) Use RTTI and dynamic programming along with FM RSDS_CONVERT_CSV .


It works but has lots of code . You can have a look at the code in this GIST.

CSV_Upload_long_process

In summary the steps are :

– Get structre of destination table using RTTI

– Create field catalog

– Create a dynamic table for field catalog values

– Create dynamic table lines

– Process Raw CSV data

– Store CSV files into dynamic table

Drawback:

a) Relatively long code leading to results especially if you have to program it from scratch.

Advantage:

a) Code is free from the target table format. If a new field is to be added, just update the structure for table type z_data_tty

4) Use class CL_RSDA_CSV_CONVERTER .

So the call becomes very straight forward – Instantiate the class with the separator and delimiter values. For a normal CSV, leave them as default .

* Instantiate the CSV object

  call method cl_rsda_csv_converter=>create

*  EXPORTING

*    i_delimiter = C_DEFAULT_DELIMITER

*    i_separator = C_DEFAULT_SEPARATOR

    receiving

      r_r_conv    = lo_csv

* Process records

  loop at lt_upload_data into lv_data.

    CALL METHOD lo_csv->csv_to_structure

      EXPORTING

        i_data   = lv_data

      IMPORTING

        e_s_data = lw_struct.

That’s It !

Advantages:

a) The code is very small – less chances of us making an error compared to the one in 3) above.

b) The code is decoupled with file structure – we get to keep the benefit from the above point .

c) It can be used for both application server / presentation server files – of course file reading will need to be done before the call.

d) The developer has documented the examples exhaustively in method CSV_TO_STRUCTURE . Big thank to him/her !

e) It’s part of package RSDA which is present in ABAP trial environments as well such as NSP .

If you feel lazy to type the full program, here is the whole source code:

CSV_Reading_option_4

To report this post you need to login first.

11 Comments

You must be Logged on to comment or reply to a post.

  1. Jan Hempel

    Thanks for sharing.

    For converting CSV to internal table I often use TEXT_CONVERT_CSV_TO_SAP which is also pretty straight forward (raw data in, structured data out).

    Regards

    Jan

    (0) 
  2. Eng Swee Yeoh

    Hi Vikas

    Nice blog detailing the comparison between different approaches to reading CSV files in ABAP.

    I just completed a requirement using option 3, but will definitely try out option 4 next time I have a similar requirement.

    However, my approach to option 3 was not as lengthy. There was no need to create field catalogs and dynamic table. I just created a data reference LIKE LINE of the output table. Below are snippets of my code.

    /wp-content/uploads/2014/10/csv_566768.png

    /wp-content/uploads/2014/10/csv2_566749.png

    Rgds

    Eng Swee

    (0) 
      1. Eng Swee Yeoh

        Hi Bruno

        If I remember correctly, it did handle date and amount fields correctly. Unfortunately, I’ve moved to a different project and no longer have access to that system to verify.

        I do think it is handled automatically by the ABAP engine in line 54 of my screenshot above.

        ABAP automatically performs type conversion when the assigning values from one field to another. So, line_content (which is CLIKE) will go through conversion to whatever the current type for <ddic_column> is, i.e. date type, amount type.


        Do give it a try and let me know how it goes.


        Rgds

        Eng Swee

        (0) 
  3. Matthew Billingham

    1) Parts of this blog as copied from the linked standard for CSV. You must make it clear which parts are being quoted. (Or I’ll have to reject the blog, and I actually don’t want to do that).

    2) While CSV has a standard definition, this is often not followed. For example, from Excel, as CSV file may well be semi-colon (;) separated. You should edit your blog to reflect this. Also, whether it affects any of the solutions you’ve given.

    3) End of line markers may be Line Feed, but may also be Carriage Return, Line Feed. It depends on the source system and the program used to generate the file.

    (1) 
  4. Alexey Sadovoy

    Disadvange  (error) of method 4 – if needs structure contain field ‘string’ typed and CSV containts double quoted text – exception ocurs. 🙁

    (0) 
  5. Juwin Pallipat Thomas

    Parsing CSV – this is not a problem just for ABAP language. Hence, rather than just thinking of an ABAP solution, think out of the box. Have a look at this post: http://www.kimgentes.com/worshiptech-web-tools-page/2008/10/14/regex-pattern-for-parsing-csv-files-with-embedded-commas-dou.html

    This is the easiest method I have found till now. Just use SPLIT command, with the REGEX from this blog. I have used it several successfully.

    Thanks,

    Juwin

    (0) 

Leave a Reply