Jigang Zhang 张吉刚

About Chinese characters when generating a file and downloading it from AL11

Generating a file on the server side and having the user download it from AL11 is less convenient than generating the file locally or sending it as an email attachment. Sometimes this method is unavoidable, though: the attachment may exceed the maximum attachment size, or the program may time out when it runs in the front end.

Here are a few tips on generating and downloading files from AL11 when they contain Chinese characters:

Generate file at AL11 using ‘open dataset’

The statement below does not name a specific code page, so it is not specific to Chinese characters and supports other characters as well. On a Unicode system, ENCODING DEFAULT writes the file as UTF-8. (Please check this and this for more details about AL11.)

  open dataset gs_outfile for output in text mode encoding default.
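
To see why a UTF-8 file is not tied to one language, the behavior can be reproduced outside SAP. The Python sketch below is only an illustration: the sample strings and the helper name `can_encode` are made up for this example, and it assumes the server writes UTF-8, as the ABAP documentation describes for ENCODING DEFAULT on Unicode systems.

```python
def can_encode(text: str, codec: str) -> bool:
    """Return True if every character of text can be represented in codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

# Hypothetical sample values: Chinese, Japanese, and English text
samples = ["物料描述", "品目テキスト", "Material description"]

for text in samples:
    # UTF-8 covers all three; a single-byte Western code page covers only the last one
    print(text, "utf-8:", can_encode(text, "utf-8"), "latin-1:", can_encode(text, "latin-1"))
```

A legacy single-byte code page rejects the Chinese and Japanese samples, while UTF-8 accepts all of them in the same file.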

Download file at AL11 using CG3Y

CG3Y offers two transfer formats: ASCII (ASC) and binary (BIN). Many articles explain the difference between binary and ASCII, like this and this one. Here are a few points from my own experience:

  • The output file extension can be TXT or DAT. Please do not use extensions such as XLS or XLSX: the download is a text file, and with such an extension it cannot be opened properly in Excel or Notepad.
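  • In my test, the BIN download was much larger than the ASCII download: 800KB vs 280KB. After converting the DAT/TXT files to Excel, it was 307KB vs 149KB. The gap came from truncation in the ASCII file: both files had the same number of rows, but the ASCII file kept only 19 of the 49 columns.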
  • Both BIN and ASCII downloads display Chinese characters correctly when opened in Notepad. However, an ASCII download can be truncated when the file has many columns, so always use BIN instead of ASCII for download.
  • It’s not a good approach to open the downloaded file in Excel directly. Instead, open it from within Excel (Open -> new file); Excel then pops up the Text Import Wizard and suggests 65001 (UTF-8) by default.
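
The truncation described in the points above can be checked locally by comparing line lengths and column counts of the two downloads. The Python sketch below uses made-up data instead of real files. The tab delimiter, the cell contents, and the helper names are assumptions for illustration, and the 256-character limit is the figure reported in the comments on this post.

```python
from collections import Counter

def column_profile(text: str, delimiter: str = "\t") -> Counter:
    """Map 'number of columns' to 'number of lines with that many columns'."""
    return Counter(len(line.split(delimiter)) for line in text.splitlines() if line)

def longest_line(text: str) -> int:
    """Length in characters of the longest line (0 for empty text)."""
    return max((len(line) for line in text.splitlines()), default=0)

# Hypothetical file content: 3 rows x 49 tab-separated columns
full = "\n".join(
    "\t".join(f"列{c:02d}值{r}" for c in range(49)) for r in range(3)
)
# Simulate a download that cuts every line at 256 characters
cut = "\n".join(line[:256] for line in full.splitlines())

print("full:", dict(column_profile(full)), longest_line(full))
print("cut: ", dict(column_profile(cut)), longest_line(cut))
print("bytes:", len(full.encode("utf-8")), "vs", len(cut.encode("utf-8")))
```

Same row count, fewer columns, and a smaller byte size is the pattern to look for. For real downloads, the text would be read from the two files (for example with `open(path, encoding="utf-8").read()`, where `path` is whatever name was used in CG3Y).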

  • Two types of errors can appear when displaying Chinese characters in the downloaded DAT/TXT file, while English text and numbers still show correctly (so the file as a whole is not garbled): either every Chinese character is shown as #, or every Chinese character is shown as garbled text.
  • # means the file was generated incorrectly, so check the file generation logic. Garbled text points to a conversion issue, so check the conversion logic or the conversion steps after generation. Notepad can save a file as <UTF-8 with BOM>; give that a try if you see the second type of error.
  • The same issue can occur with Japanese characters or any other language that uses a Double Byte Character Set (DBCS). If you download those files from AL11 with a program that calls a function module, please take the code page into account there as well.
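
The two error types differ at the byte level, which is why they need different fixes. The Python sketch below imitates both outside SAP. It is an illustration under stated assumptions, not what SAP does internally: the helper names are hypothetical, `latin-1` stands in for whatever wrong code page is involved, and the `#` placeholder mirrors the symptom described above.

```python
import codecs

def replace_unencodable(text: str, codec: str, placeholder: str = "#") -> bytes:
    """Lossy write: every character the codec lacks becomes the placeholder."""
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode(codec)
        except UnicodeEncodeError:
            out += placeholder.encode(codec)
    return bytes(out)

def misread(text: str, written_as: str = "utf-8", read_as: str = "latin-1") -> str:
    """Correct bytes, wrong code page on reading: garbled but recoverable."""
    return text.encode(written_as).decode(read_as)

def repair(garbled: str, written_as: str = "utf-8", read_as: str = "latin-1") -> str:
    """Undo misread() by reversing the wrong decoding step."""
    return garbled.encode(read_as).decode(written_as)

def add_utf8_bom(data: bytes) -> bytes:
    """Prefix the UTF-8 byte order mark once, like saving as 'UTF-8 with BOM'."""
    return data if data.startswith(codecs.BOM_UTF8) else codecs.BOM_UTF8 + data

sample = "数量 100"
print(replace_unencodable(sample, "latin-1"))  # b'## 100': the Chinese text is gone
garbled = misread(sample)
print(repair(garbled) == sample)               # True: the bytes were intact all along
print(add_utf8_bom(sample.encode("utf-8"))[:3])  # b'\xef\xbb\xbf'
```

Once a character has been written as #, no later conversion can bring it back, so the fix belongs in the generation step. Garbled text still contains the original bytes, so reading the file with the right code page, or adding a BOM so that tools detect UTF-8, is enough.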

 

      4 Comments
      Sandra Rossi

      In Wikipedia, they call it "mojibake".

      Your link doesn't work at "Please check this for more details about AL11".

      The below code works only if the characters are encoded using UTF-8 (provided that you have Unicode SAP system/systems 7.50 or later) - see ABAP doc for more information about default meaning:

       open dataset gs_outfile for output in text mode encoding default.

      CG3Y/CG3Z are only available on SAP ERP (R/3, S/4HANA, etc.)

      I didn't get your point about "BIN file is much larger than the ASCII file." BIN means the file is downloaded without conversion, the size on your laptop is not different from original. With ASC, all depends on the code page on your laptop (may depend on a setting on SAP GUI and also on Windows). If you have a text file with occidental characters and go from UTF-8 in SAP to old Occidental format on your laptop, the size may be almost identical on your laptop. If you have a file with Chinese characters and your laptop has a Chinese code page, it may go down to 2 bytes per character. To go from 800KB to 280KB, I'm not sure what your exact situation can be.

      Jigang Zhang 张吉刚
      Blog Post Author

      Sandra Rossi

      Good day and thanks for the comments.

      1. 'mojibake', I learned a new word, thanks : P
      2. Oops, I have fixed the broken link now;
      3. "BIN file is much larger than the ASCII file." Here, for the same file generated (by open dataset) at AL11, the file downloaded in BIN format is larger than the one downloaded in ASC format.
      4. In my test case, the file size dropped from 800KB to 280KB because columns were truncated in the ASCII file. The number of rows is the same for BIN and ASC, but ASC has fewer columns (19) than BIN, which has all the columns (49).
      5. There's so much underlying knowledge I don't know, thanks for your time.
      Sandra Rossi

      3 and 4. Yes it's a "bug" in both CG3Y and CG3Z, in ASC mode, they truncate lines at 256 characters, so it's very "dangerous" to use them, and it explains what you experienced. So I understand the meaning of "BIN file is much larger than ASC file" but that misleads people if you don't say it's a bug, and they should only use BIN. Note that CG3Y and CG3Z are for specific usage, they are not intended for general usage.

       

      Jigang Zhang 张吉刚
      Blog Post Author

      Sandra Rossi

      Thanks for letting me know 🙂