Raghu Vamseedhar Reddy Kadipi Reddy

Handling Code page, Character encoding in SAP PI / PO

    As middleware, SAP PI / PO integrates SAP and non-SAP systems that represent data in different formats: text (XML, CSV…) or binary. Sometimes they also encode text differently or use different code pages. This document helps you understand and handle those situations.

    A code page is a table that assigns a number to each character. For example, ‘A’ is 65, ‘a’ is 97, ‘b’ is 98, and so on.

          The screenshots below show the code pages ASCII, ISO 8859-1, CP-1252 and Unicode. HTML versions of the screenshots are attached (please rename .txt to .html).

[Figures: 0 Code page.PNG, Code page.gif, Unicode.PNG]

    ‘A’ is 65. 65 = 100 0001 in binary (64*1 + 32*0 + 16*0 + 8*0 + 4*0 + 2*0 + 1*1). Representing the code-page number as 0s and 1s is encoding.

    100 0001 is 65. Look up 65 in the code page and you get ‘A’. Mapping the number back to a character through the code page is decoding.
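
    The two directions can be sketched in a few lines of Java. This is only an illustration and not part of the mappings later in this document; the class and method names (CodePointDemo, toBits, fromBits) are made up for the example.

```java
class CodePointDemo {
    // Encoding direction: character -> code-page number -> bits.
    static String toBits(char c) {
        return Integer.toBinaryString(c);
    }

    // Decoding direction: bits -> code-page number -> character.
    static char fromBits(String bits) {
        return (char) Integer.parseInt(bits, 2);
    }

    public static void main(String[] args) {
        System.out.println((int) 'A');           // 65
        System.out.println(toBits('A'));         // 1000001
        System.out.println(fromBits("1000001")); // A
    }
}
```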

    Some encodings are fixed length, for example ASCII, ISO 8859-1, cp1252 and UTF-32. ISO 8859-1 and cp1252 always use 1 byte per character. ASCII also occupies 1 byte, but actually uses only 7 bits (the highest bit is unused). UTF-32 always uses 4 bytes.

    Some encodings are variable length, for example UTF-8 and UTF-16. UTF-8 uses 1 byte where possible; if the number is too big to be represented in 1 byte, it uses 2, 3 or 4 bytes. UTF-16 uses 2 bytes where possible and 4 bytes otherwise.
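
    To see the fixed and variable lengths in practice, the following sketch counts the bytes a few sample characters occupy in each encoding. The class and method names are hypothetical, and the charset names are assumed to be available in your JVM. "UTF-16BE" and "UTF-32BE" are used instead of "UTF-16" and "UTF-32" so that Java does not count a BOM as part of the text.

```java
import java.nio.charset.Charset;

class EncodedLengthDemo {
    // Number of bytes the given text occupies in the given encoding.
    static int byteLength(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName)).length;
    }

    public static void main(String[] args) {
        // Unicode escapes keep the samples independent of this source file's own encoding.
        String[] samples = { "A", "\u00E9", "\u20AC", "\uD83D\uDE00" }; // A, é, €, 😀
        String[] charsets = { "windows-1252", "UTF-8", "UTF-16BE", "UTF-32BE" };
        for (String sample : samples) {
            StringBuilder line = new StringBuilder(sample);
            for (String charset : charsets) {
                line.append("  ").append(charset).append('=').append(byteLength(sample, charset));
            }
            System.out.println(line);
        }
    }
}
```

    The Euro sign needs 1 byte in cp1252, 3 bytes in UTF-8, 2 bytes in UTF-16 and 4 bytes in UTF-32. The emoji does not exist in cp1252 at all, so Java substitutes a single ‘?’ byte there.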

UTF-8: UTF-8 is the preferred encoding on the internet. HTML, XML, JSON … are typically encoded in UTF-8 by default.

To understand UTF-8, BOM and endianness, see the YouTube videos “Characters, Symbols and the Unicode Miracle – Computerphile” and “Characters in a computer – Unicode Tutorial UTF-8”.

    Byte Order Mark (BOM): It is a heads-up notice to the target system about the encoding. Some Microsoft Windows applications require a BOM to decode UTF text properly. This is how a BOM works. If we send UTF-8 encoded text, we prefix the byte stream with EF BB BF (hex). The target system reads these bytes and concludes: “This stream starts with EF BB BF, so the text must be UTF-8 and I should use UTF-8 decoding logic.” It does not display EF BB BF. If we send UTF-16 big-endian, we prefix the stream with FE FF (hex), and the target system concludes: “This stream starts with FE FF, so the text must be UTF-16 BE.”

    If the target program is not programmed to understand the BOM, it may interpret a leading EF BB BF (hex) as cp1252 characters. If you see an error or display starting with ï»¿, þÿ or ÿþ, the target program is not decoding the data properly.
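
    The logic a BOM-aware receiver applies, and what a BOM-unaware receiver displays instead, can be sketched as follows. The class BomSniffer and its methods are hypothetical names for this illustration. The sketch covers only the three BOMs mentioned above; UTF-32 BOMs are ignored.

```java
import java.nio.charset.Charset;

class BomSniffer {
    // Returns the encoding announced by a leading BOM, or null if there is none.
    static String detect(byte[] data) {
        if (startsWith(data, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        if (startsWith(data, 0xFE, 0xFF)) return "UTF-16BE";
        if (startsWith(data, 0xFF, 0xFE)) return "UTF-16LE";
        return null;
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            // Mask to compare as unsigned values; Java bytes are signed.
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    // What a BOM-unaware program shows when it decodes the same bytes as cp1252.
    static String asCp1252(byte[] data) {
        return new String(data, Charset.forName("windows-1252"));
    }

    public static void main(String[] args) {
        byte[] utf8WithBom = { (byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'A' };
        System.out.println(detect(utf8WithBom));   // UTF-8
        System.out.println(asCp1252(utf8WithBom)); // BOM bytes become three visible characters before the A
    }
}
```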


BOM.gif

    To test whether the source system, PI/PO and the target system use the proper encoding, ask the source system to send the Euro sign € in one of the data elements. If the target system does not decode € properly, there is an issue with the code page / encoding.

Notepad.gif

Why is the Euro sign € displayed as â‚¬?

€ -> U+20AC (hex) -> 0010 0000 1010 1100 -> 11100010 10000010 10101100 (UTF-8) -> E2 82 AC -> â‚¬ (when the three bytes are read as cp1252)
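
The chain above can be reproduced in Java: encode the text as UTF-8, then decode the bytes with the wrong code page. This is a minimal sketch with hypothetical names (EuroMojibake, garble, repair). The repair step only works when every garbled byte has a cp1252 mapping, so treat it as a diagnostic aid rather than a general fix.

```java
import java.nio.charset.Charset;

class EuroMojibake {
    static final Charset UTF8 = Charset.forName("UTF-8");
    static final Charset CP1252 = Charset.forName("windows-1252");

    // Encode correctly as UTF-8, then decode with the wrong code page.
    static String garble(String text) {
        return new String(text.getBytes(UTF8), CP1252);
    }

    // Reverse the mistake: recover the original bytes, then decode them as UTF-8.
    static String repair(String garbled) {
        return new String(garbled.getBytes(CP1252), UTF8);
    }

    public static void main(String[] args) {
        String euro = "\u20AC";                   // the Euro sign
        String garbled = garble(euro);
        System.out.println(garbled.length());     // 3 characters instead of 1
        System.out.println(repair(garbled).equals(euro));
    }
}
```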

Please go through How to Work with Character Encodings in Process Integration.

Here are some points to note from above document.

    When reading XML, SAP recommends setting “File Type” to ‘Binary’, because the XML prolog already carries the encoding details: <?xml version="1.0" encoding="utf-8"?>. See SAP note 821267.
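
    The reason binary mode is safe for XML can be illustrated outside PI with the standard JAXP parser: when it is handed raw bytes, it picks the decoder from the XML declaration. This is a sketch under that assumption; the class XmlPrologEncoding and its methods are hypothetical and have nothing to do with the file adapter's own implementation.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class XmlPrologEncoding {
    // Parses raw bytes; the parser chooses the decoder from the XML declaration.
    static Document parse(byte[] xmlBytes) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmlBytes));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Encoding named in the XML declaration, or null if none was declared.
    static String declaredEncoding(byte[] xmlBytes) {
        return parse(xmlBytes).getXmlEncoding();
    }

    // Text of the root element, decoded by the parser.
    static String rootText(byte[] xmlBytes) {
        return parse(xmlBytes).getDocumentElement().getTextContent();
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><name>Caf\u00E9</name>";
        byte[] bytes = xml.getBytes(Charset.forName("ISO-8859-1"));
        System.out.println(declaredEncoding(bytes));
        System.out.println(rootText(bytes).equals("Caf\u00E9"));
    }
}
```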

    You can use the adapter modules below to change the encoding.

    MessageTransformBean: Transform.ContentType = text/xml;charset="cp1252"

    TextCodepageConversionBean: Conversion.charset = "utf-8"

    XMLAnonymizerBean: anonymizer.encoding = "utf-8"

    FYI: cp1252 is a superset of ASCII and of the printable characters of ISO 8859-1. Unicode, and therefore UTF-8, covers every cp1252 character, but the number of bytes used may vary.
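
    A short sketch makes the relationship concrete: plain ASCII text yields identical bytes in all of these encodings, while characters above 127 differ. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

class CodePageOverlap {
    static byte[] encode(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName));
    }

    static boolean canEncode(char c, String charsetName) {
        return Charset.forName(charsetName).newEncoder().canEncode(c);
    }

    public static void main(String[] args) {
        // Plain ASCII text: the same bytes in every one of these encodings.
        System.out.println(Arrays.equals(encode("PI", "US-ASCII"), encode("PI", "UTF-8")));
        // The Euro sign exists in cp1252 but not in ISO 8859-1.
        System.out.println(canEncode('\u20AC', "windows-1252"));
        System.out.println(canEncode('\u20AC', "ISO-8859-1"));
        // e-acute: one byte in cp1252, two bytes in UTF-8.
        System.out.println(encode("\u00E9", "windows-1252").length);
        System.out.println(encode("\u00E9", "UTF-8").length);
    }
}
```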

Let's handle the issues mentioned in sections 5 and 6 of How to Work with Character Encodings in Process Integration.

1) Java mapping to change code-page/encoding. For valid encoding names, see Java's Supported Encodings.


package com.map;
import com.sap.aii.mapping.api.*;
import java.io.*;
public class ChangeEncoding_JavaMapping extends AbstractTransformation {
    @Override
    public void transform(TransformationInput transformationInput, TransformationOutput transformationOutput) throws StreamTransformationException {
        try {
            InputStream inputStream = transformationInput.getInputPayload().getInputStream();
            OutputStream outputStream = transformationOutput.getOutputPayload().getOutputStream();
            //Read the whole input. available() is only an estimate, so loop until end of stream.
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = inputStream.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            //Decode input bytes as cp1252 and write output as UTF-8.
            String inS = new String(buffer.toByteArray(), "Cp1252");
            outputStream.write(inS.getBytes("UTF-8"));
        } catch (Exception ex) {
            getTrace().addDebugMessage(ex.getMessage());
            throw new StreamTransformationException(ex.toString());
        }
    }
}

Result: –

1JavaMapping.PNG
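
The conversion itself does not depend on the SAP mapping API, so it can be tried outside PI first. The following is a sketch with a hypothetical class name (Transcoder). It streams characters from one encoding to another instead of holding the whole payload in one String, which is a design choice of this sketch and not how the mapping above works.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;

class Transcoder {
    // Streams characters from one encoding to another without loading the whole payload.
    static void transcode(InputStream in, String from, OutputStream out, String to) throws IOException {
        Reader reader = new InputStreamReader(in, from);
        Writer writer = new OutputStreamWriter(out, to);
        char[] buffer = new char[4096];
        int n;
        while ((n = reader.read(buffer)) != -1) {
            writer.write(buffer, 0, n);
        }
        writer.flush(); // push out any bytes still held by the encoder
    }

    // Convenience wrapper for in-memory payloads, handy for unit tests.
    static byte[] transcode(byte[] input, String from, String to) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            transcode(new ByteArrayInputStream(input), from, out, to);
            return out.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

For example, the single cp1252 byte 0x80 (the Euro sign) comes out as the three UTF-8 bytes E2 82 AC.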

2) Java mapping to handle Quoted-Printable input.


package com.map;
import com.sap.aii.mapping.api.*;
import java.io.*;
public class QuotedPrintable_JavaMapping extends AbstractTransformation {
    @Override
    public void transform(TransformationInput transformationInput, TransformationOutput transformationOutput) throws StreamTransformationException {
        try {
            InputStream inputStream = transformationInput.getInputPayload().getInputStream();
            OutputStream outputStream = transformationOutput.getOutputPayload().getOutputStream();
            //Decode quoted-printable input to raw bytes. Add the JavaMail library (javax.mail) when compiling.
            inputStream = javax.mail.internet.MimeUtility.decode(inputStream, "quoted-printable");
            //Copy decoded content to output. Loop until end of stream; available() is only an estimate.
            byte[] chunk = new byte[8192];
            int n;
            while ((n = inputStream.read(chunk)) != -1) {
                outputStream.write(chunk, 0, n);
            }
        } catch (Exception ex) {
            getTrace().addDebugMessage(ex.getMessage());
            throw new StreamTransformationException(ex.toString());
        }
    }
}

Result: –

2JavaMapping.PNG
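
To show what the quoted-printable decoding step does, here is a deliberately simplified decoder in plain Java. It is an illustration only, with hypothetical names, and it assumes well-formed input: "=XX" becomes one byte, "=" followed by a line break is a soft break that is dropped, and everything else is copied. For real interfaces the JavaMail call in the mapping above is the better choice.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;

class QuotedPrintableSketch {
    static byte[] decode(String qp) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < qp.length(); i++) {
            char c = qp.charAt(i);
            if (c != '=') {
                out.write(c);                            // literal ASCII character
            } else if (qp.startsWith("\r\n", i + 1)) {
                i += 2;                                  // soft line break "=CRLF": skip it
            } else if (qp.startsWith("\n", i + 1)) {
                i += 1;                                  // tolerate a bare LF soft break
            } else {
                // "=XX": two hex digits stand for one byte
                out.write(Integer.parseInt(qp.substring(i + 1, i + 3), 16));
                i += 2;
            }
        }
        return out.toByteArray();
    }

    // The decoded bytes still need a charset to become text.
    static String decodeToString(String qp, String charsetName) {
        return new String(decode(qp), Charset.forName(charsetName));
    }
}
```

Note that quoted-printable only restores the original bytes. Whether those bytes are UTF-8, cp1252 or something else is a separate question, which is why decodeToString takes a charset name.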

3) Java mapping to handle Base64 input.


package com.map;
import com.sap.aii.mapping.api.*;
import java.io.*;
public class Base64_JavaMapping extends AbstractTransformation {
    @Override
    public void transform(TransformationInput transformationInput, TransformationOutput transformationOutput) throws StreamTransformationException {
        try {
            InputStream inputStream = transformationInput.getInputPayload().getInputStream();
            OutputStream outputStream = transformationOutput.getOutputPayload().getOutputStream();
            //Read the whole Base64 input. available() is only an estimate, so loop until end of stream.
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = inputStream.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            //Decode Base64 input content to output content. FYI: Java 8 has java.util.Base64.
            byte[] b = javax.xml.bind.DatatypeConverter.parseBase64Binary(new String(buffer.toByteArray(), "US-ASCII"));
            //Alternative: sun.misc.BASE64Decoder is an internal class and not recommended. Use whichever works for you.
            //byte[] b = new sun.misc.BASE64Decoder().decodeBuffer(inputStream);
            outputStream.write(b);
        } catch (Exception ex) {
            getTrace().addDebugMessage(ex.getMessage());
            throw new StreamTransformationException(ex.toString());
        }
    }
}

Result: –

3JavaMapping.PNG
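
For readers on Java 8 or later, the same decoding can be tried outside PI with java.util.Base64. This is a sketch only; whether your PI / PO system runs a JVM that includes this class is an assumption you need to check. The class name Base64Sketch is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class Base64Sketch {
    // The MIME decoder ignores line breaks, which Base64 payloads often contain.
    static byte[] decode(String base64Text) {
        return Base64.getMimeDecoder().decode(base64Text);
    }

    public static void main(String[] args) {
        // "4oKs" is the Base64 form of the bytes E2 82 AC, the UTF-8 Euro sign.
        byte[] bytes = decode("4oKs");
        System.out.println(new String(bytes, StandardCharsets.UTF_8).equals("\u20AC"));
    }
}
```

As with quoted-printable, Base64 only restores bytes; the charset of the decoded text is a separate decision.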

4) Java mapping to add BOM.


package com.map;
import com.sap.aii.mapping.api.*;
import java.io.*;
public class BOM_JavaMapping extends AbstractTransformation {
    @Override
    public void transform(TransformationInput transformationInput, TransformationOutput transformationOutput) throws StreamTransformationException {
        try {
            InputStream inputStream = transformationInput.getInputPayload().getInputStream();
            OutputStream outputStream = transformationOutput.getOutputPayload().getOutputStream();
            //Prefix BOM. For UTF-8 use "0xEF,0xBB,0xBF". For UTF-16BE use "0xFE,0xFF". For UTF-16LE use "0xFF,0xFE".
            //The payload itself must already be in the encoding that the BOM announces.
            outputStream.write(0xEF);  outputStream.write(0xBB);  outputStream.write(0xBF);
            //Copy input content to output content. Loop until end of stream; available() is only an estimate.
            byte[] chunk = new byte[8192];
            int n;
            while ((n = inputStream.read(chunk)) != -1) {
                outputStream.write(chunk, 0, n);
            }
        } catch (Exception ex) {
            getTrace().addDebugMessage(ex.getMessage());
            throw new StreamTransformationException(ex.toString());
        }
    }
}

Result: – BOM characters will not be displayed.

4JavaMapping.PNG
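
When testing, it helps to check the output bytes directly, because viewers do not display the BOM. The following sketch, with hypothetical names, adds a UTF-8 BOM only when the payload does not already start with one, so running it twice does not produce a double BOM. It works on byte arrays and is independent of the SAP mapping API.

```java
class Utf8Bom {
    private static final byte[] BOM = { (byte) 0xEF, (byte) 0xBB, (byte) 0xBF };

    // True if the payload already starts with EF BB BF.
    static boolean hasBom(byte[] payload) {
        if (payload.length < BOM.length) return false;
        for (int i = 0; i < BOM.length; i++) {
            if (payload[i] != BOM[i]) return false;
        }
        return true;
    }

    // Returns the payload with a UTF-8 BOM in front, leaving it unchanged if one is present.
    static byte[] addBom(byte[] payload) {
        if (hasBom(payload)) return payload;
        byte[] out = new byte[BOM.length + payload.length];
        System.arraycopy(BOM, 0, out, 0, BOM.length);
        System.arraycopy(payload, 0, out, BOM.length, payload.length);
        return out;
    }
}
```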

5) Java mapping to handle XML Escape Sequence.

Not well-formed XML – & issue
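
The linked discussion is not reproduced here. As an assumption about the kind of fix it refers to, the sketch below escapes every ‘&’ that does not already start a predefined entity or a numeric character reference. The class name and the regular expression are illustrative only; the sketch does not handle CDATA sections or entities declared in a DTD.

```java
import java.util.regex.Pattern;

class AmpersandEscaper {
    // An '&' that does not start a predefined entity or a numeric character reference.
    private static final Pattern BARE_AMP =
            Pattern.compile("&(?!(?:amp|lt|gt|quot|apos|#[0-9]+|#x[0-9A-Fa-f]+);)");

    static String escapeBareAmpersands(String xml) {
        return BARE_AMP.matcher(xml).replaceAll("&amp;");
    }

    public static void main(String[] args) {
        System.out.println(escapeBareAmpersands("<n>Tom & Jerry &amp; Co</n>"));
        // <n>Tom &amp; Jerry &amp; Co</n>
    }
}
```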

FYI: for how to create a Java mapping, see How to create Java Mapping in SAP PI / PO.


      16 Comments
      Eng Swee Yeoh

      Hi Raghu

      Thanks for sharing this 😉

      A couple of points of feedback:

      - Can you provide links in a reference section to the different code-pages shown in the animated GIF? It's a bit hard if anyone wants to view the details of a particular codepage as the animation keeps on switching images.

      - It is not recommended to use the Sun Base64 decoder. Neither can we use Java 8's Base64 library (yet!) as the latest PI versions are still using JVM 6.1. Refer to the following thread for the different options for handling Base64.

      Base64 Encoding using UDF

      Rgds

      Eng Swee

      Raghu Vamseedhar Reddy KadipiReddy
      Blog Post Author

      Eng,

      Thank you.

        - I have used GIF to explain the concept better and save space. I would recommend taking a screenshot, when constant picture is required. Code-pages are available on wiki, ASCII, ISO 8859-1, CP-1252, Unicode.

        - I agree, sun.misc.BASE64Decoder is internal class.

      Readers can use javax.xml.bind.DatatypeConverter instead: byte[] b = new byte[inputStream.available()]; inputStream.read(b); b = javax.xml.bind.DatatypeConverter.parseBase64Binary(new String(b));

      FYI, this option did not give me the correct result.

      Document is updated with your suggestions.

      Eng Swee Yeoh

      Hi Raghu

      Thanks for the update. Yes, I understand that it is possible to take a screenshot, but it would have been nicer to have the links - just my two cents 😉

      Hmm.. not sure why javax.xml.bind.DatatypeConverter.parseBase64Binary is not giving you the correct results. I do have some interfaces running on PO7.4 using both the parse and print methods to handle Base64 decode/encode and they are working fine. Anyway, there are a few options around so readers can choose whichever works on their system 😉

      Rgds

      Eng Swee

      Raghu Vamseedhar Reddy KadipiReddy
      Blog Post Author

      Eng,

      The HTML files I prepared, which were used for the screenshots, are attached for your reference. Please rename .txt to .html.

      Ameer Khan

      Hi Raghu,

      Very nice blog.

      I have the similar scenario, where the target system wants the file as UTF-8 with BOM encoding.

      I had used the JAVA mapping as suggested and the encoding works fine in the Operation Mapping.

      But after this there is a FCC which will convert the XML to Plain and in this way the BOM encoding is missed out.

      I can think about a solution of generating a flat file in a temporary folder and separate channel will poll this file and I can use your JAVA mapping to add these 3 characters and send the file to the target system.

      or

      Can you help me if there can be another quick solution for this.

      I can see all experts has commented on this blog, so will be of real help if anyone can help me with this.

      Thanks

      Ameer.

      Dipen Pandya

      Hi Raghu,

      Thanks for sharing.

      Nicely explained.

      BR,

      Dipen.

      Former Member

      Hi Raghu,

      Thanks for sharing such useful information.

      I currently have the same requirement: to add a BOM to the file. As you mentioned, I installed NWDS and imported the jar file into the ESR in PI. On the first attempt it worked and I could see the BOM added to the file. But when I tested with one more file the following day, it did not work. I checked that all the settings are unchanged. I deleted the Java mapping, created a new one with new names and imported it again, but the BOM is still not added to the file. Could you please help me with this?

      Thanks

      Praveen

      Raghu Vamseedhar Reddy KadipiReddy
      Blog Post Author

      Praveen,

      Please follow How to create Java Mapping in SAP PI / PO and the code given in the 4th example in the blog above.

      Former Member

      Hi Raghu,

      I am getting below error in RECEIVER REST adapter audit logs, not sure what is causing this error. I believe it is some invalid character in the response. I am trying to do a JSON to XML conversion in response.


      Information Server returned code: 200

      MP: exception caught with cause java.lang.RuntimeException:

      com.ctc.wstx.exc.WstxIOException: Invalid white space character (0x8) in text to output (in xml 1.1, could output as a character entity)

      Exception caught by adapter framework: com.ctc.wstx.exc.WstxIOException: Invalid white space character (0x8) in text to output (in xml 1.1, could output as a character entity)

      Transmitting the message using connection JPR failed, due to: com.sap.engine.interfaces.messaging.api.exception.MessagingException: java.lang.RuntimeException: com.ctc.wstx.exc.WstxIOException: Invalid white space character (0x8) in text to output (in xml 1.1, could output as a character entity)

      As this message is getting stuck in the receiver REST adapter on the synchronous response side, I believe I would need to either write an adapter module or change the code page to ISO Latin / UTF-16?

      I tried to put the data format of response json to cp1252 but still same error.


      Thx

      Ravijeet

      Shivduttsinh Mahida

      Hi Raghu,

       

      I have a requirement to generate a .txt file from PI encoded in UTF-8 with BOM. I implemented example 4 suggested here and added it to the operation mapping after my usual message mapping, but it still doesn't work. Could you please let me know what I am doing wrong here?

      Thanks

      Shiv

       

      Ameer Khan

      Hi Mahida,

       

      I am facing the same issue too. Can you please let me know if you have solved it?

       

      Thanks

      Ameer Khan

      Monica Gonzalez

      Hi all,

       

      We are working with PO 7.5 and need to know all the character encodings supported by this PO version.

       

      Does anyone have updated documentation?

       

      Thanks and best regards

      Maheshkumar gattu

      I have an issue when the file has a degree symbol, like 30°: it is converted to 30Â°. What needs to be done?

       

       

      Thanks,

      Mahesh

       

      Riccardo Miggiano

      Nice Blog, very useful

      OILPI OILPI

      Hi,

      This is a very nice blog, thank you for preparing it for us.

      My scenario is a little different from yours. Please give me some ideas about my scenario.

      I have a requirement where I need convert the XML to ISO 8583 format.

      I have searched and found we can do conversion from XML to ISO-8859-1 using sap note 960663 .

      can i use the same process to convert the data into ISO 8583 ?

      Please help me with this.

      Matthias Fricke

      This blog is of rare quality here on the SAP Net.

      Thanks for sharing!

      Matthias