SAP Cloud Integration – Fixed Length Data Conversi...

akirwsky · ‎07-08-2023

Dear Reader,

This blog describes how to implement fixed-length data conversion for data that contains full-width characters, such as Chinese and Japanese. A full-width character occupies a full-width square and, in double-byte encodings such as Shift_JIS, takes two bytes, whereas a half-width character takes one byte. There are some resources explaining how to convert data to fixed length, and most of them use XSLT mapping. That approach did not work for my scenario with a mix of half-width and full-width characters: I realized that the XSLT mapping solution works fine only when the data contains ONLY half-width characters. Therefore, I built my own solution, which works well with BOTH half-width and full-width characters. If you have any questions or suggestions to improve my solution, please write them in the comments. Thank you in advance for reading.

Scenario:

In my scenario, data must be sent to the target server in fixed-length format, and the data contains both half-width and full-width characters (Japanese Hiragana and Kanji). To explain why I had to use my solution, let’s compare the output generated by the XSLT mapping solution with the output generated by my solution.

Field names and the fixed lengths.

No.	Field Name	Fixed Length (in bytes)
1	Number	8
2	Type	2
3	Date	8
4	Text	30
5	PIC	10
6	Location	6

Sample data

<Table>
  <Record>
    <Number>00000001</Number>
    <Type>A</Type>
    <Date>20230701</Date>
    <Text>あいうえお</Text>
    <PIC>Tim</PIC>
    <Location>123456</Location>
  </Record>
  <Record>
    <Number>00000002</Number>
    <Type>B</Type>
    <Date>20230701</Date>
    <Text>漢字-FullWidth</Text>
    <PIC>Lisa</PIC>
    <Location>123456</Location>
  </Record>
  <Record>
    <Number>00000003</Number>
    <Type>C</Type>
    <Date>20230701</Date>
    <Text>FullWidthひらがな</Text>
    <PIC>Mike</PIC>
    <Location>123456</Location>
  </Record>
   :
</Table>

Failure Case: XSLT Mapping

1. Input sample data

Use a content modifier to set the sample data as the message body.

2. XSLT mapping

To learn how to create the XSLT mapping file, you can refer to this resource: Fixed Length File Generation Scenario Through SAP Cloud Platform Integration | SAP Blogs

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output omit-xml-declaration="yes" indent="yes" method="text"/>
<xsl:strip-space elements="*"/>

<xsl:template match="Record">
<xsl:apply-templates />
<xsl:text>
</xsl:text>
</xsl:template>

<xsl:template match="Number">
<xsl:value-of select="substring(concat(., '        '), 1, 8)"/>
</xsl:template>

<xsl:template match="Type">
<xsl:value-of select="substring(concat(., '  '), 1, 2)"/>
</xsl:template>

<xsl:template match="Date">
<xsl:value-of select="substring(concat(., '        '), 1, 8)"/>
</xsl:template>

<xsl:template match="Text">
<xsl:value-of select="substring(concat(., '                              '), 1, 30)"/>
</xsl:template>

<xsl:template match="PIC">
<xsl:value-of select="substring(concat(., '          '), 1, 10)"/>
</xsl:template>

<xsl:template match="Location">
<xsl:value-of select="substring(concat(., '      '), 1, 6)"/>
</xsl:template>

</xsl:stylesheet>

I also tried other character encodings, such as Shift-JIS and MS932, by changing the encoding in the XML declaration.

Shift-JIS

<?xml version="1.0" encoding="Shift-JIS"?>

MS932

<?xml version="1.0" encoding="MS932"?>

3. Check the output data

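As you can see, the first characters of the PIC field values are not aligned. The XPath functions substring() and concat() count characters, not bytes, so a value that contains full-width characters ends up longer than its fixed length in bytes, whichever encoding is declared. I used Sakura Editor to make it easy to count the bytes of characters. -> Releases · sakura-editor/sakura (github.com)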

[Screenshots: output in UTF-8, Shift-JIS, and MS932]
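To make the cause concrete, here is a minimal sketch in Groovy that imitates the character-based padding done by the XSLT above. The helper name padByCharacters is hypothetical and only for illustration, and Shift_JIS is assumed as the target encoding.

```groovy
// Character-based padding, equivalent in effect to
// substring(concat(., spaces), 1, length) in the XSLT mapping.
// padByCharacters is a hypothetical helper name.
String padByCharacters(String value, int length) {
    return (value + (' ' * length)).substring(0, length)
}

def half = padByCharacters("FullWidth", 30)   // half-width characters only
def full = padByCharacters("あいうえお", 30)    // five full-width characters

// Both results are 30 characters long, but their byte lengths differ.
println half.getBytes("Shift_JIS").length     // 30
println full.getBytes("Shift_JIS").length     // 35
```

The second value is 5 bytes too long, which is exactly the shift that pushes the following PIC field out of alignment.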

Successful Case:

This solution uses three content modifiers and a Groovy script. The three content modifiers only pass parameters to the Groovy script, so if you prefer to manage the parameters in the script itself, you can remove them. I implemented it this way so that anyone who doesn't know how to code can just copy and paste my code.

1. Input sample data.

(Same as in the failure case)

2. Set the field names as properties, using a content modifier.

Use the same field names, in the same sequence, as in the input data. The script reads them from properties named 'field1', 'field2', and so on.

3. Set the fixed length of each field as properties, using a content modifier.

For each field, enter its fixed length. Use the same sequence of fields as in the previous step: for example, 'length1' must be the length of the field named in the 'field1' property of the previous content modifier. Make sure to specify the fixed length in bytes, not in characters. Even though a half-width character and a full-width character each count as one character, they differ in byte length: in Shift_JIS, one half-width character is one byte, but one full-width character is two bytes. Therefore, you always need to measure the fixed length in bytes.
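The difference between characters and bytes can be checked with a few lines of Groovy. This is a small sketch using values from the sample data; byteLength is a hypothetical helper name. Note that the byte count depends on the encoding: the same full-width character is two bytes in Shift_JIS but three bytes in UTF-8.

```groovy
// Byte length of a value in a given encoding (hypothetical helper name).
int byteLength(String value, String charset) {
    return value.getBytes(charset).length
}

["Tim", "あいうえお", "FullWidthひらがな"].each { value ->
    println "${value}: ${value.length()} characters, " +
            "${byteLength(value, 'Shift_JIS')} bytes in Shift_JIS, " +
            "${byteLength(value, 'UTF-8')} bytes in UTF-8"
}
// Tim: 3 characters, 3 bytes in Shift_JIS, 3 bytes in UTF-8
// あいうえお: 5 characters, 10 bytes in Shift_JIS, 15 bytes in UTF-8
// FullWidthひらがな: 13 characters, 17 bytes in Shift_JIS, 21 bytes in UTF-8
```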

4. Set other parameters such as the record name and the number of fields.

Set the name of the element that encloses each record, and the number of fields. The script reads these two values from the message headers. In this demonstration, because each record in the sample data is enclosed in '<Record></Record>', type 'Record' in the source value of 'RecordName'. And because we have 6 fields, type 6 in the source value of 'NumberOfFields'.
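For the sample data, the parameters from steps 2 to 4 can be pictured as two plain maps. This is only an illustrative sketch: props and hdrs are hypothetical stand-ins for the exchange properties and message headers, and the values are kept as strings on the assumption that the content modifier stores them that way. Summing the lengths is a quick sanity check that every record will be 64 bytes long.

```groovy
// Hypothetical stand-ins for the properties set in steps 2 and 3
def props = [
    field1: "Number",   length1: "8",
    field2: "Type",     length2: "2",
    field3: "Date",     length3: "8",
    field4: "Text",     length4: "30",
    field5: "PIC",      length5: "10",
    field6: "Location", length6: "6"
]

// Hypothetical stand-in for the headers set in step 4
def hdrs = [RecordName: "Record", NumberOfFields: "6"]

int numFields = Integer.parseInt(hdrs.NumberOfFields)

// Build the keys by plain string concatenation so that the map lookup uses String keys
int recordBytes = (1..numFields).sum { i -> Integer.parseInt(props["length" + i]) }

println "Each record should be ${recordBytes} bytes long"   // 64
```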

5. Write a Groovy script to convert the data to fixed length.

The script takes the headers and properties prepared in the previous steps and uses them to iterate over the records and fields. It pads each field value with spaces, or trims it if it is too long, so that the value occupies exactly its fixed length in bytes.

import com.sap.gateway.ip.core.customdev.util.Message;

def Message processData(Message message) {
    //take the message body, properties, and headers
    def body       = message.getBody(java.lang.String) as String;
    def properties = message.getProperties();
    def headers    = message.getHeaders();

    //split the body on the opening tag of the record
    //element 0 holds everything before the first record, so the records start at index 1
    String record_name        = headers.get("RecordName")
    String[] lines_record     = body.split("<"+record_name+">", -1);
    int num_records           = lines_record.length;

    //get the number of fields
    int num_fields = Integer.parseInt(headers.get("NumberOfFields") as String);

    //you will send this new message body
    def new_body   = new StringBuffer();

    //iterate over records
    for(int r=1; r<num_records; r++){
        //iterate over fields
        for(int f=1; f<=num_fields; f++){

            //take the field value (the text between <field_name> and </field_name>)
            String field_name      = properties.get("field"+String.valueOf(f));
            String[] lines1        = lines_record[r].split("<"+field_name+">", -1);
            String[] lines2        = lines1[1].split("</"+field_name+">", 2);
            String field_value     = lines2[0];

            //take the fixed length (in bytes) for the field
            int    fixed_length    = Integer.parseInt(properties.get("length"+String.valueOf(f)) as String);

            //you will keep the padded field value here
            String padded;

            //calculate the number of spaces to pad
            //you need to choose the right character encoding for your language. Ex) Japanese -> Shift_JIS
            int bytes           = field_value.getBytes("Shift_JIS").length;
            int num_to_pad      = fixed_length - bytes;

            //if padding is necessary for the fixed length
            if (num_to_pad > 0){
                //pad the field value with spaces
                padded = field_value + (" " * num_to_pad);
            }

            //case when the bytes of the field value reach or exceed the fixed length
            else {
                int cur_bytes = bytes;

                //trim the last exceeding characters one by one
                while (cur_bytes > fixed_length){
                    field_value = field_value.substring(0, field_value.length()-1);
                    cur_bytes   = field_value.getBytes("Shift_JIS").length;
                }

                //if the last trimmed character was multi-byte, the value can now be too short,
                //so fill the remaining bytes with spaces
                if (cur_bytes < fixed_length)
                    field_value = field_value + (" " * (fixed_length - cur_bytes));

                padded = field_value;
            }

            //append to the new_body
            new_body.append(padded);
        }

        //append a line break (CR/LF)
        new_body.append('\r\n');
    }

    //set the new body back to the message body as a String
    message.setBody(new_body.toString());
    return message;
}
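If you want to try the padding and trimming logic outside of an integration flow, it can be isolated into one small function. The following is a minimal sketch, not part of the script above: fitToBytes is a hypothetical helper name, and it assumes that a space is a single byte in the chosen encoding, which holds for Shift_JIS and UTF-8.

```groovy
// Pad or trim a value so that it occupies exactly fixedLength bytes
// in the given encoding (hypothetical helper name).
String fitToBytes(String value, int fixedLength, String charset) {
    String result = value
    // Drop trailing characters one by one until the encoded value fits
    while (result.getBytes(charset).length > fixedLength) {
        result = result.substring(0, result.length() - 1)
    }
    // Fill the remaining bytes with spaces (a space is assumed to be one byte)
    int remaining = fixedLength - result.getBytes(charset).length
    return result + (' ' * remaining)
}

println "[" + fitToBytes("Tim", 10, "Shift_JIS") + "]"        // [Tim       ]
println "[" + fitToBytes("あいうえお", 5, "Shift_JIS") + "]"    // [あい ]
```

In the second call the value is trimmed to two full-width characters (4 bytes), and one space fills the remaining byte, because a third full-width character would not fit in 5 bytes.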

6. Check the output file.

As you can see, the first characters of each PIC field and of each Location field are aligned beautifully.
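Besides checking by eye in an editor, the alignment can also be verified by counting the bytes of every line. The sketch below is an assumption-laden illustration: lineByteLengths is a hypothetical helper name, the two lines are built by hand in the layout expected for the first two sample records, and the output is assumed to be encoded in Shift_JIS.

```groovy
// Byte length of every non-empty line, split on the CR/LF written by the script
// (hypothetical helper name).
List<Integer> lineByteLengths(String output, String charset) {
    return output.split("\r\n").findAll { it }.collect { it.getBytes(charset).length }
}

// Two records built by hand in the expected fixed-length layout
String line1 = "00000001" + "A " + "20230701" + "あいうえお" + (" " * 20) + "Tim" + (" " * 7) + "123456"
String line2 = "00000002" + "B " + "20230701" + "漢字-FullWidth" + (" " * 16) + "Lisa" + (" " * 6) + "123456"
String output = line1 + "\r\n" + line2 + "\r\n"

println lineByteLengths(output, "Shift_JIS")   // [64, 64]
```

Every line should have the same byte length, here 8 + 2 + 8 + 30 + 10 + 6 = 64.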