MS Excel to CSV using Java Mapping in SAP PO

RaghuVamseedhar · ‎09-02-2022

Here is my experience of converting Microsoft Excel file to CSV file in SAP Process Orchestration. Open source Apache POI library is used to read MS Excel files.

Maven pom.xml file

<?xml version="1.0" encoding="UTF-8"?>

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">

    <modelVersion>4.0.0</modelVersion>

    <groupId>com.mycompany</groupId>

    <artifactId>JavaMapping_ExcelToCSV</artifactId>

    <version>1.0</version>

    <packaging>jar</packaging>

    <dependencies>

        <dependency>

            <groupId>manual</groupId>

            <artifactId>SAPPO</artifactId><!--Include com.sap.aii.mapping.api jar file.-->

            <version>1.0</version>

        </dependency>

        <dependency>

            <groupId>org.apache.poi</groupId>

            <artifactId>poi-ooxml</artifactId>

            <version>5.2.2</version>

        </dependency>

    </dependencies>

    <build>

        <resources>

            <resource>

                <directory>src/main/resources</directory>

            </resource>

            <resource>

                <directory>src/main/java</directory>

                <includes>

                    <include>**/*.java</include><!--Include .java source code in final jar file.-->

                </includes>

            </resource>

        </resources>

    </build>

    <properties>

        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>

        <maven.compiler.source>1.8</maven.compiler.source>

        <maven.compiler.target>1.8</maven.compiler.target>

    </properties>

</project>

Java Mapping code

package com.mycompany.javamapping_exceltocsv;



import com.sap.aii.mapping.api.*;

import java.io.*;

import java.util.Iterator;

import org.apache.poi.ss.usermodel.*;

import org.apache.poi.xssf.usermodel.*;



public class JavaMapping_ExcelToCSV extends AbstractTransformation {



    @Override

    public void transform(TransformationInput transformationInput, TransformationOutput transformationOutput) throws StreamTransformationException {

        // Convert Excel (.xlsx) to text (.csv) file.

        InputStream inputStream = transformationInput.getInputPayload().getInputStream();

        OutputStream outputStream = transformationOutput.getOutputPayload().getOutputStream();

        DataFormatter dataFormatter = new DataFormatter();

        StringBuilder output = new StringBuilder();

        

        try (XSSFWorkbook workbook = new XSSFWorkbook(inputStream)) {

            XSSFSheet sheet = workbook.getSheetAt(0); //Read data in sheet 1.

            Iterator<Row> rowIterator = sheet.iterator();

            while (rowIterator.hasNext()) {

                Row row = rowIterator.next();

                Iterator<Cell> cellIterator = row.iterator();

                while (cellIterator.hasNext()) {

                    Cell cell = cellIterator.next();

                    output.append(dataFormatter.formatCellValue(cell)).append(",");

                }

                output.append("\r\n");

            }

            outputStream.write(output.toString().getBytes());

        } catch (IOException exception) {

            getTrace().addDebugMessage(exception.getMessage());

            throw new StreamTransformationException(exception.toString());

        }

    }

}

As input is non XML and Java Mapping is used, Data Type structure can be dummy structure and Integrated Configuration Object (ICO) should not have Software Component Version of Sender Interface.

Testing the Java Mapping in ESR.

Here are some points to note. I prefer Java Mapping over the Adapter module, as the former is relatively easy to maintain.

Maven file (pom.xml) helps to get all dependencies jar files in IDE (Eclipse or NetBeans). In ESR we have to import those jar files. Above image has those jar files names. ESR does not allow to import huge jar files (more than 7-8 MB files), more information is in SAP Notes 2311864, 1580914, 3103535. Sometimes storing Excel as test file in Operation Mapping might cause error when opening Operation Mapping object in ESR.

Iterator method used in the above code will skip the cell if it is empty. If you want to capture empty cells, you have to use 'for loop' on 'sheet.getRow(0).getLastCellNum()', assuming the first row has all required columns. Excel can have multiple sheets, but CSV is a flat file, the above code reads only the first sheet.

Method used in above code 'dataFormatter.formatCellValue(cell)' will return value, how it is displayed in Excel. Above code can handle .xlsx files using XSSFWorkbook class. You can encourage users to use the new Excel XML format .xlsx. If old .xls binary format has to be used, use HSSFWorkbook class.

Sometimes in Excel, what we see is not what we get.

If 12345678901234567890 is entered into a cell it is rounded to 12345678901234500000 (after 15 digits 0 is used) and it is displayed as 1.23457E+19.

If 1.12345678901234567890 is entered into a cell it is truncated to 1.12345678901234 (after 15 decimals numbers are truncated). If we copy this cell, it will result in 1.123456789 (only <10 decimals).

If the number is more than 11 digits, the cell is displayed as a scientific number. 100000000000 is displayed as 1E+11.

If more control is needed, use the method 'cell.getCellType()', it will return STRING, NUMERIC, BOOLEAN, FORMULA, BLANK, ERROR, _NONE. Use it in a switch statement to retrieve value based on cell type.

Excel stores dates as sequential serial numbers. January 1, 1900 is serial number 1. Excel stores time as a fraction of 24. Time 06:00 is stored as 0.25, which is 6/24. If a cell type is numeric, we have to check if the cell is formatted as date using the method 'DateUtil.isCellDateFormatted(cell)' before retrieving its value. You can use 'DecimalFormat("#.##")' if you want to avoid scientific display of a number in output. Sometimes user may store number as text type and date as a general number type in Excel. Those scenarios need to be handled in mapping. If comma is present in text cells then it needs to be replaced with period or colon before output CSV is generated.

Please search https://blogs.sap.com/ for more blogs on this topic. You can check this blog on How to create Java Mapping in SAP PI / PO.

MS Excel to CSV using Java Mapping in SAP PO

SAP PI for Beginners

ABAP 7.40 Quick Reference

Fiori: technical installation and configuration of one app from A - Z