ExcelTransformBean Part 1: Convert various Excel formats to simple XML easily
Update 2 Oct 2018: Now available in CPI as well.
Update 10 May 2016: Add new optional parameters headerRow and onlyValidCharsInXMLName (courtesy of Jacob Frank Vandborg).
Update 7 Aug 2015: Add new parameter columnOffset.
Update 25 Mar 2015: ExcelTransformBean has been refactored to be part of FormatConversionBean. Parameter conversionType replaced with converterClass. Parameter indentXML replaced by indentFactor.
Update 11 Feb 2015: Updated source code links to shared module repository on GitHub. Previous GitHub repository will be deleted.
Update 27 Jan 2015: Updated Source code section to indicate EAR deployment file released for download at GitHub repository.
Update 3 Nov 2014: Refactoring of source code to cater for SimpleExcel2XML and SimpleXML2Excel conversions.
Introduction
There are already a handful of blogs on SCN dealing with Excel to XML conversion – so why another one??
Here is my wish list for a comprehensive solution:-
- Able to read all kinds of Excel format (XLS and XLSX)
- Behaves in a similar way as MessageTransformBean
- Highly configurable – develop/deploy once, use multiple times
- Able to handle XML special characters
In the reference section below are some of the more popular approaches, however below are some of the (non-exhaustive) limitations/drawbacks
- XLSX files stores string contents in a separate sharedStrings.xml file in the zipped XLSX file
- JExcel API does not support Excel 2007 XLSX formats
- Limitations in handling formulas, formatting and special characters in cells
ExcelTransformBean (ETB) aims to provide a generic adapter module solution (a la MessageTransformBean) that is highly configurable and reusable. It is based on the Apache POI API. Utilizing the combined SS interface of the API, it uses a single logic to read all kinds of Excel files (XLS and XLSX).
This first part covers Excel to simple XML conversion, while the second part covers simple XML to Excel conversion.
Source Code
Refer to following blog on location of source code and/or EAR deployment file.
FormatConversionBean – One Bean to rule them all!
This converter is based on Apache POI 3.9 library. In order for the Java project to compile and build successfully, the following JAR files need to be referenced/imported into the project.
- poi-3.9-20121203.jar
- poi-ooxml-3.9-20121203.jar
- poi-ooxml-schemas-3.9-20121203.jar
- xmlbeans-2.3.0.jar
- dom4j-1.6.1.jar
The library files can be downloaded from Apache’s website, direct link to the ZIP file is provided below.
Note: CPI version has been updated to Apache POI 3.17 – refer to FormatConversionBean arrives in CPI for details on library dependencies.
Usage of Module in Communication Channel
Module Processing Sequence
Number | Module Name | Type | Module Key |
---|---|---|---|
<Depending on position of module in chain> 1) Asynchronous scenario, Normally before the last module in channels 2) Synchronous scenario, Before last module to convert request payload After last module to convert response payload |
Custom_AF_Modules/FormatConversionBean | Local Enterprise Bean | <Any Arbitrary Value> |
Module Parameter Reference
Below is a list of the parameters for configuration of the module for Excel to XML conversion. Certain parameters will automatically inherit the default values if it is not configured.
Parameter Name | Allowed values | Default value | Remarks |
---|---|---|---|
converterClass | PI – com.equalize.xpi.af.modules.excel.Excel2XMLTransformer CPI – com.equalize.converter.core.Excel2XMLConverter |
Required field. Determines conversion class. | |
sheetName | The name of the active Excel sheet to extract. Either sheetName or sheetIndex must be populated. | ||
sheetIndex | Integer values beginning from 0 | The index of the active Excel sheet to extract (starts from 0). Either sheetName or sheetIndex must be populated. | |
skipEmptyRows | Y, N | Y | Empty rows to be skipped or not |
rowOffset | Integer values beginning from 1 | 0 | Starting row to begin extracting content from (i.e. 0 = start from first row, 1 = start from second row). If processFieldNames = ‘fromFile’ and rowOffset = 0, first line will be automatically be headerRow+1 |
headerRow | Integer values beginning from 1 | 0 | Available only when processFieldNames = ‘fromFile’. Determines which row to retrieve header column names from (0-based so 0 is first row, 1 is second row) |
columnOffset | Integer values beginning from 1 | 0 | Starting column to begin extracting content from (i.e. 0 = start from first column, 1 = start from second column) |
processFieldNames | fromFile, fromConfiguration, notAvailable | Required field. Determines the naming of each column of the rows, and the number of columns to extract:
|
|
fieldNames | Name of columns. Required field when processFieldNames = ‘fromConfiguration’ | ||
onlyValidCharsInXMLName | Y, N | N | Available only when processFieldNames = ‘fromFile’. Removes invalid XML characters from column names retrieved from header row of sheet |
columnCount | Integer values beginning from 1 | Number of columns for extraction. Required field when processFieldNames = ‘notAvailable’ | |
recordName | Record | XML element name for row of record in output | |
documentName | Required field. Document name of root element of XML output | ||
documentNamespace | Required field. Namespace of root element of XML output | ||
formatting | excel, raw | excel | Controls how the cell contents are formatted in XML output
|
evaluateFormulas | Y, N | Y | Controls how cell contents with formulas are displayed in XML output
|
emptyCellOutput | suppress, defaultValue | suppress | Controls how empty cells are displayed in XML output
|
emptyCellDefaultValue | <blank> | If emptyCellOutput = ‘defaultValue’, all empty cells will be populated with value in this parameter | |
indentFactor | Integer values beginning from 1 | 0 | Determines the number of indentation spaces for each level in the XML output |
debug | Y, N | N | Displays contents in Audit Log of each cell extracted. WARNING: Use this only for debugging in non-productive systems |
Example Scenarios
Here are some example scenarios of the behavior of the conversion based on different configuration options.
Scenario 1
Excel 2007 XSLX file format.
Extract Sheet1 with column names determined directly from header line of file.
Special character & automatically converted
Module parameters
Parameter Name | Parameter Value |
---|---|
converterClass | com.equalize.xpi.af.modules.excel.Excel2XMLTransformer |
sheetName | Sheet1 |
processFieldNames | fromFile |
documentName | MT_Order |
documentNamespace | urn:equalize:com |
Result
Scenario 2
Excel binary XLS file format.
Extract sheet at index 0. Column names are provided from configuration.
Row offset provided to skip first two lines.
No formatting of cells, so raw values displayed.
Module parameters
Parameter Name | Parameter Value |
---|---|
converterClass | com.equalize.xpi.af.modules.excel.Excel2XMLTransformer |
sheetIndex | 0 |
processFieldNames | fromConfiguration |
fieldNames | Order,Date,Material,Quantity |
rowOffset | 2 |
recordName | Line |
documentName | MT_CustomOrder |
documentNamespace | urn:equalize:com |
formatting | raw |
Result
Scenario 3
Excel 2007 XSLX file format.
Extract sheet at index 0.
Column names are not available. Number of columns = 5.
Row offset provided to skip first line.
Empty rows are included.
Cells with formula are displayed with formula instead of result.
Empty cells are displayed with default value “space”.
Module parameters
Parameter Name | Parameter Value |
---|---|
converterClass | com.equalize.xpi.af.modules.excel.Excel2XMLTransformer |
sheetIndex | 0 |
processFieldNames | notAvailable |
columnCount | 5 |
rowOffset | 1 |
documentName | MT_CustomOrder |
documentNamespace | urn:equalize:com |
skipEmptyRows | N |
evaluateFormulas | N |
emptyCellOutput | defaultValue |
emptyCellDefaultValue | space |
Result
Reference
Part 2 – ExcelTransformBean Part 2: Convert simple XML to various Excel formats easily
This article does not cover the steps for creating a custom adapter module. This can be found easily via SCN search. It is also listed in the reference section of the article below regarding adapter module testing.
Standalone testing of Adapter Module in NWDS
Alternative methods for Excel conversion
PI/XI: Reading MS Excel’s XLSX and XLSM files with standard PI modules – easily…
Excel Files – How to handle them in SAP XI/PI (The Alternatives)
Great!!! many thanks!!
Hello Eng Swee Yeoh,
thank you for this very interesting and useful adapter module. This will make life much easier dealing with EXCEL sheets without having to write adapter modules anymore.
Best regards,
Peter
Hi Eng Swee Yeoh,
Thanks for BLOG,
I am little confused at using "sheetIndex" value. As I have a requirement to read Excel which has 4 Tabs in it and I need to read first 2 Tabs data and Map it back to RFC.
And not clear on how to build Data types\Message Types for each Excel sheet fields.
Is this requirement is possible using this Example Scenario 2.I have already raised SCN query on this. It will be helpful for me if you respond to my query.
http://scn.sap.com/thread/3763584
also please let me know where to keep these jar files in PI ..
Thanks & Regards
Siri
Dear Siri
Thank you for your interest in this adapter module.
Unfortunately, the module as it is can only read 1 sheet in an Excel file. If you want to be able to read more, you can download the source code and further enhance it accordingly.
Regarding the JAR files, these are bundled in the adapter module's EAR file and are automatically deployed into the PI system once the EAR file is deployed. You do not need to manually upload them into PI's file system.
Rgds
Eng Swee
Dear Eng,
Please help with the below thread. Thanks.
https://scn.sap.com/thread/3765676
Hi Eng
I like this pose. But unfortunately I do not know how to use this. I downloaded the poi-bin-3.9-20121203 zip and gave it to our basis team to upload in the server. They said that it was uploaded successfully. But we cannot find the modules. Here below is the screen shot from Netweaver application module. Your help is greatly appreciated.
netweaver application module
Thanks
Natarajan
Hi Eng Swee Yeoh ,
Facing error in comm channel on implementing above steps. Please help. Many Thanks....!!
Implemented Scenario 1
Hi Rajesh,
it seems that you didn't deploy the module.
Please check this out.
Hi Alexey,
Could you please brief or elaborated. I didn't get you.
You mean we need to deploy the module(Custom_AF_Modules/FormatConversionBean) as followed in below link: I guess it involves basis work.
http://www.sdn.sap.com/irj/scn/go/portal/prtroot/docs/library/uuid/c0b39e65-981e-2b10-1c9c-fc3f8e6747fa?overridelayout=t…
Please correct me If I'm wrong.
Donwload the EAR file. Its in the blog aabout FormatConversionBean.
Then open NWDS and connect it to your system(Java As).
Open Deployment perspectinve - import EAR, and press start.
If successfull pop up window appeared then you can use the module in adapter/
Hi Alexey,
If you have the steps of procedure to work upon for FILE(EXCEL) to FILE(XML) scenario in PI7.0
Please share. Many Thanks.
Hi Rajesh,
please check in NWA, that the following module is successfully deployed in your system.
Hi Alexey,
I tried to reply to your earlier comment about the date formats but I think it was deleted before I managed to post the comment. Anyway here it is below if you still need it:-
-------------------
Thank you for the feedback. Glad to know that the modules were deployed successfully and works.
Regarding the Excel format, to be honest I haven't tested it extensively with different users and date formats.
The code for it uses the Apache POI DataFormatter as shown below. So that is the default behaviour.
com.equalize.xpi.af.modules/Excel2XMLTransformer.java at master · engswee/com.equalize.xpi.af.modules · GitHub
DataFormatter formatter = new DataFormatter(true);
cellContent = formatter.formatCellValue(cell);
Unfortunately, I am quite tied up at the moment and do not have time to look into this. As the source code is available publicly on GitHub, may I suggest that you try to modify the module and test if there is some Apache POI method that can cater for it.
You can build the module and perform standalone testing on it following my blogs below:-
Recompile com.equalize.xpi.af.modules as EJB 2.1 modules in NWDS 7.1x
Standalone testing of Adapter Module in NWDS
Rgds
Eng Swee
Hi Eng Swee,
I deleted the comment because I figured out that the problem was with Macros in the Excel form.
Alexey.
Hi Rajesh
Note that the module and EAR file are not compatible with PI 7.0. I'd suggest you search for other alternative solutions for that particular PI version.
Rgds
Eng Swee
Hi Eng Swee,
Thanks alot for the information.
Could you please suggest if any. I'm just strucked up with this scenario.
Can the below links be implemented in PI7.0:
http://scn.sap.com/docs/DOC-58278
You have some alternatives more:
PI/XI: Reading MS Excel's XLSX and XLSM files with standard PI modules - easily...
Hello Eng,
Your module has made the life easy. Thanks for that. I tried your module for an excel file with FTP adapter. It worked without any issues. My requirement is to use with an sender email adapter, where xlsx file will come as an attachment. I used PayloadZipBean, PayloadSwapBean, FormatConversionBean, XIMailAdapterBean in order with appropriate parameters. It processed the file content using the parameters given in configuration(i could see in audit log with debug on and off) but it fails in intergration engine due to the fact the converted content is still in the mailattachment-1 and not in the main mail message payload. I tried using one more payload swap bean after format conversion bean but it still fails in the integration engine. I can see that there is no problem in your module processing but just want to ensure have you faced any issues with mail adapter. Any help would be appreciated. Thanks.
Server: PI7.31 Dual Stack
Regards
M
Hi Manikandan Rajendran,
why do u use PayloadZipBean? its not necessary, is your attach in not zipped.
Please exclude from module chain and it should work.
Use first - localejbs/AF_Modules/PayloadSwapBean with
swap.keyName payload-name
swap.keyValue MailAttachment-1
Second - FormatConversionBean
configure it according your requirement
Third - XIMailAdapterBean - standard one
Thanks Alexey for your assistance here.
Manikandan, please follow Alexey's suggestions. You don't need to unzip, just need to swap the attachment to the main payload before using FormatConversionBean.
Rgds
Eng Swee
Hi Eng,
Thanks for presenting the module.
I have a interface Sender mail adapter containing .xslx file in attachment.I need to convert the .xslx file to XML and then use it.
I am using the module in the below sequence..as in the screenshot.
For the PayloadSwapBean the params are:
swap.keyName =Payload-Name
swap.keyValue1
But the problem I am getting is the payload still is in .xslx format.
This is why the mapping is failing.
But astonisingly if I dowload the MailAttachment-1.xslx file and see that,it opens in notepad++, and that is the converted xml.
So the TransformationBean works but my purpose is not solved.
Please help me with suggestion.
Hi Abhishek
Thanks for your interest in this development.
As you are relatively new to SCN, I'd suggest that you post your query as a discussion thread. It is the appropriate place as the comments section of blogs are not suitable for detailed discussions to tackle specific issues.
Just as a note, I suspect that the PayloadSwapBean might not be working correctly. Can you please provide screenshot of the full configuration parameters for all modules? Also provide screenshot of the audit/message log showing the execution of the modules.
Regards
Eng Swee
Hi Eng, Opened a new discussion Added Parameter for PayloadSwapBean.
Hi Eng,
Thanks for this blog and assistance in using this module.
I am able to successfully test this as a standalone module in NWDS.
However after deploying it to our PI 7.4 system, I get an error:
Error: java.lang.IllegalArgumentException: Your InputStream was neither an OLE2 stream, nor an OOXML stream
Do you perhaps know how I would go about fixing this error?
Thanks,
Miguel
Hi Miguel
From the first line of your log, it seems there is already a conversion to XML. Is this done by some standard FCC or MTB?
Please note that this module expects an Excel file in either binary XLS format or the newer XLSX 2007 format (refer Apache POI - the Java API for Microsoft Documents). The error you are getting is due to the input not being in either of these formats.
Rgds
Eng Swee
Hi Eng,
Thanks for the reply.
I have tried it with both formats and it fails. I then tried the module with a stand-alone test on my local machine through the NWDS and the process worked. This locally working file, is the file that I am trying to test through PI now and getting this error message.
Thanks,
Miguel
Hi Miguel
Can you explain what is happening in your channel that causes the file to be converted to XML as shown in the first line (highlighted below)? Are you using File Content Conversion in the channel? Can you share your channel's module configuration screenshot?
Rgds
Eng Swee
Hi Eng,
Yes, I had originally configured the channel for FCC. As soon as I removed the FCC, it worked perfectly.
Thanks very much for your help.
Thanks,
Miguel
Hi Eng,
I have a scenario from MAIL to IDOC where excel file comes as an attachment. This excel sheet contains two worksheets one has header and another has the item level details.
Using the above i was able convert the excel to xml but only a single sheet at a time .Could you please let me know how can i use this to read two worksheet from the excel at the same time.
Thanks,
Shradha.
Hi Shradha
The design of this converter is to read one sheet only. If you want to read more than one sheet you can enhance it according to your needs. The source code is publicly available as described in the Source Code section above.
Regards
Eng Swee
Hi Eng,
I am trying to implement excel to xml conversion from your below blog -
ExcelTransformBean Part 1: Convert various Excel ...
But After deploying it in my java stack I am getting below error and its deployed with warning as well -
Error while obtaining business methods info for beansap.com/Excel2XMLTransformerEAR*xml|Excel2XMLTransformerEJB.jar*xml|Excel2XMLTransformer. Most probable reason: application sap.com/Excel2XMLTransformerEAR cannot be started. See logs available for this application.
These are the steps I have followed. I think I have done something wrong. Can you please help me -
1. Created EJB 3.0 project with name Excel2XMLTransformerEJB and added to EAR - Excel2XMLTransformerEAR
2. Added all required jars in ejb project build path.
3. In EJB project created one session bean (EJB 3.x) with name Excel2XMLTransformer.
Created normal class with below names -
AbstractModuleConverter
AuditLogHelper
DynamicConfigurationHelper
ParameterHelper
4. Build the EJB project
5. My ejb-j2ee-engine.xml value -
6. My ejb-jar.xml value -
7. Export as SAP EJB JAR File
8. In EAR application-j2ee-engine.xml value -
9. Build and export as SAP EAR File.
10. Finally Run on server. its deployed with warning.
Can you please help me with the correct and exact steps. Its very urgent
Thanks in advance for you help.
Regards,
Apu
Unless you are asking for clarification/correction of some part of the Document, please create a new Discussion marked as a Question. The Comments section of a Blog (or Document) is not the right vehicle for asking questions as the results are not easily searchable. Once your issue is solved, a Discussion with the solution (and marked with Correct Answer) makes the results visible to others experiencing a similar problem. If a blog or document is related, put in a link. Read the Getting Started documents (link at the top right) including the Rules of Engagement.
NOTE: Getting the link is easy enough for both the author and Blog. Simply MouseOver the item, Right Click, and select Copy Shortcut. Paste it into your Discussion. You can also click on the url after pasting. Click on the A to expand the options and select T (on the right) to Auto-Title the url.
Hi Michael,
Already I have started an discussion but not sure how can I directly reach to Eng, so pasted same here .
Here is the discussion link - Error in ExcelTransform Bean
Thanks,
Apu
Right click on Eng's name link. Copy the url. Paste it into your Discussion. Mouse over the link and click on the A popup, then select T for auto title.
That will bring it to Eng's attention.
Cheers, Mike (Moderator)
SAP Technology RIG
Thanks Mike for helping to moderate in the PI space 🙂
HI Eng,
Kindly help to check the following error while reading excel file. Current version PI 7.4
Error :
Hi Eng Swee,
Good day!
Thanks for sharing this really helpful blog!
Just a question, is there a capability under this bean that will allow to split a single excel xslx file with 2 sheets into 2 separate xlsx/csv files? 😉 No transformation required, just need to split this single file. Do I just convert the xlsx file into an XML and apply the multi-mapping before converting it back into xls? 😀
Cheers,
Madina
Hello,
Successfully deployed the EAR file. Version of PI is 7.4 SP 8.
I am getting the below error:
Below are the configuration details in the Communication Channel.
Hope you can help me.
Kr,
Bryan
I suggest you perform standalone testing in NWDS so that you can debug your error. You can check one of my other blogs which details more on that.
it seems like can't add xml.recordsetName & xml.<StructureName>. just like the standard FCC,since I got a deep structure with multiple nodes,is there any solution deal with this?
BR.
wyz
Hi Eng Swee Yeoh,
thank you good blog,
We are given xls format in receiver communication channel but the output file data is asking like XML file format, how to default come us xls format can u have any idea
please find below screenshots
for those popups, we don't want how to directly come
Hi Eng Swee Yeoh,
I have implemented this excel transformation bean. I would like to check is there anybody implemented the dynamic sheet name in this sheet name parameter.
As per my requirement, I would like to pass the APR-19(month name and two digit year). Any inputs on this requirements.
Hi Rajani
Module parameters in PI are always static values - so you won't be able to pass dynamic values to sheetName parameter.
Regards
Eng Swee
Hi Eng Swee Yeoh,
Thanks to the ear file you have provided, i was able to deploy com.equalize.xpi.af.modules.app.ear using NWDS.
The version of PI is 7.5 and i am using the Excel to xml transformation,
i configured the parameters in the sender CC as detailed below and moved on for testing,
i am getting the error as attached in the screen shot, can you please guide me to resolve this
appreciate your response Thanks in advance
Karthik G
Swap the sequence of the module.
Thanks for your response, i have swapped and it worked.
Really appreciate your work, thanks for this module 🙂
Regards,
Karthik G
Hi Eng Swee ,
I am using Custom_AF_Modules/FormatConversionBean module in SAP PO to convert excel to xml file.
Excel have 5000-10000 records.
But it's failing at channel level with below error.
Sender Channel Configuration:
Channel Config
PO Issue
Appreciate your response Thanks in advance.
Thanks,
Kuldeep Kumar
Hi Eng Swee
Hope you are doing good, i am also facing similar kind of issue but when i open that excel, select the cell and optimize and save and run the communication channel it is accepting the file and able to read the contents of the file, unable to understand this strange behaviour.
let me explain my scenario.
Problem : i used the configuration ExcelTransformBean Part 1: Convert various Excel formats to simple XML easily however it is failing stating that nested exception at index .......
but the same excel if i open it select columns, optimize, save it. Then if i run the Communication Channel it is getting converted.
If you manage some time can you please have a look at it and provide your suggestions.
Thanks in Advance
Karthik G
Hi Eng, Team,
I have uploaded all the JAR into the CPI and trying to extract excel file but getting error as below
Error:
You need the correct version of Apache POI as listed in the release notes
https://github.com/engswee/equalize-cpi-converter/releases
Hi Eng Swee,
Below links are not working. so I have downloaded latest version of this files from another website. But still I am getting the error "java.lang.NoSuchMethodError: org.apache.poi.ss.usermodel.Cell.getCellTypeEnum()Lorg/apache/poi/ss/usermodel/CellType;" Can you please confirm what will be the issue.
I have used poi-ooxml lite instead of xml-schemas as it thrown error.
http://central.maven.org/maven2/org/apache/poi/poi/3.17/poi-3.17.jar
http://central.maven.org/maven2/org/apache/poi/poi-ooxml/3.17/poi-ooxml-3.17.jar
http://central.maven.org/maven2/org/apache/poi/poi-ooxml-schemas/3.17/poi-ooxml-schemas-3.17.jar
http://central.maven.org/maven2/org/apache/xmlbeans/xmlbeans/2.6.0/xmlbeans-2.6.0.jar
http://central.maven.org/maven2/org/apache/commons/commons-collections4/4.1/commons-collections4-4.1.jar
I had a working scenario:
An automated mail with an excel (xls) attachment to iDoc. I changed the login of the sender mail communication channel from basic authentication to OAuth authentication. Now I get the errors:
failed to call the adapter engine; caused by java.lang.IllegalArgumentException: Your InputStream was neither an OLE2 stream, nor an OOXML stream
Exception caught during processing mail message [0]com.sap.aii.af.lib.mp.module.ModuleException: Your InputStream was neither an OLE2 stream, nor an OOXML stream
Any idea?
Regards Hans
Hello Hans,
The exception you mention usually occurs when no excel files are present in the fetched mail ( or the module is looking in the wrong place ) , so that's where you could look.
For that, I think Note 3137755 might be relevant.
Have you tried to log your incoming message before the conversion step to verify if the reference is still correct?
Regards Emmanuel
Eng Swee Yeoh great blog.
If we want to receive an Excel sheet over an Http Sender channel, what is the expected format of the request body? Binary? Base64?
Thanks!
C.