HCI: XML to CSV conversion in HCI
Introduction
HCI provides functionality to convert between XML to CSV and vice versa. Compared to PI, its functionality is relatively rudimentary and can only cater for very simple structures.
The online documentation (Defining Converter) only covers the functionality briefly, and there is no other article on SCN covering it.
Therefore I tried experimenting with the functionality and this blog covers my experience doing so.
Component Details
As HCI is a cloud solution with automatic rolling updates, my testing is based on the following component versions of the tenant and Eclipse plugins.
HCI Tenant Version: 2.8.5
Eclipse Plugin Versions: Adapter 2.11.1, Designer 2.11.1, Operations 2.10.0
Example Scenarios
The following is mentioned in the online documentation, therefore I could only test out the following two scenarios.
You cannot use XML to CSV converter to convert complex XML files to CSV format.
For simplicity sake, the iFlows are designed with a timer to trigger the iFlow upon deployment and a Content Modifier to provide static input data to the converter. The output is then sent to a HTTP logging server.
Scenario 1 – Structure with single record type
In this scenario, the input payload is defined with a Records root node and an unbounded Line nodes.
Input Payload |
---|
<?xml version=’1.0′ encoding=’UTF-8′?> <Records> <Line> <Field1>ABC</Field1> <Field2>123</Field2> <Field3>XXX</Field3> <Field4>567890</Field4> </Line> <Line> <Field1>XYZ</Field1> <Field2>456</Field2> <Field3>YYYY</Field3> <Field4>98765</Field4> </Line> </Records> |
The data is contained in the repeating Line nodes. So, configuration of the converter is as simple as entering the XPath to the Line node, i.e. /Records/Line. The other options are specifying the field separator as well as the column names as header.
With this configuration, the conversion’s output payload is as follows.
Output Payload |
---|
Field1,Field2,Field3,Field4 ABC,123,XXX,567890 XYZ,456,YYYY,98765 |
Scenario 2 – Structure with header record type and repeating details record type
In this scenario, we have additionally a Header node.
Input Payload |
---|
<?xml version=’1.0′ encoding=’UTF-8′?> <Records> <Header> <FieldA>H_ABC</FieldA> <FieldB>H_123</FieldB> <FieldC>H_XXX</FieldC> <FieldD>H_567890</FieldD> </Header> <Line> <Field1>ABC</Field1> <Field2>123</Field2> <Field3>XXX</Field3> <Field4>567890</Field4> </Line> <Line> <Field1>XYZ</Field1> <Field2>456</Field2> <Field3>YYYY</Field3> <Field4>98765</Field4> </Line> </Records> |
In addition to the configuration above, we can configure the conversion of the “parent” element in the Advanced tab. The configuration is as simple as selecting Include Parent Element and specifying the XPath to the Header node.
With this additional configuration, the conversion’s output payload is as follows.
Output Payload |
---|
FieldA,FieldB,FieldC,FieldD H_ABC,H_123,H_XXX,H_567890 Field1,Field2,Field3,Field4 ABC,123,XXX,567890 XYZ,456,YYYY,98765 |
A particular point of interest is that the converter automatically includes an additional blank line in between the header line and the detail lines.
Additional Findings/Issues
Besides this simple conversions, during my testing of the function, I’ve come across the following issues.
i) Missing enclosure of fields that contain separator
According to RFC 4180 – Common Format and MIME Type for Comma-Separated Values (CSV) Files:-
Fields containing line breaks (CRLF), double quotes, and commas should be enclosed in double-quotes.
However, the converter does not handle this properly. As shown below, Field1 contains a comma. However in the output, this field is no enclosed with double quotes. As such, it will potentially cause issue for applications that try to process the CSV content.
Input | Output |
---|---|
<?xml version=’1.0′ encoding=’UTF-8′?> <Records> <Line> <Field1>AB,C</Field1> <Field2>123</Field2> <Field3>XXX</Field3> <Field4>567890</Field4> </Line> <Line> <Field1>XYZ</Field1> <Field2>456</Field2> <Field3>YYYY</Field3> <Field4>98765</Field4> </Line> </Records> |
Field1,Field2,Field3,Field4 AB,C,123,XXX,567890 XYZ,456,YYYY,98765 |
ii) Include Parent Element setting still valid even after being unchecked
If Include Parent Element is checked and Path to Parent Element is populated (as shown in Scenario 2’s screenshot), even if the setting was unchecked later, the converter still performs conversion for the parent element. The workaround for this is to ensure that the Path to Parent Element is cleared off prior to unchecking Include Parent Element.
Further Points
Although the scope of this blog is on the XML to CSV converter, I also tried out the CSV to XML converter functionality. However, I was unable to get it to work successfully. Again, the example on the online documentation is quite vague and there were no other materials on SCN to assist.
Following is a configuration of the CSV to XML converter that was being tested.
However, during runtime, the following error is triggered. I’ve tried various values for Path to Target Element in XSD but none was successful.
java.lang.IllegalStateException: Element name [DT_HCI_Conversion\Line] not found in provided XML schema file
This is similar to the error in the following thread which also remains unanswered.
iflow in HCI having error at csv to xml converter step
Conclusion
Although HCI comes with built-in functionality for XML to CSV conversion (and vice versa), the functionality is still very limited and buggy. The use case for the converter is restricted to just simple scenarios. As such, until this functionality is enhanced in future updates, more complex conversions would most likely require custom development in the form of custom Groovy scripts.
Hello Eng Swee,
Thanks for the Nice blog ! Even I have noticed lot of improvement points to get the Converter more usable [ like how to handle CSV with Header and Footer ]
I have replied to iflow in HCI having error at csv to xml converter step
If you can paste your required target XML we can debug the issue which you are facing for CSV to XML conversion.
Thanks and Regards,
Sriprasad Shivaram Bhat
Hi Sriprasad
Thank you for your comment.
Below is the schema that I used along with the conversion parameters above.
All the rest of the parameters are similar to your reply in the thread. I've tried all different values for "Path to Target Element in XSD" (backslash, forward slash, etc) but it didn't work.
Unfortunately, I no longer have access to the trial client to test this again. Maybe you can try it out and post some screenshots of a positive conversion result.
Rgds
Eng Swee
Hi Siva,
any update on the issue mentioned?? even i am trying csv to XML which is not successful 🙁
-Madhav
Hello Madhava.
Below post will give more insight to you.
https://archive.sap.com/discussions/thread/3840352
And I think its better if you can create new post with your question instead of asking the query in comment section of the blogs.
Regards,
Sriprasad Shivaram Bhat
Hello Eng Swee,
Excellent blog! It has helped me a lot!
I have a question, when you add a customize header to the .csv file, Is there a way to remove the additional blanking line between header and elements?
Regards,
Hello im new in HCI, we are developing an integration and the output must not have a field separator, is that possible?
I am pretty sure you cant ... but you could use the conversion , and then have a java script to say replace the pipeline line - just a suggestion!
Hello Eng Swee,
Congratulations on your publication, it is very helpful.
When I did a conversion based on your blog, I encountered a problem: after the XML to CSV converter, I got an error: Unmarshalling XML failed.
To resolve this error, I removed the header:
<? xml version = ‘1.0’ encoding = ‘UTF-8’?>
After that, it worked perfectly.
Regards,
Ranier Monteiro
Hi Eng
Thanks for the blog.
I have a requirement to pass a CSV with entity ( I mean Header and Detail in the same CSV) and Mapp it to XML.
Is it possible in HCI ?
Best Regards
Cesar
Hi eng swee,
XPATH expression should be absolute only? I tried giving //Line but I got only blank output
Thanks,
Dinesh