Adobe form XML parsing error due to invalid characters
This blog relates to the generation of an Adobe Form via transaction code CORRHIST in SAP HANA. Normally, selecting a document to print and outputting it to PDF as a spool file works without issue. In some instances, however, Adobe Document Services (ADS) encounters invalid characters that it cannot handle. When this happens an error is raised and the document is not produced.
The error in the above instance reads as “Error calling ADS: ADS: com.adobe.ProcessingException: com.adobe.ProcessingException: 0 : InvalidDataException: Xml parsing error: reference to invalid character number(error code 14), line 1, column 670 of file (200,1”
Most likely cause of this error
The most likely cause of this issue is invalid data entered by a person using copy and paste, where the copy picks up characters that they could not have typed into the screen. When pasted and saved, these characters enter the database. When the database field containing the invalid character is then used on the form and ADS tries to convert it, the failure occurs.
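To illustrate why the conversion fails: XML 1.0 does not allow most control characters, even when they are written as a numeric character reference such as &#16;. The following is a minimal Python sketch, not part of SAP or ADS. The element name LONG_DESC and the sample text are made up for illustration. It uses Python's standard expat-based parser, which reports error code 14 with the same wording as the ADS message. Whether ADS uses the same parser internally is an assumption on my part.

```python
import xml.etree.ElementTree as ET

# Hypothetical form data: a description field holding character 16 (DLE),
# written into the XML as the numeric character reference &#16;.
form_xml = "<data><LONG_DESC>Pump repair&#16;completed</LONG_DESC></data>"


def try_parse(xml_text):
    """Return None if the text parses cleanly, otherwise the parser error."""
    try:
        ET.fromstring(xml_text)
        return None
    except ET.ParseError as err:
        return err


error = try_parse(form_xml)
if error is not None:
    line, column = error.position
    # error.code is the numeric expat error code; str(error) is its message.
    print(f"error code {error.code}: {error}")
    print(f"reported at line {line}, column {column}")
```

The same text with the &#16; reference removed parses without error, which is why correcting the data lets the form print.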
Finding the problem example 1 – with variable analysis
The first thing to note about this type of error is that it is most likely data related, but the problem data is only picked up during the attempt to convert the document content to XML. The error therefore does not give you a specific line in the form interface to debug to, so you cannot go straight to the field with the dud data.
Instead, you must look at the whole unconverted data.
Step 1 is therefore to add a breakpoint in INCLUDE program LFPCOMPFRM at the point where it calls form CREATE_OUTPUT_CURRENT_FORM.
Execute the printing of the form from CORRHIST until it stops at this breakpoint.
Once at the breakpoint, double click on the field P_DOCUMENT to bring up the structure.
There are quite a few fields in here, but the one of interest is XFD, containing the unconverted data for the form. Double click on this field.
The initial display shows a hexadecimal string that is of little value to look at.
Change the view type to “VAR_XML XML Browser” view.
In this view, the text immediately below the pink section shows the form text up to the point where the first error occurs (some information has been redacted for privacy reasons). The last point shown gives you some indication of where in your document the problem exists. You can compare this with the layout of your document in the form editor of transaction EFRM.
The message in the pink section tells us the value of the character in error, as it says XMLChar value 16. In the ASCII table, decimal 16 is the Data Link Escape (DLE) control character, which is the character that cannot be converted.
To see where this invalid character falls within the actual text, the text must be expanded by changing the view type to “VAR_HTML HTML Browser”.
The HTML view expands the text past the point of the error. Looking at this you will see that SAP has replaced the erroneous character 16 with a little square.
Now looking back at your form design under transaction EFRM you should hopefully be able to identify what field would be representing this data. You can then debug the form interface and bring up the value in debug by placing a breakpoint at the point the field’s value is set.
In this case the field in the form interface was determined to be GV_LONG_DESC, and in debug you can see that its value contains a hash (#) character. This is another replacement character, since the debugger cannot display the actual value.
Since we already know the dud character is 16 representing a Data Link Escape character, this can now be found and fixed within the data through data maintenance, or specifically coded to be found and replaced while processing the field in the form interface.
Once fixed the form should print successfully.
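If you choose to find and replace such characters in code, the logic is a check against the character ranges that XML 1.0 permits. In a real form interface this would be written in ABAP; the Python sketch below only illustrates the logic. The function names and the sample value are hypothetical, and replacing with a space is just one possible choice.

```python
import re

# Characters that the XML 1.0 "Char" production does NOT allow:
# control characters other than tab (9), line feed (10) and carriage
# return (13), plus surrogates and the non-characters U+FFFE and U+FFFF.
INVALID_XML_CHARS = re.compile(
    r"[\x00-\x08\x0B\x0C\x0E-\x1F\uD800-\uDFFF\uFFFE\uFFFF]"
)


def find_invalid_xml_chars(text):
    """Return (offset, decimal code) for each character XML 1.0 forbids."""
    return [(m.start(), ord(m.group())) for m in INVALID_XML_CHARS.finditer(text)]


def replace_invalid_xml_chars(text, replacement=" "):
    """Return the text with every forbidden character replaced."""
    return INVALID_XML_CHARS.sub(replacement, text)


# Hypothetical field value containing character 16 (Data Link Escape).
long_desc = "Pump repair\x10completed"

print(find_invalid_xml_chars(long_desc))
print(replace_invalid_xml_chars(long_desc))
```

Reporting the offsets first, rather than replacing silently, makes it easier to hand the exact location back to whoever maintains the data.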
Finding the problem example 2 – with additional table field analysis
In the above example it was possible to determine the exact value of the dud character by looking at the pink section of text under “VAR_XML XML Browser” view. This identified the error character as character 16. However, the character itself could only be displayed as a little square box under “VAR_HTML HTML Browser” view, and as a hash (#) character when viewing the variable content in debug. This is because the debugger provides no way to convert the # character it displays into the actual ASCII/HEX character that the hash has substituted.
When the dud field occurs in a table, however, we have some further options for seeing what the actual value is.
In this example we follow exactly the same initial steps as in the first example: add a breakpoint in INCLUDE program LFPCOMPFRM at the point where it calls form CREATE_OUTPUT_CURRENT_FORM, and analyse the content of field P_DOCUMENT-XFD. In this example, the dud character is identified as character 18.
Once the view type is switched to “VAR_HTML HTML Browser”, the error data can be seen in this case as two little squares, indicating that there are actually two values in error.
Judging by the repeating nature of the content in which the error characters are displayed, it seems apparent that the error data sits in some table data. Checking the form layout under transaction EFRM it was determined this was in table GT_TIMESHEET, which was then able to be debugged through the form interface.
When viewed in ALV grid format the error shows as two square boxes in this case.
Because this error has occurred in a table we get the option to export this data. Exporting should be to a local file.
When saving the file, ensure that it is saved in unconverted format to preserve the error characters.
Next name the file with a .TXT file type and save to a location of your choice.
To view the saved file you need a viewer that can handle all ASCII/HEX characters and show you their actual ASCII/HEX values. For this example I have used Notepad++.
Once in Notepad++ the little boxes are no longer displayed. Instead the names of the control characters are shown, being in this case DC2 and DC3.
Looking at the ASCII table we can find DC2 and DC3 are device control characters 18 and 19. Looking back at the pink section error for this, we see it said the first character in error was 18, so we have a match.
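If you would rather not read the file by eye, the same check can be scripted. The following is a rough Python sketch, assuming the export was saved in a single-byte or UTF-8 encoding (a UTF-16 export would need decoding first). The sample bytes are made up to resemble a timesheet row and are not taken from the real data. Tab, line feed and carriage return are skipped because they are normal in an exported text file.

```python
from pathlib import Path

# Standard names of the ASCII control characters 0 to 31.
C0_NAMES = [
    "NUL", "SOH", "STX", "ETX", "EOT", "ENQ", "ACK", "BEL",
    "BS", "HT", "LF", "VT", "FF", "CR", "SO", "SI",
    "DLE", "DC1", "DC2", "DC3", "DC4", "NAK", "SYN", "ETB",
    "CAN", "EM", "SUB", "ESC", "FS", "GS", "RS", "US",
]

# Tab, line feed and carriage return are expected in a text export.
ALLOWED = {9, 10, 13}


def scan_control_bytes(data):
    """Return (offset, decimal, hex, name) for each unexpected control byte."""
    hits = []
    for offset, value in enumerate(data):
        if value < 32 and value not in ALLOWED:
            hits.append((offset, value, f"{value:02X}", C0_NAMES[value]))
    return hits


def scan_file(path):
    """Scan an exported file on disk, reading it as raw bytes."""
    return scan_control_bytes(Path(path).read_bytes())


# Hypothetical sample resembling one exported row.
sample = b"08:00\x1217:00\x13Standard hours\r\n"
for offset, decimal, hexcode, name in scan_control_bytes(sample):
    print(f"offset {offset}: decimal {decimal}, hex {hexcode}, {name}")
```

To scan a real export, call scan_file with the path of the .TXT file you saved.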
Neither of these device control characters should be in the data, and the business needs to correct them before the document can be printed.
This blog highlights one particular problem encountered when generating Adobe forms from transaction CORRHIST. The problem is simply that the document cannot be produced because Adobe Document Services (ADS) has encountered invalid characters that it cannot process.
The method I have shown to discover the invalid characters does require a level of understanding of debugging code, but if you already have this knowledge then by adding the initial breakpoint where I have shown, you should be quickly able to drill down on the data to find the problem characters.
I have illustrated this problem where the error occurs in a variable and then within a table. Both methods identify the first character causing the issue from the pink text shown when viewing the field P_DOCUMENT-XFD in debug under “VAR_XML XML Browser” view. The table method provides a greater level of analysis, in which you can identify the problem data more clearly using Notepad++.
If the ASCII/HEX character can be identified, it is possible to add code to your form interface to find and replace the character and thus avoid the failure of the form. However, you need to consider whether this is sensible or whether the source data should be corrected instead. This is a decision for your business to make.
Please follow my profile Stephen Lasham