XML Processing in ABAP – Part 7: The Power of XSLT 2.0
In parts one to six of this weblog series I mostly discussed simple transformations. Now it is time to write about the most powerful XML transformation technique in ABAP: XSLT, which is integrated into ABAP through the CALL TRANSFORMATION command.
XSLT 2.0 – What’s new?
Some problems are very difficult to solve in XSLT 1.0; think of grouping, for example. Some tasks are even intractable:
- You can’t access node sets stored in variables via XPath because they are result tree fragments.
- An XSLT program can only create one output document. To create multiple documents you have to do postprocessing or apply several transformations, but this can be very slow because the DOM tree of the document is built several times.
To overcome these difficulties we use XSLT 2.0. In one of my recent projects the transformations were supposed to run under Java on a non-SAP platform, but later it became clear that they had to work in ABAP, too. So we had to develop them in such a way that they could be migrated to the ABAP XSLT processor with only a few changes. This turned out to be a challenging task. So let’s start with the differences between XSLT 1.0 and XSLT 2.0.
When you start programming you will notice that XSLT 2.0 is strongly typed; in fact you even have node typing via W3C Schema integration. XSLT 2.0 is based on sequences. Sequences are similar to node sets, but they are ordered and allow duplicates. If you select nodes with an XPath 2.0 expression you get a sequence of all matching nodes, whereas XSLT 1.0 often uses only the first one (for example when a node set is converted to a string). XPath 2.0 offers many new functions, including regular expression support, and XSLT 2.0 adds even more text-processing support, although in my opinion STX has better text-processing functions from a conceptual point of view. If you look at these features of XSLT 2.0 you will soon notice that they are not supported in ABAP. So let’s have a look at the XSLT 2.0 features we can use under ABAP.
XSLT 2.0 Compliance
When the XSLT processor was implemented, the XSLT 2.0 specification was still under discussion. For this reason, the version that was implemented is a W3C Working Draft from 2002. Nevertheless, quite a few XSLT 2.0 features are available. Let me mention a few:
- We can define XPath functions.
- XPath contains an if-then-else construct.
- Grouping is supported with the xsl:for-each-group command.
- We have temporary trees.
- We have multiple result documents.
I will give an example for each of these features except multiple result documents. If you are interested in a deeper investigation I suggest reading my SAP Heft XML-Datenaustausch in ABAP. The English version is coming soon.
An Example: Normalization of Test Data
The following transformation performs a kind of normalization of an XML document. I used it to postprocess asXML documents (serialized ABAP data) that I had to store in a file system as test cases. Unfortunately, those data contain unique identifiers in ID elements, which I had to map to consecutive integers according to their lexicographic order. This demonstrates the power of the XSLT 2.0 features, because the task would be much more difficult to solve in XSLT 1.0.
If you look at this program you will notice that it might not be the best solution. But I chose it for several reasons: it is quite easy to understand, it contains important XSLT 2.0 techniques you can use in ABAP, and last but not least, there is always more than one way to do it in XSLT. In fact I will present a better solution for this problem at the end of this blog.
The transformation is quite simple: there is one template that matches all nodes and copies the content of each one. Then there is a second template only for elements called ID. Within this template I calculate a number for each alphanumeric ID. In the following sections I will show you how it works.
At the beginning of the transformation I copy the elements ID into a variable, sort these elements and delete adjacent duplicates. I will show this in detail in a later section.
We use this variable in the template that matches only elements called ID. In this template we evaluate the content and assign a number to it using the following XPath expression:
We count all ID elements whose text content is smaller than that of the element currently being processed. We stored those elements in a variable for reasons of speed, but I will come back to that point later. Please note that this XPath expression is not possible in XPath 1.0, because there a variable holding a result tree fragment cannot be accessed via XPath. And we have to use XSLT 2.0 to define our own XPath functions. This is what the following section is about.
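The counting idea can be sketched outside XSLT. The following Python snippet is only an illustration of the logic, not the original XPath expression; the function name `rank_by_counting`, the sample IDs, and the choice to start numbering at 0 are assumptions.

```python
def rank_by_counting(ids):
    """Map each ID to the number of distinct IDs that sort before it."""
    distinct = set(ids)  # duplicates must not be counted twice
    return {
        current: sum(1 for other in distinct if other < current)
        for current in distinct
    }

# Hypothetical sample data: "A1" occurs twice but gets a single number.
mapping = rank_by_counting(["C3", "A1", "B2", "A1"])
```

Note that every ID is compared with every other ID, so the effort grows quadratically with the number of distinct IDs.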
User defined XPath Functions
In XSLT 2.0 we can compare two strings lexicographically, but we can do so neither in XSLT 1.0 nor in the ABAP XSLT processor. So we have to define an XPath function similar to strcmp (just remember C) on our own. In fact, user-defined functions are a great benefit to XSLT! We define this function util:cmp in its own namespace:
The function works recursively and uses the proprietary function sap:find-first() to assign a number to an alphanumeric character. Of course, the list of alphanumeric characters in the variable alphabet is far from complete, so it might be better to solve this comparison using ABAP integration. To stop the recursion we need the sap:if command, which differs from the corresponding command in the current XSLT 2.0 specification.
Please note that the syntax above for user-defined functions differs from the current XSLT 2.0 version.
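To make the recursion easier to follow, here is a minimal Python sketch of a strcmp-like comparison driven by an alphabet string. It is not the util:cmp function itself; the contents of `ALPHABET`, the name `cmp_alnum`, and the return values -1, 0 and 1 are assumptions chosen for illustration.

```python
# Assumed alphabet: the position of a character defines its sort weight.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def cmp_alnum(a, b):
    """Return -1, 0 or 1, comparing a and b character by character."""
    if not a and not b:
        return 0            # both strings exhausted: equal
    if not a:
        return -1           # a is a proper prefix of b
    if not b:
        return 1            # b is a proper prefix of a
    # str.find returns -1 for characters missing from the alphabet,
    # so such characters sort before all known ones.
    weight_a = ALPHABET.find(a[0])
    weight_b = ALPHABET.find(b[0])
    if weight_a != weight_b:
        return -1 if weight_a < weight_b else 1
    return cmp_alnum(a[1:], b[1:])  # first characters match: recurse
```

The sketch shares the weakness mentioned above: any character outside the alphabet is treated the same way, so two different unknown characters compare as equal.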
Good Bye Muenchian Grouping
There are people who claim that the only way to learn XSLT is to study special techniques like Muenchian grouping or Oliver Becker’s intersection method. This may be right but why should we choose the hard way if there is a simple one?
In the SDN blog "Grouping XML with XSLT – From Muenchian Method To XSLT 2.0" you can read how to group with XSLT 2.0. ABAP offers the same statement, but some features are not supported. For instance, there is no function current-grouping-key(), which gives access to the grouping key within the template body.
In our example we group the ID elements, sort them according to their value and delete adjacent duplicates with a second xsl:for-each-group command:
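The effect of sorting and then deleting adjacent duplicates can be illustrated with a few lines of Python. This is a sketch of the idea only, not of the xsl:for-each-group syntax; the function name `sorted_distinct` and the sample values are assumptions.

```python
from itertools import groupby

def sorted_distinct(ids):
    """Sort the IDs, then keep one representative per run of equal values."""
    # After sorting, equal values are adjacent, so each group yields one key.
    return [key for key, _group in groupby(sorted(ids))]

# Hypothetical sample data with one duplicate.
unique_ids = sorted_distinct(["C3", "A1", "B2", "A1"])
```

Sorting first is what makes the adjacent-duplicate step sufficient: without it, equal values could be separated and would survive.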
We can do better!
I already mentioned that the solution above is far from good. Let me tell you why. In fact we don’t need to define an XPath function for string comparison because we have already stored a sorted list of IDs in a variable. To map each ID to a consecutive number we can use XSLT 2.0:
Here we use the fact that the ID elements in the variable are sorted. We can use XPath to query the number of preceding elements for a given ID value. Here is the complete example:
With this approach we don’t need to call a compare function that supports only a constant number of alphanumeric characters. Moreover, our transformation will be much faster.
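The improved approach can also be sketched in Python with the standard xml.etree.ElementTree module. This is an illustration of the idea, not the ABAP transformation: the function name `normalize_ids`, the sample document, and the numbering from 0 (the number of preceding elements in the sorted list) are assumptions, and every ID element is assumed to have non-empty text.

```python
import xml.etree.ElementTree as ET

def normalize_ids(xml_text):
    """Replace each ID value by its position in the sorted list of distinct IDs."""
    root = ET.fromstring(xml_text)
    id_elements = list(root.iter("ID"))
    # Sorted, duplicate-free list of ID values (assumes non-empty text).
    ordered = sorted({element.text for element in id_elements})
    # Position in the sorted list = number of preceding distinct IDs.
    position = {value: index for index, value in enumerate(ordered)}
    for element in id_elements:
        element.text = str(position[element.text])
    return ET.tostring(root, encoding="unicode")

# Hypothetical test document; everything except ID content is copied unchanged.
sample = ("<DATA><ITEM><ID>C3</ID></ITEM>"
          "<ITEM><ID>A1</ID></ITEM>"
          "<ITEM><ID>C3</ID></ITEM></DATA>")
normalized = normalize_ids(sample)
```

Because the position of each value is looked up in a table built once, no pairwise string comparison is needed while the document is rewritten.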
This was an introduction to the XSLT 2.0 features of the ABAP XSLT processor. I recommend using them!