Thinking in XSLT: Filtering XML elements
I rarely miss an opportunity to mention how much I like XSLT 😀 Of the three mapping options provided by SAP Cloud Platform Integration, it is by far my favourite one. There is one downside to the language, though, and that is its fairly steep learning curve.
XSLT is declarative by nature (like SQL for relational databases and Cypher for graphs); its approach is what, not how. At the core of that is writing templates to match nodes in the source document, and letting the XSLT processor take it from there.
This approach can take a bit of getting used to, though – especially if you have an imperative programming background. In this blog post series, I’ll give you some pointers on how to start thinking in XSLT. Fair warning: Posts will most likely be published at very irregular intervals 🙂
In this first installment, I’ll be taking a closer look at an XSLT FAQ: How do I filter out elements from my XML document?
The answer to this one is actually quite straightforward. There are two steps to it:
- Start with the identity transformation
- Then create one or more empty templates matching the elements to be filtered out
The identity transformation is a stylesheet whose XML output is identical to its XML input. This is what it looks like in XSLT 3.0 (the XSLT version supported by CPI):
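As a minimal sketch, assuming nothing beyond the standard XSLT namespace, the identity transformation can be written like this:

```xml
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Nodes not matched by any template are shallow-copied to the output -->
  <xsl:mode on-no-match="shallow-copy"/>

</xsl:stylesheet>
```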
That doesn’t sound very exciting, and indeed it wouldn’t be, if not for that on-no-match attribute. The meaning of on-no-match="shallow-copy" is that every input node will be copied to the output unchanged, unless it is matched by a template in the stylesheet.
Combine this with the fact that an empty template generates no output, and we have the tools we need to filter out elements from an XML document.
Let’s try this in practice. Here’s a sample document:
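A document along the following lines would do. The lines and line element names and the number attribute are the ones used in the rest of this post; the text content and the number of lines are purely illustrative assumptions:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative sample only: the text content is made up -->
<lines>
  <line number="1">First line</line>
  <line number="2">Second line</line>
  <line number="3">Third line</line>
  <line number="4">Fourth line</line>
</lines>
```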
Let’s say I want to remove all the odd-numbered lines. To do so, I’ll write an empty template matching the <line> elements, where the number attribute is odd. It looks like this:
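One way to express "the number attribute is odd" is a predicate using the mod operator. This is a sketch of such a template, and other formulations of the predicate would work equally well:

```xml
<!-- Empty template: matching <line> elements produce no output.
     The predicate is true when @number leaves a remainder of 1 when divided by 2. -->
<xsl:template match="line[@number mod 2 = 1]"/>
```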
Combining this template with the identity transformation, we arrive at the following stylesheet:
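Sketched out in full, and assuming the mod-based predicate as the test for odd numbers, the combined stylesheet could look like this:

```xml
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Serialize the result as indented XML -->
  <xsl:output method="xml" indent="yes"/>

  <!-- Identity transformation: copy everything that is not matched below -->
  <xsl:mode on-no-match="shallow-copy"/>

  <!-- Filter: odd-numbered <line> elements generate no output -->
  <xsl:template match="line[@number mod 2 = 1]"/>

</xsl:stylesheet>
```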
(I also added the xsl:output element, which specifies that I’m generating XML and that I would like the result to be nicely indented.)
Here’s the result of transforming our sample document:
The odd-numbered lines are indeed gone, but what’s up with those extra newlines? Well, they’re not really extra at all; they’ve been there all along. The template removed the odd-numbered <line> elements, but not any surrounding whitespace.
Most of the time, that whitespace is not going to impact the processing of the result, but it does sort of ruin that nice indentation. To remove it, I’ll add the following top-level element to the stylesheet:
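A sketch of that declaration, assuming the whitespace to be removed sits directly inside the lines element:

```xml
<!-- Drop whitespace-only text nodes that are children of <lines> -->
<xsl:strip-space elements="lines"/>
```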
The xsl:strip-space element instructs the XSLT processor to strip whitespace-only text nodes from inside the <lines> element. Now, the stylesheet produces the following output:
No odd-numbered lines and no superfluous newlines. We’re done here.
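For reference, the pieces discussed in this post fit together roughly as follows. This is a sketch; the mod-based predicate and the order of the top-level declarations are my own choices for illustration, and the order does not affect the result:

```xml
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Serialize the result as indented XML -->
  <xsl:output method="xml" indent="yes"/>

  <!-- Remove whitespace-only text nodes inside <lines> before processing -->
  <xsl:strip-space elements="lines"/>

  <!-- Identity transformation: copy everything that is not matched below -->
  <xsl:mode on-no-match="shallow-copy"/>

  <!-- Filter: odd-numbered <line> elements generate no output -->
  <xsl:template match="line[@number mod 2 = 1]"/>

</xsl:stylesheet>
```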
And that’s it for the first post. If there is a particular XSLT topic you’d like me to cover, please let me know in the comments below. I also welcome any and all feedback.
Until next time!