Thinking in XSLT: Filtering XML elements
I rarely miss an opportunity to mention how much I like XSLT 😀 Of the three mapping options provided by SAP Cloud Platform Integration, it is by far my favourite one. There is one downside to the language, though, and that is its fairly steep learning curve.
XSLT is declarative by nature (like SQL for relational databases and Cypher for graphs): you describe what you want, not how to compute it. At the core of that is writing templates to match nodes in the source document, and letting the XSLT processor take it from there.
This approach can take a bit of getting used to, though – especially if you have an imperative programming background. In this blog post series, I’ll give you some pointers on how to start thinking in XSLT. Fair warning: Posts will most likely be published at very irregular intervals 🙂
In this first installment, I’ll be taking a closer look at an XSLT FAQ: How do I filter out elements from my XML document?
The answer to this one is actually quite straightforward. There are two steps to it:
- Start with the identity transformation
- Then create one or more empty templates matching the elements to be filtered out
The identity transformation is a stylesheet that generates XML output identical to its XML input. This is what it looks like in XSLT 3.0 (which is the XSLT version supported by CPI):
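A minimal sketch of such an identity stylesheet, assuming nothing beyond the standard XSLT namespace and the xsl:mode declaration discussed here:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Any node not matched by a template is copied, and processing continues with its children -->
  <xsl:mode on-no-match="shallow-copy"/>

</xsl:stylesheet>
```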
That doesn’t sound very exciting, and indeed it wouldn’t be, if not for that on-no-match attribute. The meaning of on-no-match="shallow-copy" is that every input node will be copied to the output unchanged, unless it is matched by a template in the stylesheet.
Combine this with the fact that an empty template generates no output, and we have the tools we need to filter out elements from an XML document.
Let’s try this in practice. Here’s a sample document:
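For illustration, assume a document along these lines. Only the lines and line element names and the number attribute are taken from the text; the number of lines and their text content are made up for this sketch:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical sample data: line count and text content are assumptions -->
<lines>
  <line number="1">Line one</line>
  <line number="2">Line two</line>
  <line number="3">Line three</line>
  <line number="4">Line four</line>
  <line number="5">Line five</line>
</lines>
```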
Let’s say I want to remove all the odd-numbered lines. To do so, I’ll write an empty template matching the <line> elements whose number attribute is odd. It looks like this:
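A sketch of such a template, assuming the predicate @number mod 2 = 1 as the test for an odd value. It is a fragment: it belongs inside an xsl:stylesheet element in which the xsl prefix is bound to the XSLT namespace.

```xml
<!-- Empty template: matches odd-numbered <line> elements and generates no output for them -->
<xsl:template match="line[@number mod 2 = 1]"/>
```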
Combining this template with the identity transformation, we arrive at the following stylesheet:
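Put together, the stylesheet could look like the following sketch. It assumes the odd-number test is written as @number mod 2 = 1 and that the output settings are method="xml" and indent="yes":

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Serialize the result as indented XML -->
  <xsl:output method="xml" indent="yes"/>

  <!-- Identity transformation: copy every node that no template matches -->
  <xsl:mode on-no-match="shallow-copy"/>

  <!-- Filter: odd-numbered <line> elements are matched but produce no output -->
  <xsl:template match="line[@number mod 2 = 1]"/>

</xsl:stylesheet>
```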
(I also added the xsl:output element, which specifies that I’m generating XML and that I would like the result to be nicely indented.)
Here’s the result of transforming our sample document:
The odd-numbered lines are indeed gone, but what’s up with those extra newlines? Well, they’re not really extra at all; they’ve been there all along. The template removed the odd-numbered <line> elements, but not any surrounding whitespace.
Most of the time, that whitespace is not going to impact the processing of the result, but it does sort of ruin that nice indentation. To remove it, I’ll add the following top-level element to the stylesheet:
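Here is a sketch of the element in the context of the full stylesheet. The elements="lines" value assumes the root element is named lines, as in the text; the remaining declarations repeat the identity transformation, the empty template, and the output settings described above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- New: drop whitespace-only text nodes that are children of <lines> -->
  <xsl:strip-space elements="lines"/>

  <!-- Identity transformation: copy every node that no template matches -->
  <xsl:mode on-no-match="shallow-copy"/>

  <!-- Filter: odd-numbered <line> elements are matched but produce no output -->
  <xsl:template match="line[@number mod 2 = 1]"/>

</xsl:stylesheet>
```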
The xsl:strip-space element instructs the XSLT processor to strip whitespace-only text nodes from inside the <lines> element. Now, the stylesheet produces the following output:
No odd-numbered lines and no superfluous newlines. We’re done here.
And that’s it for the first post. If there is a particular XSLT topic you’d like me to cover, please let me know in the comments below. I also welcome any and all feedback.
Until next time!
Thanks for sharing such a detailed blog on XSLT.
Please let me know if my understanding is correct.
As per the example, you wanted to eliminate all the odd-numbered <line> elements, so you used a template matching @number mod 2 = 1, which kept them from being copied to the output because a match was found.
Does this mean that if a template match is found, the matched element will be skipped in the output file?
For example: if I want only the odd-numbered lines, do I need to write a template match that skips the even-numbered lines?
Am I right?
The important thing here is that the template is empty. That means the nodes in question are matched, but no output is generated. If, on the other hand, you wanted to generate some output for those <line> elements, you would do so inside that template.
Understood. Thanks for clearing my doubts.
I appreciate your effort.
Looking forward to your further blogs.
I didn't think XSL-T had anything beginner-friendly.
I'm impressed by your blog.
This is indeed a good example to get started with XSL-T.
Over the last few years I have had both hate and love for the language.
The crazy syntax and stubborn functionality can drive one mad, at times.
Yet its performance is unmatched.
The speed at which you can search and transform huge XML-files is really impressive.
So everyone dealing with XML: pay attention and stay tuned!
Thanks a lot, Manfred! I'm glad you found it useful.
Damn it!
I like the graphical mappings when we talk SAP PI/PO and had hoped that CPI could match that. However, we are not there yet for graphical mappings on CPI compared to SAP PI/PO. So I agree, we need to look to Groovy or XSLT mappings, except for the simplest of mappings, where I still think the graphical mapping rules.
Until this article I would have said “go for Groovy”. However, I must admit that now I'm not so sure.
Let’s see if you can turn me into a believer with the rest of the series.
I think there is a time and a place for all three mapping options. Sometimes graphical mapping is the right tool for the job.
Great blog Morten. Very pedagogical presentation of XSLT-features. You can very easily be a fan of XSLT and get addicted to your blog-posts. Keep them coming.
Thank you, Tonny 🙂
Very nice blog and a very good topic for the community. It is really worth exploring the power of XSLT.
I can see that you are using XSLT version 3.0, but some applications do not support 3.0, as they work with version 1.0 or 2.0.
Please let me know if I can help you with version 1.0 or 2.0, as I am also a big fan of XSLT.
Thanks, Ankit. I went with XSLT 3.0 because that's the version supported in CPI. But I'll keep in mind to highlight any differences between 3.0 and the previous versions.