Thinking in XSLT: Filtering XML elements
I rarely miss an opportunity to mention how much I like XSLT. Of the three mapping options provided by SAP Cloud Platform Integration, it is by far my favourite one. There is one downside to the language, though, and that is its fairly steep learning curve.
XSLT is declarative by nature (like SQL for relational databases and Cypher for graphs); its approach is what, not how. At the core of that is writing templates to match nodes in the source document, and letting the XSLT processor take it from there.
This approach can take a bit of getting used to, though, especially if you have an imperative programming background. In this blog post series, I'll give you some pointers on how to start thinking in XSLT. Fair warning: Posts will most likely be published at very irregular intervals.
In this first installment, I'll be taking a closer look at an XSLT FAQ: How do I filter out elements from my XML document?
The answer to this one is actually quite straightforward. There are two steps to it:
- Start with the identity transformation
- Then create one or more empty templates matching the elements to be filtered out
The identity transformation is a stylesheet that generates XML output that is identical to its XML input. This is what it looks like in XSLT 3.0 (which is the XSLT version supported by CPI):
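A minimal sketch of the identity transformation, assuming the standard XSLT 3.0 xsl:mode declaration and no other content in the stylesheet:

```xml
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- No templates are declared: any node without a matching template
         falls back to the shallow-copy behaviour and is copied as-is -->
    <xsl:mode on-no-match="shallow-copy"/>
</xsl:stylesheet>
```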
That doesn't sound very exciting, and indeed it wouldn't be, if not for that on-no-match attribute. The meaning of on-no-match="shallow-copy" is that every input node will be copied to the output unchanged, unless it is matched by a template in the stylesheet.
Combine this with the fact that an empty template generates no output, and we have the tools we need to filter out elements from an XML document.
Let's try this in practice. Here's a sample document:
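A hypothetical sample of the kind discussed here: the <lines> and <line> element names and the number attribute come from the text, while the four entries and their contents are assumptions made for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical sample data: the line contents are made up -->
<lines>
    <line number="1">First line</line>
    <line number="2">Second line</line>
    <line number="3">Third line</line>
    <line number="4">Fourth line</line>
</lines>
```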
Let's say I want to remove all the odd-numbered lines. To do so, I'll write an empty template matching the <line> elements where the number attribute is odd. It looks like this:
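A sketch of such an empty template, assuming the number attribute always holds an integer and that the xsl prefix is bound to the XSLT namespace in the enclosing stylesheet:

```xml
<!-- Empty template: matches <line> elements with an odd number attribute
     and generates no output for them -->
<xsl:template match="line[@number mod 2 = 1]"/>
```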
Combining this template with the identity transformation, we arrive at the following stylesheet:
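A sketch of the combined stylesheet, assuming the odd-number test is written as a mod 2 predicate on the number attribute:

```xml
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- Serialize the result as indented XML -->
    <xsl:output method="xml" indent="yes"/>

    <!-- Identity transformation: copy every node that no template matches -->
    <xsl:mode on-no-match="shallow-copy"/>

    <!-- Filter: odd-numbered <line> elements are matched and produce nothing -->
    <xsl:template match="line[@number mod 2 = 1]"/>
</xsl:stylesheet>
```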
(I also added the xsl:output element, which specifies that I'm generating XML and that I would like the result to be nicely indented.)
Here's the result of transforming our sample document:
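As a rough illustration, assuming a hypothetical input of four <line> elements numbered 1 to 4 inside <lines>, each on its own indented line, the result would look something like this. The exact whitespace depends on how the input was indented and on the XSLT processor's serializer.

```xml
<!-- Approximate result for the assumed input; exact whitespace may differ -->
<lines>
    
    <line number="2">Second line</line>
    
    <line number="4">Fourth line</line>
</lines>
```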
The odd-numbered lines are indeed gone, but what's up with those extra newlines? Well, they're not really extra at all; they've been there all along. The template removed the odd-numbered <line> elements, but not any surrounding whitespace.
Most of the time, that whitespace is not going to impact the processing of the result, but it does sort of ruin that nice indentation. To remove it, I'll add the following top-level element to the stylesheet:
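A sketch of that declaration, assuming <lines> is the only element whose whitespace needs stripping:

```xml
<!-- Top-level declaration (a direct child of xsl:stylesheet): drop
     whitespace-only text nodes that are children of <lines> -->
<xsl:strip-space elements="lines"/>
```

Only text nodes consisting entirely of whitespace are affected, so the text content of the individual <line> elements is left alone.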
The xsl:strip-space element instructs the XSLT processor to strip whitespace-only text nodes from inside the <lines> element. Now, the stylesheet produces the following output:
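As a rough illustration, again assuming a hypothetical input of four <line> elements numbered 1 to 4 inside <lines>, the output would look something like this. The XML declaration is left out, and the indentation width is processor-dependent.

```xml
<!-- Approximate result for the assumed input; indentation width may differ -->
<lines>
   <line number="2">Second line</line>
   <line number="4">Fourth line</line>
</lines>
```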
No odd-numbered lines and no superfluous newlines. We're done here.
And that's it for the first post. If there is a particular XSLT topic you'd like me to cover, please let me know in the comments below. I also welcome any and all feedback.
Until next time!
Hello Morten,
Thanks for sharing such a detailed blog on XSLT.
Please let me know if my understanding is correct.
As per the example, you wanted to eliminate all the odd <line> elements, so we used the template match @number mod 2 = 1, which skipped copying them to the output because the match was found.
Does this mean that if a template match is found, the node will be skipped in the output file?
For example: if I want only the lines with odd numbers, do I need to write a template match to skip the even-numbered lines?
Am I right?
Regards,
Khusal
Hi Khusal
The important thing here is that the template is empty. That means the nodes in question are matched, but no output is generated. If, on the other hand, you wanted to generate some output for those <line> elements, you would do so inside that template.
Regards,
Morten
Hello Morten,
Understood. Thanks for clearing my doubts.
I appreciate your effort.
Looking forward to your further blogs.
Hi Morten,
didn't think XSL-T has anything beginner friendly.
I'm impressed by your blog.
This is indeed a good example to get started with XSL-T.
The last years I had both hate and love for the language.
The crazy syntax and stubborn functionality can drive one mad, at times.
Yet its performance is unmatched.
The speed at which you can search and transform huge XML-files is really impressive.
So everyone dealing with XML: pay attention and stay tuned!
Thanks a lot, Manfred! I'm glad you found it useful.
Regards,
Morten
Damn it!
I like the graphical mappings when we talk SAP PI/PO and had hoped that CPI could level that. However, we are not there yet for graphical mappings on CPI compared to SAP PI/PO. So I agree, we need to look to Groovy or XSLT mappings, except for the simplest of mappings, where I still think the graphical mapping rules.
Until this article I would have said "go for Groovy". However, I must admit that now I'm not so sure.
Let's see if you can turn me into a believer with the rest of the series.
/Jesper
I think there is a time and a place for all three mapping options. Sometimes graphical mapping is the right tool for the job.
Regards,
Morten
Great blog Morten. Very pedagogical presentation of XSLT-features. You can very easily be a fan of XSLT and get addicted to your blog-posts. Keep them coming.
Tonny Franke
Thank you, Tonny!
Regards,
Morten
Hi Morten,
Very nice blog and a very good topic for the community. It really is worth exploring the power of XSLT.
I can see that you are using XSLT version 3.0, but some applications do not support it, as they work with version 1.0 or 2.0.
Please let me know if I can help you with version 1.0 or 2.0, as I am also a big fan of XSLT.
Regards,
Ankit Gupta
Thanks, Ankit. I went with XSLT 3.0 because that's the version supported in CPI. But I'll keep in mind to highlight any differences between 3.0 and the previous versions.
Regards,
Morten