Is Validation against W3C XML Schema necessary?
We exchange data to link electronic business processes. Therefore, we have to model the documents we want to accept in a given scenario. There are standards for modelling those documents in so-called schema languages, and there are standard tools for validating against those languages. The most widely used standard is W3C XML Schema. Unfortunately, without Java integration we can only validate against Document Type Definitions (DTDs) using the XML Library.
In data exchange scenarios with lots of external partners, validation may be unavoidable. On the other hand, let me mention a few applications in which validation is quite useful but Java integration is not necessary:
- A developer describes rules in an XML-based domain-specific language and generates ABAP code.
- A developer uses XML-based configuration files for ABAP applications.
- We deserialize ABAP objects.
If you already have a framework for Java validation, you should use it. If you do not, I will describe an alternative, more “lightweight” solution based on XSLT. To be more precise, we use Schematron, an XML structure validation language using patterns in trees.
Using Schematron as an XSLT based Schema Compiler
The idea is quite simple: you can use a transformation language like XSLT to perform validation. In fact, in an earlier weblog I invented a self-made validation language to generate STX programs. Using that approach you can develop very powerful validation that is suitable for mass data, but it, too, runs under Java.
In ABAP you can use Schematron as the schema language. There are many Schematron implementations in XSLT that generate validation programs, again in XSLT, which you can run using CALL TRANSFORMATION.
In this weblog we take skeleton1-5.xsl as the schema compiler. Before I describe a Schematron example, let me mention what problems may occur.
There is a Schematron 1.5 implementation based on XSLT 1.0, but unfortunately it uses a command that is not supported by the ABAP XSLT processor: <xsl:namespace-alias stylesheet-prefix="axsl" result-prefix="xsl"/>. This command is useful when an XSLT program generates XSLT. It would be possible to replace it by using xsl:element. However, it is not trivial to generate and activate XSLT programs under ABAP, so I suggest doing the compilation manually: choose an XSLT processor, SAXON for example, and run the schema compilation. Then create an XSLT program in ABAP simply by copy and paste.
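To illustrate why a schema compiler needs this command, here is a minimal sketch of a stylesheet that generates another stylesheet. The generated template is purely hypothetical and is not taken from skeleton1-5.xsl. The alias namespace URI is an arbitrary choice for this sketch. The second, named template shows the xsl:element variant mentioned above.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:axsl="http://www.w3.org/1999/XSL/TransformAlias">

  <!-- Literal axsl:* elements end up in the XSLT namespace in the result. -->
  <xsl:namespace-alias stylesheet-prefix="axsl" result-prefix="xsl"/>

  <!-- Variant 1: generate a stylesheet with literal result elements. -->
  <xsl:template match="/">
    <axsl:stylesheet version="1.0">
      <axsl:template match="/">
        <axsl:text>generated stylesheet ran</axsl:text>
      </axsl:template>
    </axsl:stylesheet>
  </xsl:template>

  <!-- Variant 2 (hypothetical, not called here): the same output built
       with xsl:element, which does not need xsl:namespace-alias. -->
  <xsl:template name="without-alias">
    <xsl:element name="xsl:stylesheet">
      <xsl:attribute name="version">1.0</xsl:attribute>
      <xsl:element name="xsl:template">
        <xsl:attribute name="match">/</xsl:attribute>
        <xsl:element name="xsl:text">generated stylesheet ran</xsl:element>
      </xsl:element>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
```

Without the alias, the processor would try to execute the inner template instead of writing it to the output. The xsl:element variant avoids that problem at the price of much more verbose code.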
But there is a second difficulty: do the XSLT programs generated by the schema compiler run under ABAP? Fortunately they do, because there are only a few commands besides xsl:namespace-alias that are not supported, and the Schematron compiler generates none of them.
But there is one thing we have to remember when we write a Schematron schema: we can't use the XPath lang() function in our schemas, because it is not supported either.
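If a schema does need a language check, one possible workaround is to read the xml:lang attribute directly with plain XPath 1.0. The following pattern is only a sketch: the rule on tb:COMMENT is a hypothetical example, and unlike lang() the comparison is case-sensitive. It assumes the tb prefix is declared with an ns element in the schema.

```xml
<pattern name="Language">
  <rule context="tb:COMMENT">
    <!-- Approximates lang('en'): take the nearest xml:lang in scope
         and accept 'en' or any 'en-...' sublanguage. -->
    <assert test="ancestor-or-self::*[@xml:lang][1]
                  [@xml:lang = 'en' or starts-with(@xml:lang, 'en-')]">
      Comment is not marked as English.</assert>
  </rule>
</pattern>
```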
A Schematron Example
Schematron 1.5 is a rule-based schema language. That means that we write rules and assertions for certain XML elements using XPath expressions. They are evaluated at run time and can create error messages. If no error occurs, the document is valid.
Let me present a slightly simplified example. The following XML document describes a test case. In fact, I created a test tool that interprets these documents to perform a test:
<TESTCASE xmlns="urn:mine" version="1.0">
  <NAME>Test for Change Documents</NAME>
  <COMMENT>Document for two transparent tables</COMMENT>
  <CALL funcname="Z_TST_001" runtime="yes" />
  <SAVE tablename="Z_HM01K"
        transformation="Z_NORMALIZE_CHGTIME" diff="all"/>
</TESTCASE>
The syntax is quite clear: a test case has a name, and we can add a comment to it. We can call function modules and save the content of transparent tables for comparing the results.
The XML document above is valid according to the following W3C XML Schema:
Here is the source of the Schema:
Now I present a Schematron document that is equivalent to the schema above:
<schema xmlns="http://www.ascc.net/xml/schematron" version="1.0">
  <ns prefix="tb" uri="urn:mine"/>
  <pattern name="Root Element">
    <rule context="/*">
      <assert test="name() = 'TESTCASE'">Wrong root element.</assert>
    </rule>
  </pattern>
  <pattern name="Elements">
    <rule context="*">
      <assert test="name() = 'NAME' or name() = 'CALL' or
                    name() = 'COMMENT' or name() = 'TESTCASE' or
                    name() = 'SAVE'">Wrong element <name/>
      </assert>
      <assert test="count(ancestor::*) &lt; 2">Hierarchy error in element
        <name/>.
      </assert>
      <assert test="count(tb:*) = count(*)">Wrong namespace.</assert>
    </rule>
  </pattern>
  <pattern name="Commands">
    <rule context="tb:CALL">
      <assert test="@*[name() = 'funcname']">
        Attribute funcname missing.</assert>
      <assert test="count(@*[name() = 'runtime' and string() != 'yes'])
                    = 0">Wrong value of attribute runtime.</assert>
      <assert test="count(@*[name() != 'funcname' and
                    name() != 'runtime']) = 0">Wrong attribute.</assert>
    </rule>
    <rule context="tb:SAVE">
      <assert test="@*[name() = 'tablename']">
        Attribute tablename missing.</assert>
      <assert test="count(@*[name() != 'tablename' and
                    name() != 'transformation' and name() != 'diff']) = 0">
        Wrong attribute.</assert>
      <assert test="count(@*[name() = 'diff' and (string() != 'all' and
                    string() != 'new')]) = 0">
        Wrong value of attribute diff.</assert>
    </rule>
  </pattern>
</schema>
Let’s get into detail. The schema consists of a set of patterns. Each pattern consists of rules that specify certain assertions in the context of a certain element. In the following excerpt the pattern named Elements defines three assertions. The first assertion says that the only elements allowed in a valid document are TESTCASE, NAME, CALL, COMMENT and SAVE. The second assertion says that every non-root element must be a child of the root element. The third assertion says that every (non-root) element must be in a certain namespace:
<pattern name="Elements">
  <rule context="*">
    <assert test="name() = 'NAME' or name() = 'CALL' or
                  name() = 'COMMENT' or name() = 'TESTCASE' or
                  name() = 'SAVE'">Wrong element <name/>
    </assert>
    <assert test="count(ancestor::*) &lt; 2">Hierarchy error in element
      <name/>.
    </assert>
    <assert test="count(tb:*) = count(*)">Wrong namespace.</assert>
  </rule>
</pattern>
Let us analyse the rule for the element tb:CALL. The first assertion defines an XPath expression that is true if and only if there is an attribute funcname. The second assertion means that there is no attribute runtime with a value other than 'yes'. The third assertion means that there is no attribute other than funcname and runtime:
<rule context="tb:CALL">
  <assert test="@*[name() = 'funcname']">
    Attribute funcname missing.</assert>
  <assert test="count(@*[name() = 'runtime' and string() != 'yes'])
                = 0">Wrong value of attribute runtime.</assert>
  <assert test="count(@*[name() != 'funcname' and
                name() != 'runtime']) = 0">Wrong attribute.</assert>
</rule>
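For illustration, here is a hypothetical test case whose CALL element violates all three assertions of this rule. The attribute timeout is made up for this sketch.

```xml
<TESTCASE xmlns="urn:mine" version="1.0">
  <NAME>Broken test case</NAME>
  <!-- funcname is missing, runtime is not 'yes', timeout is not allowed -->
  <CALL runtime="no" timeout="5"/>
</TESTCASE>
```

A validator compiled from the schema should therefore report "Attribute funcname missing.", "Wrong value of attribute runtime." and "Wrong attribute." for this document. How the messages are presented depends on the Schematron implementation.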
The rest is quite simple: Apply skeleton1-5.xsl to the schema, and create an XSLT program with the output.
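To give an impression of what such a validation program does, here is a hand-written XSLT 1.0 sketch that performs the checks of the tb:CALL rule. It is not the actual output of skeleton1-5.xsl, which differs in structure and detail. It only assumes that each rule context becomes a match pattern and each assertion becomes a negated test.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tb="urn:mine">

  <xsl:output method="text"/>

  <!-- One template per rule: the match pattern is the rule context. -->
  <xsl:template match="tb:CALL">
    <!-- An assertion reports an error when its test is false,
         hence the not(...) around every test. -->
    <xsl:if test="not(@*[name() = 'funcname'])">
      <xsl:text>Attribute funcname missing.&#10;</xsl:text>
    </xsl:if>
    <xsl:if test="not(count(@*[name() = 'runtime'
                  and string() != 'yes']) = 0)">
      <xsl:text>Wrong value of attribute runtime.&#10;</xsl:text>
    </xsl:if>
    <xsl:if test="not(count(@*[name() != 'funcname'
                  and name() != 'runtime']) = 0)">
      <xsl:text>Wrong attribute.&#10;</xsl:text>
    </xsl:if>
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Suppress the built-in rule that copies text nodes to the output. -->
  <xsl:template match="text()"/>

</xsl:stylesheet>
```

For a document that satisfies the rule, this sketch produces empty output. Every line of output corresponds to one failed assertion.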
In this weblog I presented Schematron, a schema language invented by Rick Jelliffe. You can use an XSLT-based Schematron implementation to create XSLT programs that you can run under ABAP using the CALL TRANSFORMATION command.
Because Schematron uses XPath for specifying assertions, it is far more expressive than W3C XML Schema. In fact, Schematron can be used to overcome problems of W3C XML Schema: in "Expressiveness and Complexity of XML Schema" the authors discuss the paradigm of contextual patterns to overcome typing problems and duplication of definitions. There are even possibilities to combine W3C XML Schema and Schematron, but this is beyond the scope of this blog, which is only an introduction to the schema language. Schematron 1.5 has 19 commands and is very powerful. Unfortunately, most implementations are based on XSLT, so they have problems with huge XML documents.
If you have to deal with mass data there are two possibilities:
- ISO Schematron allows different query languages (see Query Language Binding in the spec). So we could use STX to perform queries.
- We can invent our own schema language similar to Schematron but based on STX. In fact, I have already written a weblog about it.
In fact I like eclectic approaches and I think sometimes we have to combine the strengths of different XML based techniques.