Enter Semantics: Interface Descriptions of Inside-Out Scenarios
Today’s enterprise architecture is necessarily heterogeneous. Often it has core systems that support the main processes: enterprise resource planning, customer relationship management and so on. But every enterprise architecture also has highly specialized systems, and we have to integrate them using documents containing structured data:
- We do data replication using IDocs and BDocs.
- We call Enterprise Services to enable client systems to start business processes in the core systems.
These are only a few examples but there are many more cases in which we interchange business documents. In most cases we use XML as the syntax for business documents but we are not restricted to XML. XML has important advantages:
- We can use schema languages to describe the business document and
- there are transformation languages that we can use to translate the documents into another syntax.
Languages like XML Schema make it easy to describe business documents in a formal way: we can use a set of predefined data types and also build our own data types from them. Think of strings having certain properties like length or a special content, or code lists defined by enumeration types. We can define complex data types by defining sets and hierarchies of these simple data types to build up business documents.
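To make this concrete, here is a minimal sketch using Python's standard library. The schema, the type names (`CountryCode`, `PartnerRole`, `Partner`) and the helper `describe_simple_types` are invented for illustration and do not come from any SAP interface. It shows a string type restricted by length, a code list defined as an enumeration and a complex type built from both, and then lists the facets of each simple type.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema: two simple types and one complex type built from them.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="CountryCode">
    <xs:restriction base="xs:string">
      <xs:length value="2"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:simpleType name="PartnerRole">
    <xs:restriction base="xs:string">
      <xs:enumeration value="SOLD_TO"/>
      <xs:enumeration value="SHIP_TO"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:complexType name="Partner">
    <xs:sequence>
      <xs:element name="Role" type="PartnerRole"/>
      <xs:element name="Country" type="CountryCode" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
"""


def describe_simple_types(schema_text):
    """Return {type name: [(facet, value), ...]} for every global simple type."""
    root = ET.fromstring(schema_text)
    result = {}
    for simple_type in root.findall(XS + "simpleType"):
        restriction = simple_type.find(XS + "restriction")
        facets = []
        if restriction is not None:
            for facet in restriction:
                # Strip the namespace part "{...}" from the tag name.
                local_name = facet.tag.split("}", 1)[1]
                facets.append((local_name, facet.get("value")))
        result[simple_type.get("name")] = facets
    return result


if __name__ == "__main__":
    for name, facets in describe_simple_types(SCHEMA).items():
        print(name, facets)
```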
h3. What are the Challenges of System Integration?
First we have to enable data exchange between systems: we have to set up the infrastructure (file transfer, enterprise service bus…) and then define company standards to make the systems of an enterprise architecture ready for data exchange. But then the trouble begins: we have to define transformations, which requires knowledge about the data structures to be exchanged. The main problem is that most interfaces are difficult to understand:
- For many XML-based interfaces like XSF and XFP data containers no metadata is provided. Of course standards like XSF and XFP are described by schemas, but a dedicated Adobe Form doesn’t provide metadata about its XFP context.
- The situation for IDoc / BDoc structures is much better because we can generate XML Schema documents from them.
- Web Services created “Inside-Out” from function modules have a WSDL description but there are many disadvantages: the data types defined in the WSDL have no descriptions, there is no possibility to declare elements as optional and the basic data types are canonical ABAP data types without detailed description (think of enumerations for example).
For Enterprise Services created Outside-In the situation is much better because we have a modeling tool which we use to create formal definitions of the service in WSDL and XML Schema. There are modeling guidelines that ensure precise semantics of Web Services.
Formal descriptions are necessary to understand complex interfaces. As a consequence every software tool that creates interfaces “Inside-Out” (IDoc, BDoc, XSF Smart Form, XFP Adobe Form and so on) should be able to describe the structures it accepts or produces in a formal way.
Unfortunately a formal description alone doesn’t solve integration problems, because in integration projects we have to discover the meaning of the formally described structure to implement interfaces, create transformations and so on. So knowledge management and documentation are an important factor of success in integration.
Documentation can be done in many ways: Word documents, spreadsheets, wikis and so on. But if we have an XML Schema document, an interesting option is to use it as the central source of information and, if necessary, generate other documentation from it using XSLT programs. So let’s have a look at XML Schema to find out whether it serves our needs.
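As an illustration of the schema as a central source of documentation, here is a minimal sketch. It uses Python's standard library instead of XSLT, purely to keep the example short and self-contained. The schema and the function `document_elements` are hypothetical. The idea is to collect one documentation row per element declaration, which could then be rendered as a wiki table or a spreadsheet.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical annotated schema of a small business document.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Order">
    <xs:annotation>
      <xs:documentation>Sales order sent to the output system.</xs:documentation>
    </xs:annotation>
    <xs:complexType>
      <xs:sequence>
        <xs:element name="OrderId" type="xs:string">
          <xs:annotation>
            <xs:documentation>Document number assigned by the sending system.</xs:documentation>
          </xs:annotation>
        </xs:element>
        <xs:element name="Note" type="xs:string" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def document_elements(schema_text):
    """Return (name, type, documentation) for every named element declaration."""
    root = ET.fromstring(schema_text)
    rows = []
    for element in root.iter(XS + "element"):
        name = element.get("name")
        if name is None:
            continue  # element references (ref="...") carry no name of their own
        doc = element.find(f"{XS}annotation/{XS}documentation")
        if doc is not None and doc.text:
            text = " ".join(doc.text.split())  # normalize whitespace
        else:
            text = "(undocumented)"
        rows.append((name, element.get("type", "(anonymous)"), text))
    return rows


if __name__ == "__main__":
    for name, type_name, text in document_elements(SCHEMA):
        print(f"{name} | {type_name} | {text}")
```

The undocumented rows are useful in their own right: they show where the schema still needs annotations.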
h3. What is XML Schema?
XML Schema is a W3C standard to define classes of XML documents such as the structure of admissible payloads of SOAP web services. XML Schema is a very complicated standard which allows us to define
- “simple” data types in many ways: string patterns, enumerations…
- the structure of business documents i.e. hierarchy, order of elements and repetitions (is a certain element optional? does it have to occur a certain number of times?)
- certain types of constraints that help you to ensure that the content of certain elements fulfills the property of a unique index so that the values can be stored in a database table.
As a slightly simplified rule of thumb we can say that XML Schema describes structures of an XML document that can be checked by parsing with a lookahead of one element. This allows a validator to check whether an obligatory element occurs and to report an error if it does not.
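The rule of thumb can be sketched in a few lines. This is a strongly simplified illustration, not how a real XML Schema validator is implemented. It assumes the content model is a flat sequence of `(name, minOccurs, maxOccurs)` entries in which the next child element always decides unambiguously which entry applies. The function name and the sample model are hypothetical.

```python
def validate_sequence(children, model):
    """Check a list of child element names against a flat sequence model.

    model is a list of (name, min_occurs, max_occurs); max_occurs=None means
    "unbounded". Returns None if the children are valid, else an error text.
    """
    pos = 0
    for name, min_occurs, max_occurs in model:
        count = 0
        # Look only at the next child: consume it while it matches this entry.
        while (
            pos < len(children)
            and children[pos] == name
            and (max_occurs is None or count < max_occurs)
        ):
            pos += 1
            count += 1
        if count < min_occurs:
            found = f"<{children[pos]}>" if pos < len(children) else "end of content"
            return f"expected <{name}> but found {found}"
    if pos < len(children):
        return f"unexpected <{children[pos]}>"
    return None


# Hypothetical content model: OrderId is obligatory, Note is optional,
# Item has to occur at least once.
ORDER_MODEL = [("OrderId", 1, 1), ("Note", 0, 1), ("Item", 1, None)]

if __name__ == "__main__":
    print(validate_sequence(["OrderId", "Item", "Item"], ORDER_MODEL))
    print(validate_sequence(["Note", "Item"], ORDER_MODEL))
```

Because the check never looks further than the next element, it can say which obligatory element is missing, but it cannot express conditions that depend on values elsewhere in the document.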
XML Schema also supports annotations so we can add information to make a schema more readable and understandable. We can even separate annotations using XML namespaces.
If we can solve the problem of merging an updated (but not annotated) XML Schema document, created after a new release of an interface, with the annotations of the previous document version, we have a robust way of enriching formal specifications with comments.
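One possible approach to this merge can be sketched as follows. It rests on assumptions: declarations are matched by the path of their `name` attributes, which only works if these paths are unique and stable between releases. The function `merge_annotations` and both sample schemas are hypothetical. Annotations whose declaration no longer exists are reported rather than silently dropped.

```python
import copy
import xml.etree.ElementTree as ET

XS_URI = "http://www.w3.org/2001/XMLSchema"
XS = "{" + XS_URI + "}"
# Keep the "xs" prefix on output so that values like type="xs:string" stay valid.
ET.register_namespace("xs", XS_URI)


def _walk(node, path=()):
    """Yield (path of declaration names, node) for every named component."""
    name = node.get("name")
    if name is not None:
        path = path + (name,)
        yield path, node
    for child in node:
        if child.tag != XS + "annotation":
            yield from _walk(child, path)


def merge_annotations(old_schema_text, new_schema_text):
    """Copy annotations of the old schema into the new one, matched by path.

    Returns (merged root element, sorted list of paths that were annotated in
    the old schema but no longer exist in the new one).
    """
    old_root = ET.fromstring(old_schema_text)
    new_root = ET.fromstring(new_schema_text)
    old_annotations = {}
    for path, node in _walk(old_root):
        annotation = node.find(XS + "annotation")
        if annotation is not None:
            old_annotations[path] = annotation
    new_paths = set()
    for path, node in _walk(new_root):
        new_paths.add(path)
        if node.find(XS + "annotation") is None and path in old_annotations:
            # In XML Schema the annotation has to be the first child.
            node.insert(0, copy.deepcopy(old_annotations[path]))
    orphaned = sorted(p for p in old_annotations if p not in new_paths)
    return new_root, orphaned


OLD = f"""\
<xs:schema xmlns:xs="{XS_URI}">
  <xs:element name="Order"><xs:complexType><xs:sequence>
    <xs:element name="OrderId" type="xs:string">
      <xs:annotation><xs:documentation>Number assigned by the sender.</xs:documentation></xs:annotation>
    </xs:element>
    <xs:element name="LegacyFlag" type="xs:string">
      <xs:annotation><xs:documentation>Dropped in the new release.</xs:documentation></xs:annotation>
    </xs:element>
  </xs:sequence></xs:complexType></xs:element>
</xs:schema>
"""

NEW = f"""\
<xs:schema xmlns:xs="{XS_URI}">
  <xs:element name="Order"><xs:complexType><xs:sequence>
    <xs:element name="OrderId" type="xs:string"/>
    <xs:element name="Currency" type="xs:string"/>
  </xs:sequence></xs:complexType></xs:element>
</xs:schema>
"""

if __name__ == "__main__":
    merged, orphaned = merge_annotations(OLD, NEW)
    print(ET.tostring(merged, encoding="unicode"))
    print("annotations without a target:", orphaned)
```

Matching by name path is a deliberate simplification. Renamed or moved elements show up in the list of orphaned annotations and have to be handled manually.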
h3. Is XML Schema sufficient to describe Business Data?
Of course it is: it is a widely adopted standard. But you should know the limitations of the standard: as I mentioned before, there are constraints that cannot be defined using XML Schema:
- conditions that define that a specific optional element is obligatory
- conditions that define that a certain element contains values that are a strict subset of all possible domain values.
These conditions occur very often because the interfaces are huge and very generic. As a consequence the XML Schema definition says that most elements are optional or can contain initial values. There are so many additional constraints that the XML Schema definition is nearly irrelevant. Good examples are custom-built or enhanced IDocs or BDocs. In many cases there are some attributes whose values affect huge parts of the document: the value of one element determines which optional elements are in fact obligatory and have to contain values, which values are allowed and so on.
In my opinion the above-mentioned interfaces reveal serious design errors and we should try to avoid them. If we can’t avoid them we should set up error processes so that we can deal with incorrect data. But in my experience most business processes require data that is valid according to certain business rules, and most of these rules can’t be expressed using XML Schema.
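A minimal sketch of such a check outside of XML Schema could look like this. Everything in it is hypothetical: the element names, the rule table `RULES` and the function `check_business_rules` only illustrate how the value of one element can make optional elements obligatory and restrict the allowed values of another.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules: for each value of <DocumentType>, the elements that
# become obligatory and the allowed subset of values of other elements.
RULES = {
    "INVOICE": {
        "required": ["DueDate", "Currency"],
        "allowed": {"Currency": {"EUR", "USD"}},
    },
    "QUOTE": {"required": [], "allowed": {}},
}


def check_business_rules(document_text):
    """Return a list of rule violations (empty if the document is valid)."""
    root = ET.fromstring(document_text)
    doc_type = root.findtext("DocumentType")
    rule = RULES.get(doc_type)
    if rule is None:
        return [f"unknown DocumentType: {doc_type!r}"]
    errors = []
    for name in rule["required"]:
        value = root.findtext(name)
        # Missing elements and elements with initial (empty) values both fail.
        if value is None or not value.strip():
            errors.append(f"{name} is obligatory for DocumentType {doc_type}")
    for name, allowed in rule["allowed"].items():
        value = root.findtext(name)
        if value is not None and value.strip() and value.strip() not in allowed:
            errors.append(
                f"{name}={value.strip()} is not allowed for DocumentType {doc_type}"
            )
    return errors


if __name__ == "__main__":
    sample = (
        "<Document><DocumentType>INVOICE</DocumentType>"
        "<Currency>JPY</Currency></Document>"
    )
    for error in check_business_rules(sample):
        print(error)
```

Keeping the rules in a data structure rather than in code has the advantage that the same table can serve as formal documentation of the “hidden” constraints.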
h3. How to deal with Business Rules in Integration Scenarios
Some people propose much stronger schema languages like Schematron, which can be useful in B2B processes with many participants. In fact there are ways to include Schematron in XML Schema, but adding a complex standard to another very complex standard is risky.
In A2A processes it makes more sense not to use even more complex schema languages and instead to describe constraints that can’t be expressed with XML Schema in a narrative as well as a formal way. Formal descriptions have the following advantages:
- we can do validation of XML documents i.e. checks against a formal specification
- we can retrieve information much more efficiently compared to reading large documents
In my experience most integration projects with Inside-Out approaches have the problem that there is no formal description of the interfaces. Most SAP standard interfaces (for example HR, FI or BP IDocs, EDI processes in SD, Inside-Out Web Services created from BAPIs) are well understood, and even if you enhance them you will master A2A processes. But this is not true for non-standard interfaces like custom-built interfaces or printing processes with external output management systems. In many of those cases there is neither a formal specification of the interface (think of XML Schema) nor precise documentation.
In my opinion SAP should enable all “Inside-Out” interfaces to expose metadata, perhaps as XML Schema. This helps a lot in integration scenarios but is not a “silver bullet”:
- Often these metadata will have poor quality and need postprocessing.
- People who implement an interface need the possibility to add even more annotations and formal descriptions to understand the semantics of the schema and perhaps even some “hidden” constraints and business rules.
In the next entry of this blog series I will discuss how metadata (besides the above-mentioned annotations) can help:
- The W3C standard SAWSDL that allows us to link the content of a schema to an ontology,
- expression of business rules using ontologies
- ontological approaches for validation
h3. Appendix: How to expose the Structure of Inside-Out Interfaces as XML Schema
For IDocs this is well supported using transaction WE60. For Adobe XFP interfaces and Smart Forms XSF interfaces there is no possibility to create schema documents. Adobe Forms can be schema based but only for “internal” interfaces to WD4A (look at class CL_FP_XSD_SCHEMA and report QISR_DDIC2XML_INTERFACE). For XFD the situation is better because it is a transformation into canonical XML. If you have an example output you can generate an XML Schema from it with many XML Schema editors. In case of XSF this is usually impossible because of its attribute-oriented syntax like