In Semantic Web Technologies Part 1 – SPARQL of this weblog series I discussed two Semantic Web technologies: RDF and SPARQL. Usually these technologies are used
- in the area of internet technologies and information retrieval,
- in knowledge management,
- in semantic data modeling (see here or there for example) and
- perhaps in SOA subdomains like semantic web service discovery.
In my opinion there are even more use cases within the SAP NetWeaver platform and SAP Business Suite. On the one hand, these technologies can help us in technical scenarios to optimize data integration by adding semantic units that can be interpreted by machines. In this blog entry I take data exchange and monitoring as examples.
On the other hand, and this aspect is even more important in my opinion, Semantic Web technology enables us to introduce the aspect of knowledge management into business processes. As examples I discuss Records Management and contact information in CRM scenarios.
In most data exchange and integration projects I have done so far, the so-called “header information” of an XML document does not meet the customers’ needs. Let me give an example: There are output management scenarios using external output management systems that work on formats like XSF or XDF, created for example by SAP Smart Forms. In those scenarios you need additional metadata, think of a GUID that is used to identify a document. Without a GUID you won’t be able to identify a document for monitoring. How do we solve those problems today? In the SAP Library you can find an example of an XDF document. You don’t have to understand it in detail, but let me mention that the first 22 lines of the header contain archive, mail and some output options.
This is not much, and usually we need more. In a customer scenario I had to add the following data to the output data:
- I want to add a GUID to identify the output stream for monitoring reasons.
- I need information about the specific system that produced the output.
- More information for the archiving process: to which business objects should it be linked in an ArchiveLink or Records Management scenario?
- I want to transmit data that can be interpreted by the specific output management system.
Changing the XDF header would mean changing the XDF standard, with the following consequences:
- The standard header gets bigger and bigger containing mostly empty elements.
- There is still no mechanism to include proprietary vocabulary (think of a certain output management system).
- It is still not extensible for customers.
The easiest way to solve this problem would be to define a new data type in the user specific data that contains this information. But this has the following drawbacks:
- We have to put the metadata in a container that is meant for the data to be printed.
- We are not able to include vendor specific data for the output management system in use within the data stream.
It is easy to design extensible XML based standards that allow arbitrary RDF data to be embedded (just include an xs:any element with namespace="http://www.w3.org/1999/02/22-rdf-syntax-ns#" in the defining schema). Then you can add RDF/XML metadata to the XML documents to store process specific data for your customer scenarios. I think it would be very helpful if an API supported enriching output with customer specific metadata. Using such an API we could enrich XML data with
- customer specific data that are needed in a special customer process as well as
- vendor specific data that can be interpreted by the customers’ output management or document management system.
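To make the idea a bit more concrete, here is a minimal sketch in Python that appends an rdf:RDF block to an XML document using only the standard library. The `ex` vocabulary, the property names, the system ID and the `<output>` document are all invented for illustration; this is not real XDF and not an existing SAP API.

```python
import uuid
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Hypothetical customer vocabulary; not an SAP or W3C namespace.
EX_NS = "http://example.com/output-meta#"

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("ex", EX_NS)


def add_rdf_metadata(document_root, subject_uri, properties):
    """Append an rdf:RDF block describing subject_uri to document_root."""
    rdf = ET.SubElement(document_root, f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        rdf, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": subject_uri}
    )
    # Each property becomes a child element with a literal value.
    for name, value in properties.items():
        prop = ET.SubElement(description, f"{{{EX_NS}}}{name}")
        prop.text = value
    return rdf


# Stand-in for an output document; element names are invented, not real XDF.
document = ET.fromstring("<output><header/><data>...</data></output>")
guid = str(uuid.uuid4())
add_rdf_metadata(
    document,
    f"urn:uuid:{guid}",
    {"sourceSystem": "PRD-100", "businessObject": "BUS2032:0000004711"},
)
xml_text = ET.tostring(document, encoding="unicode")
print(xml_text)
```

The receiving output management system can ignore the rdf:RDF element if it does not understand it, or read the GUID and the source system from it for monitoring.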
This is nothing new: Adobe offers tools to add RDF data to all their file formats using XMP: http://www.adobe.com/products/xmp/.
System landscapes require a monitoring infrastructure to help administrators do their job efficiently. Servers and server components must be able to return a “health status” so that an administrator can be informed if a system gets into trouble. This information could be expressed using RSS feeds or distributed via mail or enterprise messaging systems like ESME: http://blog.esme.us/ .
I think it would be useful if there were a common vocabulary a system can use to give information about its service status. Of course every system can have its specific and proprietary vocabulary to encode additional status information. In my opinion RDF would be an appropriate standard to unify those vocabularies.
After collecting all this RDF status information from a system landscape we could use query languages like SPARQL to evaluate the data. So SPARQL could be a candidate for a “rule engine” that aggregates status information, evaluates it and gives administrators advice, perhaps after doing reasoning about the determined facts.
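As a rough illustration of such an evaluation, the following sketch keeps status facts as plain triples and applies one simple rule to them. It is a plain-Python stand-in for a SPARQL query, not a SPARQL engine, and the status vocabulary, the system names and the rule itself are assumptions made up for this example.

```python
# Hypothetical status vocabulary and systems, for illustration only.
STATUS = "http://example.com/status#"

triples = [
    ("urn:system:ERP", STATUS + "health", "green"),
    ("urn:system:CRM", STATUS + "health", "red"),
    ("urn:system:CRM", STATUS + "message", "Update queue is blocked"),
    ("urn:system:PI", STATUS + "health", "yellow"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


def systems_needing_attention(triples):
    """Rule: report every system whose health is not green, with its messages.

    Roughly what this SPARQL query would select (plus optional messages):
      PREFIX st: <http://example.com/status#>
      SELECT ?system ?health
      WHERE { ?system st:health ?health . FILTER(?health != "green") }
    """
    report = {}
    for system, _, health in match(triples, predicate=STATUS + "health"):
        if health != "green":
            messages = [o for _, _, o in match(triples, system, STATUS + "message")]
            report[system] = {"health": health, "messages": messages}
    return report


report = systems_needing_attention(triples)
for system, details in report.items():
    print(system, details)
```

With a real triple store the `match` helper would be replaced by the SPARQL endpoint, and the rule would live in the query instead of in application code.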
With SAP Records Management SAP offers a solution for records and case management. Customers can create data models and define categories to classify documents within models. Based on records, users can start workflows so that complex business and document processes are supported.
Technically, records are XML documents that contain pointers to documents in XML elements: The recordElement element contains a part of a record and the element recordElementPointer contains links to archived objects, business objects, URLs and even more. So the record is only a wrapper; the customer adapts it to their needs in an implementation project and defines a records model using the Records Modeler or service providers for their own document types. Records Management has an API so we can add attributes to documents (e.g. using the KPro interface).
Adding (even customer specific) facts to records or links (to documents within the records) in RDF/XML is quite a natural approach. In my opinion this is more powerful compared to the techniques above because
- we can use standardized vocabularies with defined semantics,
- the semantics are much more expressive (think of RDF reification),
- we can do queries and use reasoners.
But there are even more possibilities: Most document formats that are used within records support embedded RDF data (think of RDFa in HTML/XML, the above mentioned XMP in PDF and so on). The Records Management API could scan the linked documents to extract the RDF data within, so that (parts of) the documents become readable for machines. For example we could evaluate the embedded facts of outgoing documents in the above mentioned output management format.
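A minimal sketch of such a scan for XML based documents could look like this. It only handles the simplest RDF/XML shape (rdf:Description with rdf:about and literal-valued properties), and the `<letter>` document and its subject URI are invented for this example; a real implementation would use a full RDF/XML, RDFa or XMP parser.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def extract_simple_triples(xml_text):
    """Collect (subject, predicate, literal) triples from embedded rdf:RDF blocks.

    Only rdf:Description elements with namespaced, literal-valued child
    properties are handled; everything else in RDF/XML is ignored here.
    """
    root = ET.fromstring(xml_text)
    triples = []
    for description in root.iter(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree tags look like "{namespace}local"; join them into a URI.
            namespace, _, local = prop.tag[1:].partition("}")
            triples.append((subject, namespace + local, (prop.text or "").strip()))
    return triples


# Invented sample document using the Dublin Core element namespace.
sample = """
<letter xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
        xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:RDF>
    <rdf:Description rdf:about="urn:example:letter-4711">
      <dc:title>Contract confirmation</dc:title>
      <dc:date>2009-03-01</dc:date>
    </rdf:Description>
  </rdf:RDF>
  <body>...</body>
</letter>
"""

facts = extract_simple_triples(sample)
for fact in facts:
    print(fact)
```

The extracted facts could then be stored as attributes of the record element that points to the document.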
Customer and Contact Information
In a CRM scenario we want to save information about a certain customer; the vocabulary could be defined by SAP or defined by the customer to serve the needs of a special requirement. This information could be saved by a call center agent as additional data of a contact and displayed when the same customer is contacted again in an inbound or outbound scenario. This information could be evaluated in business intelligence scenarios, too.
An interesting aspect is the possible synergy with the Semantic Records Management described above. Today a lot of customers automatically create CRM contacts from outgoing correspondence. As a consequence you have all the information within the CIC0 but with no semantics; it is a pile of garbage in my opinion. It would be even more interesting to access the information stored in a Semantic Records Management and visualize it so that a call center agent quickly has all the information he needs as well as the possibility to drill down to a specific resource.
Technically speaking, the scenarios mentioned above have the following in common:
- Adding metadata to documents makes it easy to use them in automatic business processes.
- RDF vocabularies are generic: customers can define their own vocabularies according to their special needs.
- In the case of outgoing data we can support a vendor specific vocabulary.
- Metadata can be used to locate documents and query their content.
- The results of those queries can be used by rule engines that control processes or start follow-up processes.
- If Business Objects are internally represented as XML documents we have a natural and generic extension mechanism for customer needs.
So Semantic Web technologies can help us to do better information integration and can introduce the aspect of knowledge management to traditional ERP and CRM processes. Another application area is business networks: eBusiness, electronic marketplaces and so on. But this is a topic for another blog entry.
I think this would be a logical step in the development of the SAP Netweaver Platform and ERP:
- At first XML technology was introduced to support B2B & EAI and to enable working on semistructured data like records.
- Then this technology was used to do Service Orientation and to build and integrate Web 2.0 applications.
- Now Semantic Web technology could be used both to improve existing XML based solutions and to build a bridge to knowledge management and automated reasoning.