
Grady Booch offered some interesting perspectives on the future of databases in the context of SOA. Data is a critical problem in SOA. We often talk about services that exchange messages containing “data” as business object representations (REASC: a pattern for constructing Composite Applications), but we rarely talk about data management in a Service Oriented Architecture.

The first major problem is that all business objects are related to each other (from ERP to CRM to SCM…), and that is not going to change. Relational data models are well aligned with the way humans think about, represent, search, access and manipulate data, i.e. they use keys. For instance, some acrostic forms of writing can be thought of as the first indexing mechanism, with letters of the alphabet used as keys (dating several thousand years B.C.), long before the Arabic numeral system made it trivial (albeit less poetic) to orient oneself in an arbitrarily large document.

Why is it a problem? First, services are supposed to be “autonomous” entities, so if you create a “Purchase Order” service and a “Customer” service, they are not supposed to exchange messages behind the service interface, nor depend on the same database. Second, if your services are truly autonomous and you want to retrieve the orders of a customer via some PO service operation, how can you be sure that the PO was recorded with the correct customer “key”? How (and where) do you create “views” that join purchase order and customer data? How do you deal with concurrency issues when accepting a purchase order if the validation rules require attributes from the customer object?
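
To make the join question concrete, here is a minimal Python sketch of a “view” assembled in the composite application rather than in a shared database. Everything in it is an assumption made for illustration: the `CustomerService` and `PurchaseOrderService` classes, their operations, and the in-memory data merely stand in for real autonomous services.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    customer_id: str
    name: str


@dataclass(frozen=True)
class PurchaseOrder:
    order_id: str
    customer_id: str  # a key that no shared database enforces
    total: float


class CustomerService:
    """Hypothetical autonomous service that owns customer data."""

    def __init__(self, customers):
        self._by_id = {c.customer_id: c for c in customers}

    def get_customer(self, customer_id):
        # Returns None when the key is unknown to this service.
        return self._by_id.get(customer_id)


class PurchaseOrderService:
    """Hypothetical autonomous service that owns purchase orders."""

    def __init__(self, orders):
        self._orders = list(orders)

    def list_orders(self):
        return list(self._orders)


def orders_with_customers(po_service, customer_service):
    """Join orders to customers at the composite-application level.

    Returns (joined, dangling): joined pairs each order with its customer,
    dangling lists the orders whose customer key could not be resolved.
    """
    joined, dangling = [], []
    for order in po_service.list_orders():
        customer = customer_service.get_customer(order.customer_id)
        if customer is None:
            dangling.append(order)
        else:
            joined.append((order, customer))
    return joined, dangling
```

Because nothing enforces the customer key across the two services, the sketch reports unresolved keys instead of hiding them. It also makes one call per order, which hints at why such joins are costly when they happen above the database.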

Business logic is also a major problem. I recommend that you read Maarten Mullender’s paper “CRUD, Only when you can afford it” (CRUD = Create, Read, Update, Delete) to further understand these issues. I am just reproducing a small excerpt here: “…, most order processing is not CRUD, or at least not according to my definition. For example, an order can be created offline and then sent (replicated if you will) to a service for processing. Processing of that order will affect many of the related entities. The service may update the customer information, potentially changing more than just the year-to-date totals. For instance, the customer might have reached the critical order mass and be upgraded, updating properties used for price and discount calculations; products may or may not be available; delivery dates may or may not have been realistic; and so forth. These changes are important to both parties, but with CRUD, the customer’s copy of the order would not reflect them.” So it is unlikely that you will be able to cast service interfaces right at the edge of the database. To illustrate this point, you might want to read my previous post on “The fundamental problem solved by ESA”.

Privacy is another issue. When data is captured and consumed within the boundaries of an application, we don’t have to worry too much about someone “forwarding” it to another consumer and breaking privacy rules that were agreed ahead of time. In an automated world, it might not be so clear to a particular service that the data it got hold of cannot be sent to another service provider.

Database connections are yet another issue. A service consumer cannot acquire a database “connection”, which means that for every message received from a consumer, the provider has to authenticate, authorize, and then either resume the use of an existing connection or open a new one altogether.
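
As a rough illustration of that per-message work, the sketch below authenticates and authorizes every incoming message and then borrows a connection from a small pool instead of opening a new one each time. It uses Python's standard `sqlite3` and `queue` modules. The `ConnectionPool` and `OrderProvider` classes, the message fields, and the token-to-permissions table are all hypothetical, and a real provider would use a proper security mechanism rather than a dictionary lookup.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Tiny fixed-size pool so each message can reuse an open connection."""

    def __init__(self, database, size=2):
        self._idle = queue.Queue()
        for _ in range(size):
            # check_same_thread=False lets a pooled connection move between threads.
            self._idle.put(sqlite3.connect(database, check_same_thread=False))

    @contextmanager
    def connection(self):
        conn = self._idle.get()  # blocks while every connection is in use
        try:
            yield conn
        finally:
            self._idle.put(conn)


def create_schema(pool):
    with pool.connection() as conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS orders (id TEXT PRIMARY KEY, customer TEXT)"
        )
        conn.commit()


class OrderProvider:
    """Hypothetical service provider: the consumer never sees a connection."""

    def __init__(self, pool, grants):
        self._pool = pool
        self._grants = grants  # assumed mapping: token -> set of allowed operations

    def handle(self, message):
        # Authenticate on every message: there is no session to fall back on.
        allowed = self._grants.get(message.get("token"))
        if allowed is None:
            return {"status": "unauthenticated"}
        # Authorize the requested operation for this caller.
        operation = message.get("operation")
        if operation not in allowed:
            return {"status": "forbidden"}
        # Resume an existing connection from the pool rather than opening one.
        with self._pool.connection() as conn:
            if operation == "create_order":
                with conn:  # commits on success, rolls back on error
                    conn.execute(
                        "INSERT INTO orders (id, customer) VALUES (?, ?)",
                        (message["order_id"], message["customer"]),
                    )
                return {"status": "ok"}
            rows = conn.execute(
                "SELECT id FROM orders WHERE customer = ? ORDER BY id",
                (message["customer"],),
            ).fetchall()
            return {"status": "ok", "orders": [row[0] for row in rows]}
```

The point of the design is that the connection's lifetime belongs to the provider, so identity has to travel with each message instead of being attached to a connection.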

But let’s go back to Grady’s recommendations:

         

  • Database management systems will have to support ACID SOA calls.

    • I am not sure what an “SOA call” onto a DBMS is. I assume Grady means that operation invocations will have to exhibit ACID properties. But are these invocations directed at the same database, or at a federation of databases? Would databases become service providers (and consumers)? In general, of the ACID properties, Isolation is the hardest to achieve in SOA because resources would often need to be locked for long periods of time, making them inaccessible to other service consumers. We might well see database vendors support the WS-Transaction specification natively in the future.

  • Applications will exploit multiple data repositories.

    • Yes, these are called composite applications.

  • Careful attention to authentication and security will be needed.

    • Yes. Based on my comment above, this is a relatively complex problem to deal with where database connections are concerned.

  • Distributed two-phase commit will be avoided by recoverable messaging to applications (via services) that consult and modify the database and send a recoverable reply.

    • This statement seems to be in contradiction with the first statement. 2PC cannot be used in “long-running” scenarios because it relies on resource locking.

  • Database size will become a non-issue.

    • Based on the cost of hardware, this would be true: I have over a terabyte of storage at home for my PVR, which today costs about $400. We have to be a little more careful, though, because the size of the database has to be related to the performance of accessing and fetching records.

  • We’ll see lots of low-latency asynchronous replication of reference data among databases serving various applications and their associated service interfaces.

    • Yes, I think this is the most important and most valid point that Grady is making, as well as a major difference between composite and monolithic applications. If we go back to our Purchase Order – Customer example, it is likely that customer data such as address, phone number and contact information will be stored in the Purchase Order service to avoid costly joins. At that point, replication mechanisms will be needed.

Problems related to data in a Service Oriented Architecture are large and complex: they span the conceptual, logical and physical levels of an SOA. We have just begun to scratch the surface of the relationship between business objects, data, databases and SOA. Issues such as identity, data federation, replication, transactions, privacy and explicit state management will all have to be addressed.

The Service Data Objects standard will prove to be a foundation, not just as a data programming model enabling federated data result sets but, in later versions, as a central vehicle enabling a secure, private, atomic and consistent flow of data in service oriented architectures.
