Alright, guess it’s time to really kick it up a notch and start making bigger changes. Where to start? Well, let’s fix the domain model first, as it is the basis for everything else and the original model was – politely speaking – rudimentary at best. I did not spend too much time designing the domain model, so we may want to further improve it along the way. That’s a very common practice, though, and every good developer should understand that continuous refactoring is both a necessity and a best practice. The key take-away here is that your initial model needs to be both solid and flexible to ease later refactoring!

So, without further ado, let’s have a look at Granny’s Next Topmodel…

Domain model objects

(Figure: the Enterprise Granny domain model)

The new model is slightly more complex and comprises the objects discussed in the sections below.

As in the previous chapters, I took the opportunity to deprecate the old Address object and move the domain model into a new package called com.sap.hana.cloud.samples.granny.model.

BaseObject

It’s always good practice to define a superclass for your entities right away, and you should make that a routine. Even if you do not have a concrete use-case for it yet, do it nevertheless and flag it “reserved for future use”. Trust me, just do it and thank me later (during the next refactoring phase!) One thing I want to state upfront is that some of the concepts illustrated in the BaseObject may look a bit over-engineered given the simple address book application we are developing. However, many of the topics I’ll address here are very important when developing more complex applications, so I really wanted to highlight them and demonstrate best practices.

First thing to notice is that we added the @MappedSuperclass annotation to the BaseObject, indicating that all of its mapping information is applied to the entities that inherit from it. Please also note that a mapped superclass has no separate table defined for it; instead, its properties are stored in corresponding columns of the inheriting entity’s table. This is actually quite handy, because had we chosen a different inheritance mapping strategy it would require a join, which would negatively impact performance. (Note: For a good overview of the pros and cons of the different inheritance mapping strategies consult this documentation.)

The first attribute to discuss is the ID property, which is the primary key of every entity and hence annotated with @Id. Here I opted against using a numeric value (as done originally) and instead switched to a UUID approach. There are good reasons for both approaches and I remember discussing that very topic numerous times throughout my career. At the end of the day there’s no strict rule, so it may be worth discussing it with your development team prior to making a choice for either one! The main reason I decided to use UUIDs was basically to show you options. The original solution used a numeric primary key automatically issued by the DB via a dedicated sequence table. In the new model we use UUIDs that we handle ourselves, generated via the UUID.randomUUID() functionality.

Next to discuss are the four attributes added for auditing information: createdAt, lastModifiedAt, createdBy and lastModifiedBy. It’s always a good idea to introduce them, and in many projects such auditing information is explicitly required to adhere to legal requirements.

The last attribute is the version property needed for optimistic locking. Especially in scenarios targeting a huge user base (not atypical for cloud apps) you need to establish a mechanism that handles concurrent modification of a single object. By adding this attribute and annotating it with @Version you can easily achieve that and leave all the heavy lifting to JPA.

(Note: Please also note the explicit @Column annotations on each attribute, stating the column name to be used and specifying the concrete length of character-based attributes.)

Another aspect of the BaseObject worth discussing are the two life-cycle event callback operations: updateAuditInformation() and generateAuditInformation(). The first one is annotated with @PreUpdate and is called whenever an entity (inheriting from BaseObject!) is updated. We use this to update the lastModifiedAt and lastModifiedBy attributes of the audit information. The generateAuditInformation() method is annotated with @PrePersist and is called just prior to creating an object. Here, we generate the ID of the entity and set the initial audit information.

(Note: At this point in time we do not set the user name in the audit information, because we have not implemented any authentication or authorization concept yet. We’ll take care of this a bit later and hence have marked the corresponding passage with a TODO flag.)
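To tie all of the above together, here’s a condensed sketch of what such a base class could look like (an illustration only, not a verbatim copy of the Granny code; the column names and lengths are assumptions):

import java.util.Date;
import java.util.UUID;

import javax.persistence.Column;
import javax.persistence.Id;
import javax.persistence.MappedSuperclass;
import javax.persistence.PrePersist;
import javax.persistence.PreUpdate;
import javax.persistence.Temporal;
import javax.persistence.TemporalType;
import javax.persistence.Version;

@MappedSuperclass
public abstract class BaseObject {
    @Id
    @Column(name = "ID", length = 36)
    private String id; // UUID stored as a 36-character string

    @Version
    @Column(name = "VERSION")
    private long version; // managed by JPA for optimistic locking

    @Temporal(TemporalType.TIMESTAMP)
    @Column(name = "CREATED_AT")
    private Date createdAt;

    @Temporal(TemporalType.TIMESTAMP)
    @Column(name = "LAST_MODIFIED_AT")
    private Date lastModifiedAt;

    @Column(name = "CREATED_BY", length = 50)
    private String createdBy;

    @Column(name = "LAST_MODIFIED_BY", length = 50)
    private String lastModifiedBy;

    @PrePersist
    void generateAuditInformation() {
        this.id = UUID.randomUUID().toString(); // we issue the primary key ourselves
        this.createdAt = new Date();
        this.lastModifiedAt = this.createdAt;
        // TODO: set createdBy/lastModifiedBy once authentication is in place
    }

    @PreUpdate
    void updateAuditInformation() {
        this.lastModifiedAt = new Date();
        // TODO: set lastModifiedBy once authentication is in place
    }

    // getters/setters omitted for brevity
}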

Contact

The Contact object is the main entity of our model and nothing too fancy. You see the attributes you’d probably expect, like firstName, lastName etc. It also declares relations to other objects such as Address, EmailAddress and PhoneNumber, which have all been declared as @OneToMany associations mapped via a @JoinColumn annotation. This tells JPA to add a column to the corresponding table of each referenced entity, using the primary key of the Contact object as the foreign key. I have declared eager fetching and full cascading, as all relations are only valid in the context of the entity that references them, and if we deal with a Contact object we want all the related data to be read right away.
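In code, this mapping could look roughly as follows (a sketch; attribute and column names are assumptions):

import java.util.List;

import javax.persistence.CascadeType;
import javax.persistence.Entity;
import javax.persistence.FetchType;
import javax.persistence.JoinColumn;
import javax.persistence.OneToMany;

@Entity
public class Contact extends BaseObject {
    private String firstName;
    private String lastName;

    // uni-directional one-to-many: JPA adds a CONTACT_ID column to the
    // ADDRESS table holding the Contact's primary key as a foreign key
    @OneToMany(fetch = FetchType.EAGER, cascade = CascadeType.ALL)
    @JoinColumn(name = "CONTACT_ID")
    private List<Address> addresses;

    @OneToMany(fetch = FetchType.EAGER, cascade = CascadeType.ALL)
    @JoinColumn(name = "CONTACT_ID")
    private List<EmailAddress> emailAddresses;

    @OneToMany(fetch = FetchType.EAGER, cascade = CascadeType.ALL)
    @JoinColumn(name = "CONTACT_ID")
    private List<PhoneNumber> phoneNumbers;

    // getters/setters omitted for brevity
}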

So, most of the above is really just JPA basics. However, there’s one topic I’d like to elaborate on: the usage of enumerations. As you can see, I have defined various enumerations like Salutation, Title etc. In general, you can map enumerations to their respective DB values in two ways: either by their ordinal code or by their name. I used the second option and declared that behavior via the @Enumerated(EnumType.STRING) statement (see the sketch after the list). Storing enum values as codes is more efficient from a database perspective and requires less DB space, yet this approach comes with two major drawbacks in my opinion:

  1. it’s harder to debug as codes are not self-explanatory when looking at them and
  2. the code value depends on the order within the enumeration. If someone were to introduce a new value to any of these enumerations and did not add it at the end, all your existing data would be corrupted! So, better safe than sorry; given the compression capabilities of modern DBs I prefer data integrity over DB space efficiency. (One last remark: remember the length constraint specified for the column an enum is mapped to and make sure that newly introduced values adhere to that length restriction!)
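As a short illustration (the column details below are assumptions, not the actual Granny code):

import javax.persistence.Column;
import javax.persistence.EnumType;
import javax.persistence.Enumerated;

// inside the Contact entity:

// EnumType.STRING stores the constant's name (e.g. "MS"), which survives
// re-ordering of the enum; EnumType.ORDINAL would store 0, 1, 2, ...
@Enumerated(EnumType.STRING)
@Column(name = "SALUTATION", length = 20) // newly added values must fit this length!
private Salutation salutation;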

Actually, enumerations are a great example highlighting one very positive aspect of cloud solutions. In the past, I’d probably have been required to construct a more complex pattern to provide means to extend the enumeration over time. Yet, developing for the cloud, we are always in full control of the app and we can easily update it as needed. If the need for a new enum value should arise we can simply add it to the enumeration (plus add the corresponding I18N keys) and redeploy the solution. Problem solved! As such, cloud applications help reduce complexity in software and keep things simple, because they are developed, maintained and operated by a single organisation.

Address, EmailAddress and PhoneNumber

Nothing really worth mentioning about these entities; it’s all just plain-vanilla JPA. So, we skip to the next topic.

Configuration Updates

Of course, we added all entities (including the BaseObject) to the persistence.xml configuration file. We also updated the Spring configuration, modifying the package declarations for both the <context:component-scan> and the <jpa:repositories> elements. Speaking of which… if you look at the modifications you’ll realize I introduced yet another package called com.sap.hana.cloud.samples.granny.dao. Let’s discuss that next.
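The relevant entries could look roughly like this (a sketch; the exact namespace prefixes and file layout may differ in the actual project):

<!-- scan the new packages for Spring-managed beans -->
<context:component-scan base-package="com.sap.hana.cloud.samples.granny" />

<!-- let Spring Data JPA generate implementations for the repository interfaces in the DAO package -->
<jpa:repositories base-package="com.sap.hana.cloud.samples.granny.dao" />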

Data Access Objects (DAOs)

Call me old-fashioned (or worse!), but in my experience it still pays off to adhere to classical patterns, and one of them is to decouple the business logic from data management. For this purpose it used to be a best practice to introduce a dedicated DAO layer. The motivation is simple: decouple the concerns of data retrieval/manipulation from the business logic. In software development one of the most valuable goods is inherent flexibility, and the famous quote (mis-)attributed to Darwin about survival of the fittest should be your guiding motto when developing enterprise applications:

“It is not the strongest of the species that survives, nor the most intelligent, but rather the one most adaptable to change.”

Replace ‘species’ with ‘software’ and link it to TCO considerations and you get the idea. As such, yes… I’d always introduce an additional (logical) layer and accept the additional level of indirection for the sake of flexibility. By decoupling the JPA specifics from the service implementations we could easily switch the data repository (by developing a new DAO implementation) without the need to change much else. All in all, it’s the idea of separation of concerns being applied.
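With Spring Data JPA such a DAO boils down to a plain interface for which the framework provides the implementation at runtime. A minimal sketch (the findByLastName method is a hypothetical example; the ID type is the UUID string inherited from BaseObject):

import java.util.List;

import org.springframework.data.jpa.repository.JpaRepository;

public interface ContactRepository extends JpaRepository<Contact, String> {
    // derived query: Spring Data JPA generates the implementation
    // from the method name at runtime
    List<Contact> findByLastName(String lastName);
}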

Loose ends

With that I conclude, yet let me state that we are not done with either the data model or the persistence aspects of the application. As a matter of fact, I have omitted some very important basic aspects in the current implementation of the domain model.

Just for fun, to gauge interest and to see if anyone is actually following this series, I’d be happy to read some comments about what I may have left out so far with regard to the data model from a technical perspective. (No, I’m not talking about unit tests just yet; we’ll cover those very soon!) So, if you’ve got an idea I’d be happy to hear about it!

Have a great weekend and fun coding! TTYL…

Note: If you should be new to JPA development or are looking for a great resource to look things up, I can strongly recommend the following website: http://en.wikibooks.org/wiki/Java_Persistence

17 Comments


  1. Chris Paine

    I’m getting my fix of Enterprise Granny in spades tonight! 🙂

    Having used both DB generated integers and UUIDs for DB keys, I definitely prefer the latter. Mainly due to the hassle of returning the generated db id after it has been created – if you specify the “key” yourself it’s a hell of a lot easier in the code – as you go into the DB handling already knowing the ID of the object. And I think it means that you have a little bit of safety if you start doing funky things like sharding your db access.

    It’s probable that, using your persistence framework (JPA), that bit of work is already done for you. But having coded the DAOs directly, it was definitely extra work to then update the object with the generated id.

    At the moment I’m struggling to add a couple more tables and relationships into my non-JPA-based HCP application… very tempted to rewrite/refactor to use JPA

    As for your comment: the obvious omission is data history (unless you will automatically be updating version on every change…?). If someone goes in, hacks your authentication and changes all Granny’s addresses to be the local pizza store, it would be nice to have some way to roll back the changes to an earlier version, plus provide an audit trail.

    It’s nice to be able to see who updated the latest value, but it’s also nice to see what the value was before that, and who updated it. Auditors in the enterprise space are a picky bunch, and if you can give them a complete history of every field in the DB, who changed it and when (preferably printed out on enough paper to sequester about a tonne of CO2), then they leave you alone.

    Keep ’em coming,

    Cheers,

    Chris

    1. Matthias Steiner Post author

      Thanks Chris for sharing your experiences with UUIDs for DB keys. I have had only positive experiences with that approach as well, yet most open APIs seem to prefer numeric values. As I said, I guess there’s no right/wrong, only options and choices.

      Regarding the rewrite using JPA… or should I call it ‘refactoring’? Well, it should be fairly easy to do, and the longer you wait the harder it will get. You can always just fork/branch off the original code base and give it a try. Those are usually the projects that really reveal flaws in the original application (if any) and ultimately make you a better architect.

      Now off to our little quiz: agreed, data history/audit trails would be a great addition to the application. Actually, we could turn this into some sort of exercise to see how to best implement it given all the techniques we have covered so far.

      However, it is actually a lot more advanced than the thing I had in mind. I’m really talking about the very basics of plain old Java objects and a key aspect of ORM 😉

      One last remark regarding your hacking scenario… well, fortunately the platform periodically creates a backup to prevent massive data loss as you described above.

      Cheers,

      Matthias

      1. Chris Paine

        I’ve no idea if I’m getting any closer, or even if this is possible using JPA, but I like the constructors and setters/getters of my data models to insist on the relationships that they model.

        Whilst you haven’t specified that a contact must have an address, or even that a given address must exist in a contact (although why you’d keep one that wasn’t part of a contact I’m not sure), it seems that the id (key field inherited from BaseObject) is mandatory. I’d therefore create a constructor for BaseObject such that the id is required to instantiate it, or is generated at instantiation if not supplied (not just generated on persistence), and remove the public setter method.

        But again, not sure if that is good practice or compatible with JPA.

        How does one get the reference to the Contact that a given EmailAddress object belongs to?

        (might be worth updating the notation on the EmailAddress object too – STREET not a very descriptive name, and many emails longer than 50 chars 😉 )

        1. Matthias Steiner Post author

          You raise some very good topics again Chris!

          I haven’t been too restrictive with regard to the data model and none of the relations is mandatory at all. I think that’s an adequate representation of reality: sometimes you only store an email address, sometimes only a phone number etc…

          Regarding the getter/setter methods, and especially for the ID, this is a very valid question. The way it is designed right now, the BaseObject takes care of issuing a valid UUID at the time of persisting the data for the first time (via the corresponding life-cycle event callback). This ensures that we’ll always have a valid ID. And yes, it would make sense not to make the setter of the ID property public, as we do not want/expect anyone to mess with this property at all! But (and of course, there’s a but in this) we’ll also use the domain model as our DTO (Data Transfer Object) to save us the hassle of creating a designated DTO model and having to copy back/forth all the time (MOVE-CORRESPONDING anyone?). As such we need to serialize/deserialize into various formats (JSON/XML), and for those frameworks the simplest approach is to have public getter/setter methods and a public no-argument constructor. So, it’s a trade-off for sure…

          My rationale is that I’d rather keep things simple and given the use-case of a cloud application I’d argue that the internal API is only used by developers who should know not to manually overwrite or re-create IDs via the public setter…

          How does one get the reference to the Contact that a given EmailAddress object belongs to?

          I modelled it as a uni-directional relation on purpose. The way it is designed right now, all the relations are always fetched with the overarching Contact object and hence we should never be in the situation of having an EmailAddress object disjunct from a Contact.

          (might be worth updating the notation on the EmailAddress object too – STREET not a very descriptive name, and many emails longer than 50 chars 😉 )                  

          Haha, touché! Thanks for spotting it… classical copy & paste mistake. Fixed.

          1. Chris Paine

            I’m not sure about the unidirectional interface. I would certainly foresee a future where Granny would like to see all her contacts based in, say, Australia, for example. Whilst the db must store some key to relate the two, which could be used in a future query, it seems strange not to reciprocate that linkage in the data model. Or perhaps you will have some surprises for us when you come to implement the fuzzy searches with HANA.

            I’ve often handcrafted a getAsJSON method for my data objects. In retrospect it’s unmaintainable. But it does give the ability to have the linkages inherent in the data model easily accessible from the JSON representation. However, I’m guessing since this is an Enterprise granny, she’ll want to represent her data objects as OData, so probably going to have to deal with a few compromises!

            Thanks for the detailed reply above 🙂

            1. Matthias Steiner Post author

              Hi Chris,

              really happy to see you digesting all of the things and asking great questions. Thanks a lot!

              I’m not sure about the unidirectional interface. I would certainly foresee a future where Granny would like to see all her contacts based in, say, Australia, for example. Whilst the db must store some key to relate the two, which could be used in a future query, it seems strange not to reciprocate that linkage in the data model.

              I think we need to distinguish between the data model and the persistence model here. Sure, within the database, we do store the primary key of the Contact as a foreign key field in the Address table. As such, we can definitely search for all our friends down under using a simple JPQL query as follows:

              @Query("SELECT c FROM Contact c JOIN c.addresses a WHERE a.country = :country")
              public List<Contact> queryByAddressesCountry(@Param("country") String country);

              Fortunately, we don’t have to worry about all of this thanks to Spring Data JPA! Based on this module we can simply add a method definition in the ContactRepository interface as follows:

              public List<Contact> findByAddressesCountry(String country);

              Spring Data JPA does all the rest for us auto-magically. This is described in detail right here:

              http://static.springsource.org/spring-data/data-jpa/docs/current/reference/html/repositories.html#repositories.query-methods

              I just added the necessary code to the repo as well as a JUnit test to verify this. Check this recent commit.

              (We have barely scratched the surface here and in the next episode I’ll talk a bit more about ‘understanding the basics’ – especially when it comes to ORM.)

              I’ve often handcrafted a getAsJSON method for my data objects. In retrospect it’s unmaintainable. But it does give the ability to have the linkages inherent in the data model easily accessible from the JSON representation.

              :/ Hm, there may be other ways to do that. I’ll get to talk about JSON de-/serialisation in more detail once we talk about the (RESTful) API. You may want to look into Jackson, which is a great library to ease your life when it comes to that.

              However, I’m guessing since this is an Enterprise granny, she’ll want to represent her data objects as OData, so probably going to have to deal with a few compromises!

              Well, Granny surely wants to look at the bigger picture and fully understands that there are many standards with varying adoption… as such, it’s safe to assume that we’ll introduce a RESTful API, as is common practice outside of the enterprise world. If there’s demand, we can surely look into exposing OData-based endpoints as well (which is trivial given the right library).

              Stay tuned! 😉

              Cheers,

              Matthias

      2. Chris Paine

        A point for others to think on/remember if they are tempted to convert their data models from using an integer to a GUID – just remember whilst == can compare int, it’s not so useful for String…. (doh!) 😳 refactoring is soooo much fun.

    2. Chris Paine

      Not sure where language handling will come into it. Currently the enums don’t have any texts associated with them, so I’m not sure how a user will know that Salutation.MS will ever render to Ms. or perhaps to something else in a different language. Mlle?

      Suggest that unless Granny is very technical she’s unlikely to know the 3-character ISO country code for all countries in the world, so some sort of reference/lookup table there might be handy too… 🙂

      1. Matthias Steiner Post author

        Valid points indeed!

        Regarding the I18N of enums, that’s something we’ll address later on when we talk about the UX. As you can see, we have already prepared for it…

        Regarding the country mapping to ISO codes: see above. We’ll focus on the internals first, then move to the API level, and then we’ll touch on the UI/UX topic, where we’ll handle this. 😉

        Keep ’em coming 😉

  2. Harald Mueller

    Very nice. I’ll have to look into the code and try this out…

    First minor cosmetic suggestion: please implement your blog series as a doubly linked list. Then I can easily iterate to the previous or the next blog. 😉

    Did you think about transactions (JTA) and how to handle mass or bulk updates? This is usually one of the tricky problems which can easily cause DB deadlocks under heavy load. I like the fact that you use optimistic locking. From my perspective it scales better and is much more predictable and portable than pessimistic locking. And the version column is much faster and more efficient than comparing values before and after the change.

    Harald

    1. Matthias Steiner Post author

      Hi Harald,

      First minor cosmetic suggestion: please implement your blog series as a doubly linked list. Then I can easily iterate to the previous or the next blog. 😉

      Point taken. I’ll look into it. For now, the most simple way is to use the “Incoming links” section at the bottom right of the right-hand side panel. I’ve created a master document that contains all the links to the blog posts here: Enterprise Granny

      Did you think about transactions (JTA) and how to handle mass or bulk updates? This is usually one of the tricky problems which can easily cause DB deadlocks under heavy load. I like the fact that you use optimistic locking. From my perspective it scales better and is much more predictable and portable than pessimistic locking.

      By default, all the operations provided by Spring Data JPA are annotated with @Transactional and a propagation setting of REQUIRED; hence they will either create a new transaction or join an existing one (see here). This sounds like a reasonable default, yet it can also be overridden explicitly as explained here:

      http://static.springsource.org/spring-data/data-jpa/docs/current/reference/html/jpa.repositories.html#transactions

      There are some bulk operations created by default as well.

      Cheers,

      Matthias

  3. Chris Paine

    Was wondering, why the choice to make the JPA fetch type for the relationships EAGER rather than LAZY?

    I would have thought there’d be benefit in only retrieving data as needed? Or am I taking a too pre-HANA world view into my muddlings?

    Cheers,

    Chris

    1. Matthias Steiner Post author

      Hi Chris,

      that decision was made independently of SAP HANA. My reason was just that in this particular scenario I rather see all the contact information as one logical record from a user perspective. As such, I figured it would make little sense to load some contact information without fetching the rest of the data as well.

      This decision heavily depends on the nature of the association and the usage scenario of the data.

      Cheers,

      Matthias

  4. Chris Paine

    Hello again!

    I know it’s probably off topic, but any chance you could handle/discuss how you think you’d do authorisation checking? I’m trying to think how this might be relevant for Granny, but one thing that very often occurs in enterprise scenarios is having one set of users who have access to a limited subset of functionality and data, whilst another set have much greater access.

    I’m thinking of implementing this checking logic in my service classes, but perhaps it would be better to implement in the DAOs – thus removing any possibility of inadvertent access (that said, nothing should access the DAOs except the service classes so it should be equivalent).

    Spring Security looks like it might be a winner for this using the @PostFilter annotation.

    Was wondering if you had a view on if this was a good approach and whether this might be something that will be covered with Granny?

    Cheers,

    Chris

    1. Matthias Steiner Post author

      Hi Chris,

      for Granny, what I have in mind is some sort of multi-tenancy where each user has his/her own address book data. Plus, I’m thinking about using OAuth for authentication. Authorization is not really a concern in that use-case, as once everybody has his/her own dataset they can do with it whatever they want…

      However, in previous projects I’ve used Spring Security (aka Acegi) and it’s really great and powerful as well! Especially if you need instance-based authorization, because role-based permissions fall short there. A typical example are purchase orders… everyone is allowed to read, create and update POs, but only his own 🙂

      Give Spring Security a try!!!

      Cheers,

      Matthias

  5. Chris Paine

    Hiya, could I suggest that you update your persistence.xml to include the property:

    <property name="eclipselink.cache.shared.default" value="false"/>

    I just spent an evening bashing my head against a keyboard because although I could see the data going into the DB, it wasn’t coming back out – as the cache was being read. It seems in EclipseLink the cache is on by default, which isn’t very desirable when you’re reading and writing the data through AJAX-accessed services.

    Hope this helps someone else.

    Cheers,
    Chris

