By Matthias Steiner

Dynamic Duo – CAF and JPA interplay

Introduction

The readers of my blog (or my book) already know my stance on model-driven development tools. While I appreciate the convenience of modeling my business objects graphically and the productivity gain this brings, I believe that such tools do not free you from the need to familiarize yourself with the underlying technology. Otherwise you depend on technology you have not mastered…

SAP’s model-driven Composite Application Framework (CAF) is based on standard Java EE 5 features and consequently uses the Java Persistence API (JPA) for object-relational mapping (ORM) and data storage. The two technologies go hand in hand, so you can use whichever is best suited to your particular use case, or even use both seamlessly within one project. Sounds interesting? Then let’s get it on…

JPA Fundamentals

While a deep dive into JPA is certainly out of scope for this blog, I guess it’s best to quickly summarize the workings of JPA up front. In principle, JPA provides object-relational mapping functionality that allows Java developers to concentrate on Java objects without needing to worry about how to store them in the database. All the low-level details of dealing with JDBC drivers and (Open-) SQL are shielded by JPA. The heart of JPA is the EntityManager, which provides all the life-cycle management operations (CRUD operations: create, read, update, delete). Furthermore, it also manages other aspects such as caching and transaction management.

The objects that are persisted are plain Java beans (POJOs) that have simply been annotated with @javax.persistence.Entity. Several more annotations come into play, such as @javax.persistence.Table, which specifies the name of the database table. However, I’ll not go into further detail here, but instead refer you to the list of references at the end of the blog for further reading on the topic.

CAF’s usage of JPA

The Composite Application Framework provides a design-time modeling environment as a distinct Eclipse perspective within the NetWeaver Developer Studio (NWDS). Here you can graphically model your business objects (BOs) and during generation, the necessary database tables (in the Dictionary DC) and the corresponding Java classes (within the EJB Module DC) are generated automatically. This frees the developer from all the repetitive and tedious programming of CRUD operations etc.

All this code is generated in a separate source code folder called src, while the user-written code resides within the ejbModule folder. In a nutshell, CAF generates a Stateless Session Bean, which acts as a facade and provides the CRUD operations. Furthermore, a class with a BO suffix is generated, which is the @Entity-annotated Java Bean.

CAF generated JPA classes

Note: In earlier releases of CAF the BOs were called Entity Services, which personally I prefer over Business Object as it comes closer to their intention – in fact, CAF BOs classify as Data Access Objects (DAOs).  

Real life examples

In one of our recent custom development projects we were challenged by the requirement to parallelize Enterprise Service (ES) calls to multiple backend systems (the topic of another upcoming blog). To achieve this, we used asynchronous Web Service proxies to dispatch the actual ES invocations and temporarily stored the intermediate results. As soon as a service call returned a response, the UI was notified and the data was displayed. To avoid unnecessary database growth (each query may return up to 5000 elements) we deleted all that data once the user session terminated.

Originally we used the standard CAF-generated CRUD operations for storing, retrieving and deleting this temporary data, yet the results did not meet our performance expectations. After analyzing the hand-crafted code we realized that there was no potential to improve either the code or its execution speed. “So, what to do?” we asked ourselves and started further analysis.

Performance considerations when using CAF 

Now, if you think this leads to CAF bashing or ranting, then you’re wrong. 😉 In fact, there’s nothing wrong with the design of CAF or the JPA operations it generates. The issue is more subtle…

As I keep saying (to everyone willing to listen), model-driven development tools are bound to define a common set of assumptions and standard use cases that they support, as otherwise the tools would become as complex to handle as the underlying technologies. That’s only fair, as it greatly flattens the learning curve and helps developers to get productive. Yet sometimes this default behavior does not match the requirements, as in our case.

CAF is not meant to be a tool for mass database operations – simple as that. The reason for this is quite technical, and in order to understand it a closer look at the workings of JPA and CAF is required.

In principle, the EntityManager is capable of “caching” several database operations without performing them on the database immediately. The corresponding database operations (SQL statements) are only performed once the EntityManager is flushed. Typically, this is done in alignment with the transactional context the operation runs in. (The FlushModeType can be set to your liking via the corresponding methods exposed by the EntityManager.)

So, in the above-mentioned scenario we simply queried for all objects with a specific session ID, then looped over them and called the corresponding delete operation for each one. The results were disappointing! We then decided to implement a plain JPA method within a separate DAO class, which used a simple JPQL query, and the deletion was more or less instantaneous.
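
To illustrate the idea, here is a minimal sketch of such a bulk delete. It is not the project's actual code: the DAO class TempResultDao, the entity name TempResultBO and its sessionId attribute are hypothetical. To keep the snippet compilable without a JPA provider on the classpath, it declares two small stand-in interfaces that mirror the signatures of createQuery, setParameter and executeUpdate from javax.persistence.EntityManager and javax.persistence.Query; in a real DAO you would drop them and use the injected EntityManager directly.

```java
// Stand-ins mirroring the javax.persistence.Query / EntityManager methods used below.
// Assumption: a real DAO would use the javax.persistence types instead.
interface BulkQuery {
    BulkQuery setParameter(String name, Object value);
    int executeUpdate(); // number of rows affected by the statement
}

interface QueryFactory {
    BulkQuery createQuery(String jpql);
}

// Hypothetical DAO; TempResultBO and sessionId are illustrative names only.
class TempResultDao {
    // One JPQL statement replaces the "query all, then delete one by one" loop.
    static final String DELETE_BY_SESSION =
            "DELETE FROM TempResultBO t WHERE t.sessionId = :sessionId";

    private final QueryFactory entityManager;

    TempResultDao(QueryFactory entityManager) {
        this.entityManager = entityManager;
    }

    /** Removes all temporary results of one user session; returns the row count. */
    int deleteBySession(String sessionId) {
        return entityManager.createQuery(DELETE_BY_SESSION)
                .setParameter("sessionId", sessionId)
                .executeUpdate();
    }
}
```

With the real EntityManager the call chain is identical. Keep in mind that a JPQL bulk delete operates directly on the database and bypasses the persistence context, so entities that are already loaded are not updated by it.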

Yet, our performance issue was still not completely solved as storing the temporary data received from the backend to the database took equally long. The reason is that the CAF-generated JPA coding flushed after each single record.

Again, we created another JPA operation within our new DAO and implemented functionality that gave us more control over when to flush by making the threshold configurable. By flushing only once, and thereby storing all records (~5000) in one go, we gained performance by a factor of 80. Obviously, keeping such a large number of pending operations in the EntityManager also results in higher memory consumption, so in the end we fine-tuned the behavior to flush after ~200 objects, which was the best compromise between memory consumption and performance (and it was still damn quick!).
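
The threshold-based flushing described above can be sketched as follows. This is only an illustration under stated assumptions: BatchWriter is a hypothetical name, and PersistenceOps is a stand-in that mirrors the persist, flush and clear methods of javax.persistence.EntityManager so that the snippet runs without a JPA provider. Calling clear after each flush is an addition of this sketch rather than something described above; it detaches the entities that have already been written, so that they no longer occupy memory in the persistence context.

```java
// Stand-in for the three javax.persistence.EntityManager methods used below.
// Assumption: a real DAO would pass the injected EntityManager instead.
interface PersistenceOps {
    void persist(Object entity);
    void flush();
    void clear();
}

// Hypothetical helper that flushes after a configurable number of records.
class BatchWriter {
    private final PersistenceOps entityManager;
    private final int flushThreshold;

    BatchWriter(PersistenceOps entityManager, int flushThreshold) {
        if (flushThreshold < 1) {
            throw new IllegalArgumentException("flushThreshold must be >= 1");
        }
        this.entityManager = entityManager;
        this.flushThreshold = flushThreshold;
    }

    /** Persists all entities and returns the number of flushes that were needed. */
    int persistAll(Iterable<?> entities) {
        int flushes = 0;
        int pending = 0;
        for (Object entity : entities) {
            entityManager.persist(entity);
            pending++;
            if (pending == flushThreshold) {
                flushAndClear();
                flushes++;
                pending = 0;
            }
        }
        if (pending > 0) { // write the last, incomplete batch
            flushAndClear();
            flushes++;
        }
        return flushes;
    }

    private void flushAndClear() {
        entityManager.flush(); // send the pending statements to the database
        entityManager.clear(); // assumption: detaches ALL managed entities to free memory
    }
}
```

With 5000 records and a threshold of 200, this results in 25 flushes rather than one flush per record.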

The nitty-gritty details

Now, if you’re curious about how to mix CAF-generated JPA operations with manual ones – read on. It’s pretty simple in fact, and you only have to keep a few simple rules in mind to get it going.

  1. The DAO class needs to reside in the same ejbmodule DC as the CAF BO (which is the best-suited place for it anyway!)
  2. The @javax.persistence.PersistenceContext annotation needs to reference the unitName defined within the persistence.xml in the META-INF folder, as shown below:
    @javax.persistence.PersistenceContext(unitName="demo.sap.com.ext_tech2.composite")
    protected EntityManager entityManager;

  3. The entity name, which is required in JPQL queries, is defined in the BO-suffixed class.

Summary

Let me conclude by providing some background information on when it may make sense to implement JPA data access operations manually.

  1. Hierarchical structures
    CAF only supports flat structures, and hence you’re forced to use plain JPA coding if you need/want to read/store a structure of business objects in one go. Please note that you then also need to take care of loading strategies (eager/lazy loading) and of attaching/detaching JPA entities to/from the EntityManager yourself.

  2. Mass updates
    As this article illustrates, the inherent flushing mechanism of CAF may conflict with your use case, and consequently you may want to use plain JPA or even direct SQL statements instead.

  3. Joins and/or Data Transfer Objects
    A common use case is a UI that provides a so-called Object Worklist (OWL), which allows searching for instances of business objects based on entered criteria. Typically, an overview list is provided that shows important attributes of the BOs. Such overview screens do not require all data of a BO, or they even display attributes from several BOs on one screen. Usually, Data Transfer Objects (DTOs) are created for this purpose.
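
For such overview lists, JPQL offers constructor expressions (SELECT NEW …), which let a query return lightweight DTOs directly instead of fully loaded entities. A minimal sketch, assuming a hypothetical DTO CustomerOverview and a hypothetical entity CustomerBO with the attributes customerId, name and city:

```java
// Hypothetical read-only DTO for an overview list; it is not a JPA entity.
// Assumption: in a real project this would be a public class in a named package.
final class CustomerOverview {
    private final String customerId;
    private final String name;

    // The constructor arguments must match the SELECT NEW argument list below.
    public CustomerOverview(String customerId, String name) {
        this.customerId = customerId;
        this.name = name;
    }

    String getCustomerId() { return customerId; }
    String getName() { return name; }
}

// Hypothetical holder for the overview query; CustomerBO is an illustrative entity name.
final class OverviewQueries {
    private OverviewQueries() { }

    /** Builds a JPQL constructor expression; it needs the DTO's fully qualified class name. */
    static String customerOverviewByCity() {
        return "SELECT NEW " + CustomerOverview.class.getName()
                + "(c.customerId, c.name) FROM CustomerBO c WHERE c.city = :city";
    }
}
```

The resulting string would be passed to the EntityManager's createQuery method, and getResultList would then return CustomerOverview instances that contain only the two selected attributes.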

Further Reading (to be updated)


      19 Comments
      Former Member
      Hi Matthias
      It seems that for real life applications, you are forced to use JPA anyway. So as a developer you need to learn two frameworks for the persistence layer.
      I also think, that the advantages of CAF (automatic table generation, modelling entities etc...) disappear with the rise of Eclipse WTP. WTP already provides forward/backward mapper (entity<->table) as well as graphical modelling of entities.

      The blog itself however is written very well, and I would love to see more real life examples of using our technology.

      Timo Renner
      Hi Matthias,

      first of all thanks for this blog. You address exactly the topic that many people like to forget when using complex frameworks that do most of the coding for them: the frameworks NEVER cover all aspects perfectly, and there is always tuning potential for experienced developers.

      To answer Raphael's question: I understand that you question the usage of CAF, especially as an experienced Java developer.
      But used in the right way, CAF helps you to minimize the effort for persistence development and concentrate on your business logic. And the right way means: no mass operations, only load the data you need, etc...

      For me it's a great complement that eases Java persistence.

      Keep up the good work.

      Timo

      Matthias Steiner
      Blog Post Author
      I appreciate your comments Rafael and Timo.

      Well, I guess it really depends on the use-case you are faced with. At Custom Development we successfully used CAF in a variety of projects w/o the need to fine-tune any JPA operations as mentioned in my blog.

      One of the best things about CAF is its flat learning curve, so every seasoned Java developer should be able to use it with minimal effort - especially if you know the underlying technology, EJB 3.

      Let's also not forget that CAF is much more than just a model-driven ORM tool. A lot of stuff comes out of the box, like instance-based permission checks etc.

      I see it like Timo: the two technologies complement each other. Use them where they fit best and bypass them gracefully when you need advanced features.

      And yes, WTP is cool... I already saw some good stuff like the new JPA modeller that ships in the next release. Two-way engineering (model-> code, code-> model) is certainly the best solution in the long run...

      Former Member
      I really appreciate that you brought up this idea. I like CAF, and JPA is the future, so it's good to combine both.

      Matthias Steiner
      Blog Post Author
      Hi Ritesh,

      thanks for your comment. I fully agree... good that the two technologies can seamlessly work together so you have full flexibility 🙂

      Cheers,

      Matthias

      Siarhei Pisarenka
      The 80/20 rule works very well here. CAF (like any other big framework) covers ~80% of all development cases in the target area. The remaining ~20% is tricks, workarounds, deep tuning, customizations, etc.

      Matthias Steiner
      Blog Post Author
      Hi Siarhei,

      yeah.. the 80/20 rule is valid here 😉

      Unfortunately, the last remaining 20% may require 80% of the TCD (total cost of development.)

      Cheers,

      Matthias

      Former Member
      Hi folks,

      The CAF layer automatically provides the most atomic but important CRUD operations which would have reduced developer's productivity if written on our own.

      But sometimes we just need more raw control over your DB entities.
      We have also tried the CAF/JPA interplay in our custom development.
      This blog contains further details for those interested:
      How to implement customized BO operations

      Keep up the good work guys!
      We love CAF & JPA!

      Cheers,
      TC

      Matthias Steiner
      Blog Post Author
      Hey TC,

      nice blog you have written there!

      Guess I've missed it somehow, but I look at it from the positive side of things: independently of each other we have both found a way to use the two technologies in a complementary manner - what more to ask for?

      (An iPhone comes to my mind, but hey - it's X-mas time soon, so who knows?)

      Cheers,

      Matthias

      Former Member
      Hi Matthias,
      Thanks for the compliments!

      Yes, what more to ask for when you can blend both powerful technologies to solve business problems :p

      (iPhone rocks!)

      Happy holidays & a Merry Christmas in advance.

      Cheers,
      TC

      VINCENZO TURCO
      Hi Matthias,
      great blog, thanks for sharing!
      I would like to use CAF-generated EJBs with WDJ's EJB model to cut WS overhead and service group configuration. Can that be done?
      In addition, I would like to reference CAF-generated EJBs from custom EJBs using dependency injection (@EJB annotation). Is this feasible as well?
      Thanks, regards,
      Vincenzo

      Matthias Steiner
      Blog Post Author
      Hi Vincent,

      a simple YES to both 🙂

      (If the custom EJB resides in a different DC, make sure that you add the necessary dependencies to the CAF API)

      Cheers, Matthias

      VINCENZO TURCO
      Hi Matthias, thanks for the quick reply.
      A couple more questions:
      - is optimistic locking somehow supported (@Version)? I've tried to add annotations but once the code is regenerated everything is lost

      - how can I achieve a behavior similar to @GeneratedValue and @Table/SequenceGenerator for auto-generating IDs?

      Maybe a solution for both problems might be using orm.xml to override settings, would that work?
      Thanks, regards

      Matthias Steiner
      Blog Post Author

      Hi Vincenzo,

      CAF does take care of optimistic locking based on the last-modified time stamp - so if you transport these values to your consumer and back you'd get OptimisticLockingExceptions (OLE).

      On the 2nd question I'd like to know the use-case, as CAF is generating GUIDs automatically.

      If you want to fully leverage JPA features you may need to abandon CAF altogether...

      Cheers,

      Matthias

      VINCENZO TURCO
      Hi, thanks for your reply, I now understand that standard BO keys are auto-generated while custom keys are handled explicitly by the app developer.

      VINCENZO TURCO
      Hi Matthias, I was wondering if in CAF we have something similar to JPA persistence contexts. In JPA, entities are in one of the states "new", "managed", "detached" or "removed" with regard to the entityManager and thus the persistence context.
      Therefore when we modify a "managed" pojo, changes are committed to db at transaction end, with no need of explicitly calling a flush operation (even though it's allowed). How is this handled in CAF? Is calling "update" mandatory?
      Thanks regards
      Vincenzo

      Matthias Steiner
      Blog Post Author
      Hi Vincenzo,

      well, CAF shields all that from you. AFAIK, a flush is always executed at the end of a CAF BO's update method.

      Best regards,

      Matthias

      Former Member

      Matthias,

      Great blog. I have a slightly different scenario, where CE is used mostly as a middleware tool between ECC and external scheduling engines rather than as a front-end tool. I am planning to develop an app service on CE that connects to the external scheduling engine, gets the required data and processes it (adding business logic based on the local data stored in CE BOs). For each ECC sales order SAVE my CE app service gets called and an appropriate response is sent to ECC before the order gets saved. Under normal circumstances I would expect my service to perform without any issues. However, my question concerns the initial load (conversion at the time of go-live): there are about 100000 orders that need to be created in ECC from legacy data, and for each order my service gets called (100000 times). I am not sure whether this would work, what kind of issues I may face, and which resolutions or designs should be followed (as I have not developed it yet). I would really appreciate it if you could share your thoughts on this scenario, or on whether a CE service can be used in this particular business scenario at all.

      Matthias Steiner
      Blog Post Author

      Hi Seshu,

      hm, to be honest 100,000 records do not really sound like there should be a problem, especially since you said it's only for the initial load. I assume that the initial load can take 5-10 minutes w/o causing too much of a problem, right?

      I mean the only work-around I could think of would be to transfer all orders in one batch to CE, yet that would most probably cause different problems of another kind as the payload of such a message would be huge.

      So, I'd simply write a short unit test that mimics the initial load and see how long it takes... but then, given the entire time it would take to get from ECC to CE and back, including all the WS/ES handling (XML marshalling etc.), I'd say that the time to persist the data is relatively low...

      Let me know your findings...

      Cheers,

      Matthias