Data Modeling and Database Design in ABAP Part 1
This weblog series is about database design and data modelling in ABAP. In this instalment I will discuss modelling techniques. In the next instalment I will introduce the structured entity relationship model designed by Elmar Sinz in 1990 and discuss some aspects of SAP SERM: hierarchical, aggregated referential and external relations between entities. The third part will deal with special aspects of database design in ABAP like NULL values in databases.
How to Model Data and Design Databases in ABAP
If I design an application I start with object oriented analysis and try to identify the main business objects. During the design phase I create an object model and I use this to create a relational model. In my opinion there are good reasons for doing this:
- Object models correspond to data models (like entity relationship models) in a natural way: semantic models like SAP SERM have even notations for generalization/specialization. I have to admit that most of my ABAP object models are very pragmatic: in other programming languages people define all data types as objects and as the consequence object models get very complicated and deriving a relational model can more complicated than it ought to be.
- Entity relationship models derived from object models often lead to database designs which are normalized (i.e. in third or even higher normal form).
- If I have an object model in mind I have to distinguish between attributes and methods. It is a common mistake that people try express processes in databases. If they do that the result will get more complicated as it should be.
There is another difficulty: in an object model there are sometimes complex relationships (like higher order associations) between objects. If I create a data model from an object model I have to decide which relations must be represented in the database explicitly and which can be constructed implicitly while instantiating the ABAP object.
Last but not least I have to create the relational data model: I have to define the transparent tables and foreign keys. During this design phase I will have to cope with following problems:
- How should What are my primary keys? Should I take technical keys perhaps GUIDs?
- Is it possible to avoid (resp. does it make sense to use) composite keys?
- What indexes do I need for fast access? Or is it better to use buffering in fact indexing and buffering are complementary concepts.
- How do I realize derivation of objects in the data base?
Some people think that they dont need to define foreign keys in the ABAP Dictionary because they are not represented in the underlying database. In my opinion this is completely wrong: foreign keys are necessary in dynpro programming, for various DDIC-views, view clusters but the most important thing is that they make the data model easier to understand.
Usually I take pure technical keys (number ranges, counter or even better: GUIDs) to express object identity in the area of transparent tables. But sometimes it is necessary to define a so called value-based search key (realized as a unique index) to access the object in terms of its attributes. And sometimes I use composite keys but I will come back to the next point later.
As the result of my work I obtain a relational data model. Then I try to find out whether it is appropriate:
- Is the database design normalized? Do I have to denormalize because of performance reasons?
- Can the relational database model be extended easily?
- Do I have to archive the data? Can I easily define an archive index? Can I read the archived data easily using the SAP standard?
- It is easy to create a business information warehouse data extractor for the data model?
- Do I allow customers do define APPENDs for transparent tables? Can this affect archiving?
In this phase can apply the usual techniques we know from database design. In fact this takes a lot of time and must be done with care:
- It is very difficult to change an existing data model: we dont have SQL DDL commands like ALTER TABLE. If a transparent table contains lots of data its possible the database utility se14 wont work. And sometimes we even have to write XPRAs for changing the structure of transparent tables.
- If the structure changes it can affect archived data and of course archive objects.
And last but not least I have to write documentation of my data model. But to be honest most programmers dont like to write documentation and most documentation has poor quality. I know technical concepts that even contain huge lists of transparent tables together with their data elements, foreign keys and so on. Of course that can be useful because it describes the DDIC at a certain point in time but this information can be easily generated from the DDIC. So in my opinion it is better to write a program that extracts this information so that programmers dont have to rewrite and reformat documentation printed out in text files.
In some ABAP projects chief designers think that it is helpful to write those concepts even before creating the transparent tables. Of course this is possible and often necessary if the existence of an optimized and well reviewed relational model is crucial for the success of the whole development project. But as a consequence we have to change those concepts every time we decide to modify parts of our relational model during design phase. As a consequence programmers create dozens of pages of documentation that has to be rewritten again and again. At the end nobody knows whether the documentation describes the current situation.
So here is my suggestion:
- We define a data model (perhaps derived from an object model) perhaps with the help of the SAP Data Modeler.
- At this time we define the relational data model and link the semantic entities to the SAP DDIC.
- Of course we need guidelines for the design of the relational data model: it makes no sense if somebody uses GUIDs and another developer number ranges for primary keys at the same time (perhaps even in the same data model).
By using the SAP Data Modeler we keep the documentation in the SAP system.
A first look at SAP-SERM and the Data Modeler
SAP decided to create database models on two layers. The semantic layer is defined in SAP-SERM and the logical model is defined in the data dictionary. Usually we dont work with the physical model (the underlying database) unless you are a database administrator. Only if there is serious trouble an ABAP developer will look at the physical database table this can be necessary if a transparent table is inconsistent (the database object doesnt correspond to the DDIC object) or we have serious problems to change the transparent table using the database utility (se14) or even with ABAP.
Semantic data models are really useful. Just imagine you want to understand internals of ABAP classes and how they are represented in the WAS itself. Just look at the package SEO and you will recognize Data Models and Entity Types:
Just select the first data model OR_COMPONE and press Ctrl. F3: the Data Modeler opens:
You can double click the entity type OR_REGMET: You will navigate to the entity type and by click the button dictionary you will see that this entity is represented by the view VSEORMETH. Furthermore this diagram says that a method can be a static class method or an instance method.
Suppose that you have to understand someparts of the CRM data model. Luckily most modern SAP applications are structured and divided into packages and sub-packages but usually this doesnt help you to understand the data model. (But just imagine a whole application would consist only of a single package you would have no chance to understand or to maintain it). Fortunately data models will help you. Just look at the data model CRM_BP0021 for example:
We all know entity relationship diagrams but what does this notation mean? It means that a business hour consists of several (time dependent) intervals. Especially there is no business hour interval that doesnt belong to any business hour. Furthermore we can think of an business hour as a base class with some specializations (child classes): visiting hour, calling hours and receiving hours. This will be explained in the next instalment of this blog series.
I briefly discussed the basics of data modelling and database design in ABAP and suggested some best practices. I shortly introduced the SAP Data Modeller. In the next instalment I will get into the details of SERM and SAP SERM.