Skip to Content

Example Use Case for the Document Store

Overview

In part one of this blog series the Document Store was introduced. The most interesting part about this is the possibility to interact with the relational world inside SAP HANA, e.g. joining a collection with a table. Since this is a very abstract feature, the following sections will introduce an example use case including SQL queries for a better and deeper understanding of the SAP HANA JSON Document Store.

Introduction of the Use Case

A web shop for a various kind of products uses SAP HANA as its database. The data is being stored in a simplified data model. This rudimentary model helps to focus on the key points that this blog wants to describe.

  • The Customer table stores all information about the customers of the shop like the name and the address for delivery purposes.
  • Orders are stored in the Order table with a zero to many relationship to the customer table. Since the web shop only needs information about the creation date of an order, only this besides an identifier and the affected customer is saved.
  • Details of an order like the bought products and their amount are stored in OrderPosition table. Each order position belongs to one order and each order needs at least one order position.
  • Information about the offered products of the web shop are stored in the Products table with information like their name and their price.

Scenario

The web shop becomes now more popular and wants to improve the user experience on its website. For that, additional information about the products like artist and genre for CD’s or the number of pages and the edition of books should be shown. Besides the simple showing of additional fields, this should also be included into the search functionality, so that customers can search for artists or genre.

Solution 1: Use relational possibilities

Without using any additional SAP HANA non-relational technologies or secondary databases, the following ways are possible to store additional data in a searchable way.

Single Table Inheritance

To enable the database structure to store additional attributes, the single table inheritance pattern could be used. Each new product attribute is stored as a new column in the Product table. This pattern leads to many NULL values and denormalization, because with any new product and its new attributes, the table gets more complex and confusing.

Class Table Inheritance

The class table inheritance pattern will help to get rid of the NULL values and denormalization. For each kind of product, a new table is created with all the needed attributes.

This solution leads to many tables, especially when some attributes only needs to be saved for a subgroup of products categories. In this case, a new hierarchy level is needed. In addition, the application needs to know in which table the concrete product information is stored. If this information isn’t directly available (e.g. by context), it could be stored as an additional attribute in the base table or by looking which table contains the id.

Beside the class table inheritance, the concrete table inheritance is also usable. These two approaches differ in the existence of a base class; the concrete inheritance doesn’t use one, so all common attributes are stored in the product tables.

Store as an additional column

As a last option, the data can be stored as one string or LOB inside a separate column in the products table. The data could be formatted as simple key-value pairs, JSON or XML. To extract certain field values, e.g. SAP HANA fulltext search or JSON functions could be used, but without using supportive tools the information is mostly hidden.

Solution 2: Use the SAP HANA JSON Document Store

The relational data model solutions for this task are limited, sometimes complex and not flexible at all. The usage of SAP HANA JSON Document Store will help to solve the disadvantages in an easy and flexible way.

For that the Product table is replaced with a collection. Each document in the table should have the same fields as the table and all other attributes optionally.

{ "id": 469, 
  "price": 69.05, 
  "name": "Product 469", 
  "author": "Muriel Towe", 
  "pages": 759, 
  "edition": 4
}
{ "id": 501, 
  "price": 91.18, 
  "name": "Product 501", 
  "artist": "Emile Gain", 
  "albumname": "AlbumName 1", 
  "genre": "Country"
}

Alternatively, to the usage of only one collection, multiple collections can be used. Moreover, the Product table can be used as the parent table and only the additional fields are stored inside a collection.

Since the Document Store is a document-oriented database it has no fixed structure which allows dynamic adding or removing of fields for individual documents. This flexibility applies also to the “mandatory” fields because the Document Store can’t force the existence of fields. In that case the application has to make sure that the fields exist.

SAP HANA allows joins between collections and tables, so nothing else needs to be changed to the web shop to offer more and more kind of products. The major advantages of this solution are the easy development and fast deployment combined with the probability to change the data model in a more flexible way. Because the Document Store is an integral part of SAP HANA, no secondary database is needed. If JSON is used by the application, the documents could be used from the Document Store without additional formatting or parsing.

Use Case Example Queries

In this section example queries are presented which might be used by the web shop with focus on the special functionalities of the Document Store.

Get the favorite music genre of a customer

The following queries will return the most bought music genre of the customer with the id 2.

With relational possibilities (Class Table Inheritance):

SELECT TOP 1 CD.Genre, COUNT(*) AS "COUNT" FROM WEBSHOP."Customer" C
  JOIN WEBSHOP."Order" O On O.CustomerID = C.ID
  JOIN WEBSHOP."OrderPosition" OP ON OP.OrderID = O.ID
  JOIN WEBSHOP."CD" CD ON CD.ID = OP.ProductID
WHERE C.ID = '2'
GROUP BY CD.Genre
ORDER BY COUNT DESC;

With the Document Store:

WITH productView AS (SELECT "id", "genre" FROM WEBSHOP."ProductCollection" WHERE "genre" IS SET)
  SELECT TOP 1 P."genre", COUNT(*) AS "COUNT" FROM WEBSHOP."Customer" C
    JOIN WEBSHOP."Order" O On O.CustomerID = C.ID
    JOIN WEBSHOP."OrderPosition" OP ON OP.OrderID = O.ID
    JOIN productView P ON P."id" = OP.ProductID
  WHERE C.ID = '2'
  GROUP BY P."genre"
  ORDER BY COUNT DESC;

These two statements producing the same result but have a small difference in their SQL. The second statement must use the WITH-SELECT statement to define the fields of the documents selected by the query. The additional WHERE condition in the WITH clause filters all documents without the field genre.

Add and remove fields from documents

The UPDATE statement allows to add and remove fields to the documents. For example, a new attribute called authorDetails for a certain author (in this case Ollie Nesey) needs to be added to all documents which are containing this author:

UPDATE WEBSHOP."ProductCollection" 
SET "authorDetails" = 'This author is special' 
WHERE "author" = 'Ollie Nesey';

To remove this field again, UNSET can be used.

UPDATE WEBSHOP."ProductCollection" UNSET "authorDetails";

Return JSON in a SELECT

SELECT can be used to return individual fields, as shown above. Using SELECT *, the entire document is being returned as a string. Moreover, it is possible to define the structure of JSON documents to be returned:

SELECT {
  "productID": "id", 
  "articleName": "name", 
  "articlePrice": "price", 
  "articleType": 'book'} 
FROM WEBSHOP."ProductCollection" 
WHERE "pages" IS SET;

This returns JSON values like:

{ "productID": 469,
  "articleName": "Product 469", 
  "articlePrice": 69.05, 
  "articleType": "book"
}

Performance Recommendation

When it’s needed to use tables and the Document Store together, it is highly recommended to use precondition filters on the used collection as much as possible, as that will increase the overall performance and decrease the memory consumption as well.

Appendix

The model including sample data and all presented queries (including some more) are available in this GitHub repository.

References

Martin Fowler. Patterns of Enterprise Application Architecture (2003)

The SAP HANA JSON Document Store – Introduction (Part 1)

The Document Store in the SAP HANA Administration Guide

Maintenance of Collections in the SAP HANA Developer Guide (XS Advanced)

Document Store Statements in the SQL Reference

About the authors

Bilal Shath and Kai Mueller are team members of SHOT, the SAP HANA Optimization Team. SHOT is part of SAP HANA Development and offers dedicated project-based and strategic SAP HANA support for ISVs and startups primarily in regard to functionality and performance.

To report this post you need to login first.

4 Comments

You must be Logged on to comment or reply to a post.

  1. Former Member

    Hi Kai, et al,

    Data in an LDAP directory server is also name/value pairs that could be converted to JSON with a little code.  Has any requirement been put into providing LDAP data or results of an LDAP search into JSON and making it available to applications with SQL data?  If you look at a typical user object from the SAP Active Directory, there is a long list of attributes and values for each user object that could be made available as a JSON table in DocStore.  The attributes include group membership, address information, manager, Distribution Lists, and potential other useful information.

    Are you aware of a way that this is done currently?

    Is there a benefit to customers to make LDAP data available as a JSON table in DocStore?

    regards,

    Alan

    (0) 
    1. Kai Mueller Post author

      Hello Alan,

      please excuse the late response. I never worked with LDAP, so I’m not an expert regarding this topic.

      Since HANA and the DocStore inside of it are a “normal” database, you can just put there the data like you want. I’m not sure, but if you are asking, if HANA can consume LDAP directories then the answer is no, since you are responsible to add the data to the database.

      The befits depending on the use case. In general you can search, consume and change the data in an easy way.

      Best regards, Kai

      (0) 

Leave a Reply