Skip to Content
Technical Articles
Author's profile photo Cameron Swift

SAP HANA Cloud Document Store – When Would I Use A Document Store?


There are a number of wonderful blog posts here on the SAP Community talking about how you can get started with SAP HANA Cloud Document Store, and what you can do with it

For this blog post, I wanted to meet absolute beginners where they are:

  • Introducing the design considerations behind using the more familiar Relational (Row and Column) Stores
  • Discussing where we might run into challenges when using Relational Stores
  • Looking at where a Document Store might help
  • Explaining at a high-level what SAP HANA Cloud Document Store is

This blog post is intended to be a high level introduction – it won’t address all design considerations, and will simplify where able. Those looking for more in-depth technical articles can find some recommendations under Other Resources at the end of this post


Relational Stores Recap

When working with a database using Relational Stores, we want to start by deciding how we’ll organise and structure our data. This structure is called a Data Model

To recap a very simplified Relational Data Model, let’s look at how we might store information about a book


Our simplified book data

You can describe this book in many other ways – it may be brown, it may be sold in a hardcover, and it may be 388 pages long

We only want to store information that will be useful to our business. For our example, the list above will be enough

With an understanding of which information we care about, we know what we’ll need to store in a Database Table. Someone looking at this problem for the first time might sketch out a table as below


An initial table

A book’s ISBN is unique – each ISBN refers to exactly one published book. This uniqueness lets us use the ISBN as something called a Primary Key (indicated by the key symbol)

By looking at our Primary Key Column (in this case ISBN), we can tell which book each row of data is about

Without knowing better we deploy our table in a database and people begin using it. Before long, our colleagues come to us with issues


Challenges with “One Central Table”

A colleague wants to search for all books written by Cameron Swift. Except it turns out there’s more than one person in the world called Cameron Swift

To help tell them apart, we add an Author ID that is unique to each Author. While we’re at it, we give our Publishers a Publisher ID too


Adding IDs to our Table

A second colleague wants to enter a book with two Authors.In theory, we could add another set of Author ID and Name columns. Unfortunately, that won’t help us when we need to enter a book with three Authors, let alone four or more Authors

To help us store this data, we start by separating the data into multiple tables. We move the Author and Publisher data out into their own tables


Splitting our data into Books, Authors and Publishers Tables

As we can see above, the Author ID and Publisher ID also uniquely identify each row in our Authors and Publishers tables. These are the Primary Keys for those tables, in the same way that ISBN is for our Books table


“Relational” Data

There’s no point storing our data across different tables unless we know how they’re linked so we can get it back out again

This is where the term Relational Data comes from – in order to make sense of the data, we need to understand our tables and the Relationships between those tables

The way we represent this relationship is by adding the Primary Key from one table as a column in another table

As an example, we’re going to link the Publishers table to our Books table by including the primary key for our Publisher table (the Publisher ID) as a column in our Books table


Linking our Books and Publishers Tables

By matching up the Publisher ID columns from our Books and Publishers tables, we can now easily find out the publisher for each book. Now that we understand how this works for Publishers, we can look at our Authors


Including Multiple Authors

When we think about including the Author IDs inside our book table, it may sound like we have the same problem we started with. As we showed earlier, we can’t include multiple Author IDs in our Books table, because there can be any number of Authors (with the exception of zero, because books don’t write themselves)

We need to create a third table to link our Books and Authors tables. We’ll refer to this as a Linking Table. This table is a list of ISBNs and associated Author IDs


Our BooksAuthored Table links the Books and Authors Tables

Because a book can have many authors and an author can write many books, neither column here is unique. This is by design

Starting with the ISBN for any given book, we can then look in our Linking TableEach row with a matching ISBN points to an author of that book.

Using this approach means we don’t have to make any changes to the structure of our tables as we add multiple authors. For each additional authorwe just add another row to our Linking Table


Our Relational Data Structure

Our full data structure for storing this simplified example would look something like this:


Our full data structure


Challenges with Relational Tables

Working through these challenges makes something very clear:

When new business requirements arise, these could require changes in our underlying data model. As we saw above: even with a simplified scenario – changing data models is not a trivial task

We can’t change the set of columns for just a single row – all rows within a table use the same set of columns


Business requirements often require changes to our data models

Depending on the scope of the change required, we could be looking at adding columns to a single table (like we did with Author ID and Publisher ID), or more complex changes requiring re-mapping our data across a number of tables

In this case, we’ll likely need to update the logic that writes to and reads from these tables – everywhere we use them

Does this mean that storing data in Relational Stores is a bad idea? Absolutely notThere are benefits to storing data in this way – we often want to enforce a known, fixed schema and we want any variation to be considered and approved. This is where Relational Data storage might be the best fit


Document Stores

On the other hand, there are times where the fixed approach of predefined columns and tables isn’t the best way to store our data

Maybe we want to store nested data – for example, our books where we can have any number of authors – without splitting the data between different tables

Perhaps we’re receiving data from Cloud APIs: if we’re getting data from Ariba’s APIsthe data isn’t returned to us as entries ready to be stored in a table. Instead, we’ll likely get results formatted in something called JSON (JavaScript Object Notation)


JSON Documents

The term JSON Document is used to describe either JSON Objects or JSON Arrays. We’re only going to describe JSON Objects in this blog post

Inside our JSON Objects, we have a number of Key/Value Pairs. For example, [Book] Title is a Key, and the Value for this might be Document Storage. Let’s revisit our earlier example


Relational (table) structure vs JSON Document structure

Above we can see how the properties of a book could be modeled in either a Table or JSON Document. Our JSON Document is an object within curly braces { }, made from a number of Key/Value Pairs

The JSON Document is self-describing. That is – if we retrieve the book data, we can check the keys to tell what each value represents. If we retrieve a single row of data from a table, the column names aren’t included by default


Retrieving Book Data from Relational vs Document Stores

With our small number of columns, this is workable. Are we going to be able to remember fifty columns? A hundred? Nobody has that kind of patience


JSON Collections

Because our JSON Documents store the structure alongside the data, we can easily vary the structure between entries. Unlike rows stored in a table, JSON Documents are stored in something called a JSON Collection

To help us understand this, we can picture each JSON Document as a physical sheet of paper with our data written on it

Working with a blank piece of paper, we can write our data exactly as it is – we don’t need to stick to a fixed format

Loose sheets of paper aren’t great for storage and retrieval, so we store our related documents in a Manila Folder. This folder is our JSON Collection


JSON Documents and Collections

When you store a JSON Document, you put it in a related JSON Collection. When you want to access the Document later, you retrieve it from the same Collection

Using SAP HANA Cloud Document Store, we can do this using a modified SQL Syntax. But that’s a story for another, more technical blog post


Document Storage and Retrieval

SAP HANA Cloud Document Store is an additional feature that can be activated within an existing SAP HANA Cloud instance – for more information you can read the blog post linked under Next Steps below 



In this blog post we’ve discussed Relational Storage using Tables, including the benefits and limitations of working with a fixed, known data structure

We’ve also discussed how we can work with data that doesn’t conform to a fixed structure using JSON Documentsstored inside JSON Collections

As always, the practical reality is more complex than the theory. Hopefully this has been a useful high level introduction to when you might use a Document Store like SAP HANA Cloud Document Store

I welcome any questions or comments below


Next Steps

For more Practical Application:

Maybe you’d like to find out how you can activate SAP HANA Document Store within SAP HANA Cloud

Enabling JSON Document Store by Cameron Swift

Perhaps you’ve already activated SAP HANA Document Store and want to get some experience with inserting and selecting JSON Documents

HANA DocStore First Steps by Cameron Swift


For more Theory:

You may have heard the term NoSQL Databases before. Document Store is a type of NoSQL Database, but isn’t the only type. Perhaps you’d like to learn about another type of NoSQL Database, the Graph Database

SAP HANA Database as a Graph Store – Introduction by Poorna Malamandi Suresh


Other Resources

Enabling JSON Document Store by Cameron Swift

HANA DocStore First Steps by Cameron Swift

The Small JSON Document Store Cookbook by Mathias Kemeter

Spotlight: SAP HANA Cloud JSON Document Store by Laura Nevin

The SAP HANA JSON Document Store – Introduction (Part 1) by Kai Mueller

Ariba Analytics using SAP Analytics Cloud, Data Intelligence Cloud and HANA DocStore – Part 1 by Cameron Swift


Note: While I am an employee of SAP, any views/thoughts are my own, and do not necessarily reflect those of my employer

Assigned Tags

      You must be Logged on to comment or reply to a post.
      Author's profile photo Peter Baumann
      Peter Baumann

      Hi Cameron Swift!

      Thank you for your efforts explaining usage of JSON on HANA from the different perspectives in your blogs.

      For my understanding JSON Document Store is a SAP HANA functionality. As HANA Cloud makes also a relational Data Lake and a Data Lake files available. How far is it possible today or maybe planned to query or in general work with JSON Documents over the different levels?

      Author's profile photo Cameron Swift
      Cameron Swift
      Blog Post Author

      Thanks for reading Peter, I appreciate your question. As an SAP employee, I can't comment on future functionality. Let's talk a little about where we are today

      The JSON Document Store is an SAP HANA Cloud functionality that needs to be explicitly activated, like the Script ServerJSON Collections are stored within SAP HANA Cloud and automatically loaded into memory on first access - although you can explicitly load or unload Collections using Document Store Statements

      One of the benefits of SAP HANA Cloud's multi-model capabilities is the ability to use different types of data together within SAP HANA Cloud. For example, using a single statement we can return an implicit view joining both Document Store and Column Store data

      So while the Document Collections aren't stored in SAP HANA Data Lakeyou should be able use JSON data alongside Column or Row data stored in HANA Data Lake's Relational Engine. I would recommend reading Crossing Engine Boundaries in our Documentation for more information here

      I hope this gives you some direction in your learning - a full breakdown of this would probably be a separate blog post in and of itself


      Author's profile photo Golnoush Lotfi
      Golnoush Lotfi

      Hi Cameron Swift,

      Does JSON Document Store guarantee the uniqueness of the documents? if so how we can achieve it?

      Author's profile photo Mathias KEMETER
      Mathias KEMETER

      Hi Golnoush, currently the JSON Document Store does not have a functionality to ensure uniqueness of documents. Of course, you can check for uniqueness manually. However, I understand, that this can be cumbersome and impacts performance.