Technical Articles
The SAP HANA JSON Document Store – Introduction (Part 1)
Introduction to the Document Store (Part 1)
Overview
The SAP HANA JSON Document Store (also known as DocStore or Document Store) is a new feature which has been introduced with SAP HANA 2.0 SPS 01. The new store combines a relational and document-oriented database to a hybrid innovative technology which is unique for a variety of reasons, namely, its ACID compliant, fully integrated with SAP HANA in terms of access/query and administrative capabilities.
The embedded Document Store belongs to the group of “NoSQL” databases, more precisely to the document-oriented ones. These type of storing technologies are storing semi-structured documents (most JSON or XML) in collections without an explicit structure which offers high flexibility and compactness.
Beside the fact that the Document Store offers the possibility to use a document-oriented database directly and fully integrated in SAP HANA without the necessity of operating another independent database in parallel, it features full ACID properties. This way a single transaction may span all stores of SAP HANA and offer the same qualities in terms of atomicity, consistency, transaction isolation and durability. Given that the Document Store is a regular SAP HANA service, the known features Backup & Recovery, System Replication and Failover work out of the box without additional administrative overhead. SAP HANA allows interactions – especially joins – between collections and relational database objects like tables. Furthermore, with complex path expressions it’s possible to extract relevant portions of the document.
Terms
Beside the known terms like tables or schemas, this blog and the documentation of the Document Store uses some (new) terms which will be explained as follows:
Semi-structured data: Data which is not fixed in its structure but has the structure information in itself. In contrary, structured data like tables has a constant or fixed structure which must be defined before inserting data.
Collection: A collection holds multiple documents and is assigned to a schema. This is comparable to a table with the difference that a collection doesn’t have a predefined structure (column definition).
Document: A document in the Document Store is a semi-structured document in the JSON format. Such a document is like a row in a table. In this analogy the keys of the JSON document are the columns of the table.
Statement Examples
Since the Document Store is being used in relational database context, SQL is used as the query language. For that some new expressions and keywords where introduced to enrich SQL with the needs of the Document Store. In the following section the most commonly used statements are illustrated. This is only a simple statement explanation, for further detailed information kindly refer the SQL Reference.
Enablement of the Document Store
Since the document store is implemented as an additional store in SAP HANA that comes with its own process, it has to be enabled by the administrator in the SYSTEMDB for a specific tenant.
ALTER DATABASE <database> ADD 'docstore';
Create a collection
This statement creates a new collection called MyCollection into the current schema. This is like CREATE TABLE, but without defining the column characteristics. Users can create as many collections as needed.
CREATE COLLECTION MyCollection;
Drop collection
By using the DROP COLLECTION statement, the whole collection will be deleted. This statement behaves like the known DROP statements.
DROP COLLECTION MyCollection;
Insert
The insert statement of the document store takes one JSON document as an argument without an optional column definition. The newly document must be valid JSON, but documents may have different identifiers or structure.
INSERT INTO MyCollection VALUES({
"name":'John Doe',
"address": {
"city": 'Berlin',
"street": 'Street 22'
}
});
Select
Selecting values from a collection is similar to the selection from a table. Furthermore, it is possible to access nested fields via a path by using the dot operator. The statement is tolerant to non-existing fields.
SELECT "name", "address"."city" AS "city" FROM MyCollection WHERE "name" = 'John Doe';
This returns a result set with the columns name and city where the name equals John Doe.
Update
To perform updates on the data, the update statement should be used. Beside the simple updating of values, this operation can be used for adding or deleting field or for replacing whole documents.
UPDATE MyCollection SET "address"."city" = 'Munich' WHERE "name" = 'John Doe';
Delete
As the statement name implicates, it deletes documents from a collection.
DELETE FROM MyCollection WHERE "name" = 'John Doe';
Conclusion
SAP HANA already provides capabilities for graph, spatial, hierarchies and for relational tables of course. By introducing a document store the set of capabilities is enriched. This enables applications that are built on SAP HANA to use the best from each database technology. Especially they can mix different technologies with the well-known relational world in an intuitive way. This leads to many advantages, such as the ability to use a flexible and dynamic kind of storing data and the availability of using both database technologies at the same time. Overall it reduces administration overhead since only one database needs to be maintained and offers innovative development.
References
A short introduction into JSON
The Document Store in the SAP HANA Administration Guide
Maintenance of Collections in the SAP HANA Developer Guide (XS Advanced)
Document Store Statements in the SQL Reference
About this series
This blog series about the document store is splitted into two parts. In the first one, the document store is introduced with an overall overview.
In the second part a use case of the combination of relational and document oriented will be presented together with SQL samples.
Thanks Kai for the blog. Do i see some similarities of document store to mongoDB, if yes than will be great if you can highlight in a table what is same and what is different? It will be easier for everyone to compare and understand.
Hello Nabheet,
thank you for your comment. In my opinion it wouldn't make sense to compare the DocStore and MongoDB on this detail level, since the DocStore is part of SAP HANA and SAP HANA and MongoDB are complete different databases.
In general your're right: they have the base idea in common. So, in both databases you have documents in the JSON format, CRUD operations to interact with them, collections which hold these documents and schemas/databases which storing collections.
One big difference is the query language: DocStore uses SQL (with JSON extensions), MongoDB JavaScript/JSON. Also, MongoDB can handle XML documents. The DocStore has the possibility to interact directly with a relational database, which isn't supported by MongoDB.
I hope this answers your question.
Best regards, Kai
Thanks, Kai for the post. I've briefly played with DocStore in HANA Express. A quick question, would DocStore be an appropriate HANA DB Store to save large text file content? (In my case, assume it's something like markdown text for a wiki application) Any sort of item length limitations I may run into?
Hello Mike,
thank you for your nice words. Are you aware, that the SAP HANA JSON Document Store can save only JSON documents. If you want to save markdown, you need to convert or add it to a JSON document.
Best regards, Kai
Hey Kai,
Yes, I realize it must be in JSON format, however I'm more asking if there is a size limitation per-item.
Example JSON item:
Hello Mike,
the current limitation is 8MB per JSON document. Unfortunately I can't tell you the exact same of measurement, but just use 8MB as the upper limit.
Best regards, Kai
Thanks, Kai! 8MB per item is perfectly reasonable in my use case.
Hi Kai,
Very nice blog, thanks a lot for sharing.
I have a question here, will SAP ABAP benefit from json document store? Can I combine standard SAP report with this feature?
Thanks,
Season
Hello Season,
thanks for the kind words. As far as I know you can't use the DocStore directly in ABAP since ABAP only knows tables as storage objects.
But you can try to use collections and json in AMDPs with SQLScript as the language.
Best regards,
Kai
Hi Kai,
Thanks for this nice blog about Document Store (both Part1 and Part2).
I tried using HANA Document store through Java App using JDBC(ngdbc.jar). I am able to perform CREATE COLLECTION and INSERT INTO Collection but not able to perform Select on collection as using ResultSet to get the value is not serving the purpose of getting JSON Data from collections.
Is there any way to get data from Collection in Java (any API or Procedure) ?
Thanks..
Amitanshu
Hello Amitanshu,
JSON is returned as a CLOB in the first column, which you can convert to a String quite easy:
Of course this will only convert this to a string, but then you can use well known JSON libraries to work further with this.
I hope this helps.
Best regards,
Kai
Thank Kai !! It's working perfectly. I was getting CLOB but was doing it using ResultSet metadata and hence it was not working as desired. 🙂
Best Regards..
Amitanshu
Hi Kai,
I have below list of queries on this topic.
Thanks
Hi Kai,
We are wondering if SAP Hana DocStore can supports same level of data aggregation capabilities as MongoDB provides like aggregation pipelines or MapReduce or bulk operations. If yes, how can we achieve it?
Regards,
CS
Hello Chatar,
you can use all SQL aggregation capabilities offered by SAP HANA, but there are no special ones for the SAP HANA DocStore.
Best regards,
Kai
Hello Kai, I tried the following:
CREATE COLLECTION TABLE MyCollection1;
INSERT INTO MyCollection1 VALUES ('{"BooleanField":true}');
However, when I query using:
SELECT * FROM MyCollection1 WHERE "BooleanField" = true
it does not return any rows.
How do we insert and query boolean and date fields in the doc store?
Thanks..
Hello Shrikant,
the SELECT can’t use the SQL boolean, which is known limitation. Please use this statement instead: SELECT * FROM MyCollection1 WHERE “BooleanField” = TO_JSON_BOOLEAN(true)
This is also described here: https://help.sap.com/viewer/3e48dd3ad36e41efbdf534a89fdf278f/2.0.04/en-US/680ddaa91a1145efa02a3988b67c796b.html?q=boolean
Regarding dates: dates are no datatype in JSON, so you need to do a string compare here.
Best regards, Kai
Thanks Kai, this is helpful.
Hello,
I'm having trouble with the request below and I think the problem is in link with the date field. Here is my request :
And here is my collection of documents (the order_date is just a json string) :
My goal was to obtain a table with only the most recent records (based on the "order_date" so) knowing my primary key is "product". I expected a result like this :
But I've this error :
The problem seems to be the max(date field) but I can't find how to solve this. Thank you a lot for answers or ideas !
Note, that there is also a cross post with some answers.
Hello Kai, Can we create an index for a collection?
Hello Ulasala,
no, currently not. But this feature request is already in the backlog.
Best regards,
Kai
Hello Kai,
Any release date for this feature? I find it hard to use the document store without it.
It would be also great if we could tier the collection using NSE.
Hello Leonardo,
I'm sorry but I'm not aware of any date which I can easily communicate externally, especially inside a blog.
The tiering with NSE is a complete different story, so again as far as I know it's in the backlog.
Best regards, Kai
Hello Kai,
Can we create an index for a collection? Now.
Can you please suggest any documentation if it is supported now.
Thanks, Gaurish
Hello Gaurish,
In HANA Cloud you can now create indexes: https://help.sap.com/docs/HANA_CLOUD_DATABASE/f2d68919a1ad437fac08cc7d1584ff56/ad9063aa6b6d479faac18bacb6caf145.html?locale=en-US
Best regards, Kai
Hi, Kai. Have you ever created a Spring data repository for a HANA document store? If so, can you describe it? Extending JpaRepository doesn't seem to be a good fit.
Hello Joseph,
I'm sorry but I never tried this and I don't know how this could work. What I guess is, that JPA/Spring etc. are not designed to work with JSON documents but I'm not an expert here.
Maybe it could help if you figure out how to support other well known document-oriented databases in these frameworks and then try to adapt this for the DocStore.
Best regards,
Kai
Thanks, Kai. There is repository support in Spring for MongoDB (org.springframework.data.mongodb.repository.MongoRepository). I'll give it a try.
Hi Kai,
Even I am searching for HANA JSON Document Store compatibility with Spring Boot Data. Mongo, Cassandra and other such NoSQL database has there own repositories.
Do we have similar repo as well?
Thanks,
Praveen
Hi Praveen,
I'm not aware of anything reg. this.
Regards, Kai
Hi Kai,
Thanks for quick reply.
I want to use HANA JSON Doc Store in my Spring Boot App and I would like to know how I can achieve all database operations (creating collections, CRUD operations) using Spring framework via Java code.
Thanks
Kai,
I did follow SAP Help documentation. I actually get below error in AMDOP class. However I am able to do all CRUD in SQL console on HANA Studio.
“DEMOCOLLECTION” is unknown. ABAP objects and DDIC objects must be declared in the METHOD statement. Local names must start with “:” here.
This error comes when accessing from system schema(example SAPCAR or SAPSLT etc) however works fine with user schema(example KDASS)
Regards,
Kevin Dass
Hello Kevin,
I've seen you asked the same question also here. I guess the question is the more right place to ask. I'm not an ABAP expert but I guess you need to make the collection somehow visibel for ABAP, e.g. in the DDIC. Please follow up in the question since there somebody already figured out how to do that.
Best regards, Kai
Hi Kai,
We are trying to apply the doc store as a quick ingestion mechanism for the result of REST calls. Then later on parsing the json for the data which are needed in these particular use cases. I have a couple of questions:
a) During insert if the JSON is invalid - then SQL error 8 is raised "Invalid Argument" - I cannot make an exit handler for this error code as it is not supported - are there any ways in which I can validate the json stream prior to saving.
b) when using JSON_TABLE we are able to use nested paths for dealing with arrays in a "generic way" Is something similar when using the doc store directly.
/Bruhn
Hi Michael,
a) not that I know. HANA expects that the JSON you send is valid
b) no, this is not possible but we are aware that this would be useful
Best regards, Kai
I seem to be able to import single json documents but not when its a array of json objects
I can only insert when I break it into
also for python insert , is this the only way ?
Hello Rajarshi,
this is a normal SQL limitation. The SQL interface normally can only insert one row at once which applies to documents, too (as far as I know).
Normally all clients support batch/bulk insert which can be used for documents too. Please refer to executemany here.
Best regards,
Kai
Hi Kai ..
thanks for clarifying
For normal data, I move the data into tuples.
but not sure how to do that for document store so that I can call executemany().
currently I am doing this ,
Hi,
you call executemany as you would do for a table
thanks, that helped
Hello Kai,
Does JDBC PreparedStatement work on the HANA document store queries?
It gives me the following error when using the java PreparedStatement:
On the other hand, it works with the plain SQL "Statement" in java.
The query which I am using for my prepared statement is as follows:
Could you please guide me here?
Best,
Khavya
Hello,
can you please provide a short example in Java?
Thanks, Kai
Sure.
I am trying to use PreparedStatement and set the parameters in the query as follows:
Thanks a lot. Bad news: the document store doesn't support parameters/prepared statements if you running on SAP HANA on-premise or SAP HANA Service.
See here: The Document Store does not support SQL parameters. This may affect the application's ability to prevent SQL injection.
HANA Cloud supports this feature (see here): Document Store supports the usage of SQL Parameters.
When I use the doc store within SQL anonymous block , select delete etc works
DO BEGIN
DECLARE ID NVARCHAR(20) ;
ID = '1';
SELECT * FROM XCOLLECTION X WHERE X."SID" = :ID ;
END
But the same code does not work for select insert or delete when I put it inside a stored procedure or function.
I also tried explicit CAST(:ID as NVARCHAR)
Hi,
so on HANA SPS05 it works fine:
Result is {"SID": "1"}. Can you please share your whole scenario and the HANA version?
Thanks, Kai
My ,mistake , I did not clarify the nuances .
It works when I assign the value hardcoded , but normally we read from a table and assign , thats where it fails . e.g
The above gives
Error: (dberror) [7]: feature not supported: "CDE_4"."FOO": line 15 col 1 (at pos 88): WHERE clause with unsupported expressions on collection tables
while
returns nothing
That's beause in the second "ALTER PROCEDURE" you still need to use ":ID" instead of "ID". However, returning nothing is weird and misleading. 'll raise internally attention to this.
The second behaviour can be explained: If there is no colon, "ID" gets interpreted as an attribute of the collection. If a document does not contain the attribute, it returns "null". So, the where-condition is probably equivalent to "WHERE X.SID IS NULL" and thus returns no results.
Thanks for the explanation!
I just verified on my HANA Cloud instance that the following code
worksdoes not work (EDIT: it really does not):Which HANA version do you use?
Thanks for the answer . I am using HANA 2.0 SP 5 on Premise version . It does not work there .
For the moment, I am using the work-around
I just checked a second time and I have to take my statement back as I did a mistake when testing previously: The usage of variables is also not supported on HANA Cloud. So, the code won't work there either.
For the moment, your workaround seems like a viable alternative. Another option would be to use a join between the relational table and the collection as described in this blog.
Hi Mathias
Thanks for the blog link , and I saw the bind_as_value statement , and that works
Thanks for lining this out. I didn't realize it's exactly describing your scenario 😊