Technical Articles
Improved Search Performance in SAP BTP Core, SAP Cloud Application Programming Model, Node.js Runtime
Introduction
In SAP Cloud Application Programming Model CDS version 5, we’ve enabled a new search capability in the Node.js Runtime that:
- Allows you to tailor the search behavior to your needs via the new supported
@cds.search
annotation - Improve the performance of search requests by favoring performance over convenience — only elements typed as
String
are searchable by default. - Layout the foundation for deep search — search on entity level and substructures — and fuzzy search
In addition, the following improvements were made:
- Refactored and modularized the search implementation to decouple database-specific processing from the OData service layer
- Caching of metadata
- Improved performance of search queries for localized data.
Tailored Search with @cds.search
Annotation
The new search capability supports the @cds.search entity-level annotation to allow you to include or exclude elements from the set of searchable elements in search queries. The annotation can contain paths to elements, including elements of an associated entity using the path expression syntax. Just annotate those elements, which shall be searched, and which not.
In contrast to cds version 4 — to favor performance over convenience — only elements typed as String
are searchable by default unless you decide to override this default. The reasoning behind this design decision is that most of the time, you’re searching for elements typed as String
rather than for elements typed as UUID
, Date
, Time
, DateTime
, or Timestamp
.
The new mantra is that you’ve to be explicit about what is searchable so that we can simplify and optimize the generated SQL queries for the underlying database, thus improving performance.
Common Usages
The @cds.search
annotation looks as follows:
@cds.search: {
element1,
element2: true,
element3: false,
element4.elementA
}
entity MyEntity ...
Here element1
, …, element4
are the element names of an annotated entity. To exclude an element from the search, annotate it with false
.
To make an element (typed other than String
) searchable via the $search
query option, the element’s entity must be explicitly annotated with the @cds.search
annotation, for example:
Search by ID
and title
:
@cds.search: { ID, title }
entity Books {
key ID: UUID;
title: String;
}
Search by all elements of type String
, title
, descr
, and isbn
— default, no annotation required:
entity Books {
key ID: UUID;
title: String;
descr: String;
isbn: String;
author: Association to Authors;
}
Search by title
:
@cds.search: { title }
entity Books { ... }
@cds.search: { title: true }
entity Books { ... }
Search by title
and descr
:
@cds.search: { title, descr }
entity Books {
key ID: UUID;
title: String;
descr: String;
isbn: String;
author: Association to Authors;
}
Search by title
, descr
, and isbn
:
@cds.search: { title, descr, isbn }
entity Books {
key ID: UUID;
title: String;
descr: String;
isbn: String;
author: Association to Authors;
}
Search by all elements of type String
excluding the element isbn
(search by title
and descr
):
@cds.search: { isbn: false }
entity Books {
key ID: UUID;
title: String;
descr: String;
isbn: String;
author: Association to Authors;
}
Search by ID
and title
:
@cds.search: {
ID: true,
title: true,
descr: false // superfluo (just for demostration purposes)
}
entity Books {
key ID: UUID;
title: String;
descr: String;
isbn: String;
author: Association to Authors;
}
Note: the element-level annotation @Search.defaultSearchElement
is deprecated and will be removed in future versions. In the meantime, @cds.search
supersedes @Search.defaultSearchElement
. That is, if both annotations are defined, @cds.search
wins.
Metadata Caching
OData Service Layer Refactored
There are scenarios in which search requests are forwarded to external services and therefore don’t require database-specific processing on the primary service that processes and forwards the search request. In CDS version 4, some parts of the database handling were tightly coupled to the OData service layer.
As an undesired side effect, in the worst-case scenario, the runtime couldn’t fulfill search requests because of unexpected conditions, or the exact database processing was done twice (once for the primary service and once for the external service).
In CDS version 5, we refactored and modularized the $search
system query option to decouple database-specific processing from the OData service layer, thus resulting in better performance when forwarding search requests to external services and fewer runtime exceptions.
Improved Search Performance for Localized Data
In the SAP Cloud Application Programming Model, there are two approaches for resolving translated texts:
- Via localized views (former approach)
- At runtime
Resolving Localized Data Via Localized Views
In the former approach, when a data model is compiled to SQL format, localized views are generated, and search queries are redirected to those views based on the user locale.
The user locale is advertised by HTTP clients, for example, via the Accept-Language header in HTTP requests or via the sap-language
| x-sap-request-language
URL’s parameters. Once the Node.js Runtime by content negotiation selects a user locale proposal, it uses the corresponding ahead-of-time compiled localized view with the user locale to resolve the localized texts.
For example, given the following service definition:
service BookshopService {
// search by title (default)
@cds.search: { title }
entity Books {
key ID : UUID;
title : localized String;
}
}
After compiling the BookshopService
service definition to SQL format, the BookshopService_Books
, BookshopService_Books_texts
database tables are generated.
Notice that the generated BookshopService_Books_texts
table doesn’t have a counterpart in the service definition. By convention, when an entity contains localized elements, the compiler generates an additional table with the same name suffixed with _texts
, including the following columns:
- An
ID
column - A
locale
column - A column for every localized element defined in the root entity (
Books
in this case)
In addition, the localized_BookshopService_Books
view is generated in SQL DDL to read localized data with fallback easily.
╰─$ cds compile srv/books.cds --to hdbtable
COLUMN TABLE BookshopService_Books (
ID NVARCHAR(36) NOT NULL,
title NVARCHAR(5000),
PRIMARY KEY(ID)
) WITH ASSOCIATIONS (
MANY TO MANY JOIN BookshopService_Books_texts AS texts ON (texts.ID = ID),
MANY TO ONE JOIN BookshopService_Books_texts AS localized ON (localized.ID = ID
AND localized.locale = SESSION_CONTEXT('LOCALE'))
)
COLUMN TABLE BookshopService_Books_texts (
locale NVARCHAR(14) NOT NULL,
ID NVARCHAR(36) NOT NULL,
title NVARCHAR(5000),
PRIMARY KEY(locale, ID)
)
VIEW localized_BookshopService_Books AS SELECT
L_0.ID,
coalesce(localized_1.title, L_0.title) AS title
FROM (
BookshopService_Books AS L_0
LEFT JOIN BookshopService_Books_texts AS localized_1
ON localized_1.ID = L_0.ID
AND localized_1.locale = SESSION_CONTEXT('LOCALE')
)
WITH ASSOCIATIONS (
MANY TO MANY JOIN BookshopService_Books_texts AS texts ON (texts.ID = ID),
MANY TO ONE JOIN BookshopService_Books_texts AS localized ON (localized.ID = ID
AND localized.locale = SESSION_CONTEXT('LOCALE'))
)
Notice that the generated SQL DDL used to read localized texts conveniently uses the coalesce
function to filter the localized title
field.
According to research from the colleagues of the SAP Cloud Application Programming Model Java Runtime and Robin de Silva Jayasinghe, the coalesce
function prevents push-down of filter operations to the SAP HANA database column store layer. As a result, search queries over large data sets show suboptimal performance and might lead to out-of-memory issues in some high-load scenarios.
Although the approach with the localized views is convenient, it’s limited on SQLite, and on SAP HANA database shows suboptimal performance with large data sets and last but not least prevents fuzzy search capabilities.
Thus, the alternative is to resolve localized data at runtime.
Resolve Localized Data At Runtime
To improve the performance of search queries over localized data, we implemented a mechanism to resolve localized data at runtime rather than over the localized view.
Note: This is an experimental feature toggle that can be enabled by setting the environment variable cds.env.features.optimized_search
to true
.
We use the localized
association in schema notation format to generate SQL statements that resolve the localized data in a way that is optimized for the underlying database. This is, however, only possible if the localized
association is present, with is always the case when at least one localized element is defined in the target entity.
Compiling a sample Books
model to schema notation:
╰─$ cds compile srv/books.cds
{
"definitions": {
"BookshopService": {
"kind": "service"
},
"BookshopService.Books": {
"kind": "entity",
"@cds.search.title": true,
"elements": {
"ID": {
"key": true,
"type": "cds.UUID"
},
"title": {
"localized": true,
"type": "cds.String"
},
"texts": { // points to all translated texts for the given entity
"type": "cds.Composition",
"cardinality": {
"max": "*"
},
"target": "BookshopService.Books.texts",
"on": [
{
"ref": [
"texts",
"ID"
]
},
"=",
{
"ref": [
"ID"
]
}
]
},
"localized": { // association points to the translated texts and is narrowed to the request’s locale
"type": "cds.Association",
"target": "BookshopService.Books.texts",
"on": [
{
"ref": [
"localized",
"ID"
]
},
"=",
{
"ref": [
"ID"
]
},
"and",
{
"ref": [
"localized",
"locale"
]
},
"=",
{
"ref": [
"$user",
"locale"
]
}
]
}
}
},
"BookshopService.Books.texts": {
"kind": "entity",
"@odata.draft.enabled": false,
"elements": {
"locale": {
"key": true,
"type": "cds.String",
"length": 14
},
"ID": {
"@odata.containment.ignore": true,
"key": true,
"type": "cds.UUID"
},
"title": {
"localized": false,
"type": "cds.String"
}
}
}
},
"meta": {
"creator": "CDS Compiler v2.2.0",
"flavor": "inferred"
},
"$version": "2.0"
}
Notice that the localized
association points to the translated texts and is additionally narrowed to the request’s locale. In addition, the texts
composition points to all translated texts for the Books
entity.
Utilizing these schema data structures and the user locale, we join the BookshopService_Books
and BookshopService_Books_texts
database tables at runtime, avoiding the performance overhead of the coalesce
function when filtering localized data.
Search Queries Based on the CONTAINS
Predicate
In contrast to the LIKE
predicate, you can use the CONTAINS
predicate in the WHERE
clause of a SELECT
statement to search for one or multiple columns tables.
The syntax of the CONTAINS
predicate looks as follows:
CONTAINS (<contains_columns>, <search_string> [, <search_specifier>])
Similar to an Internet search engine, you can search for multiple terms in the same query, for example:
SELECT * FROM <table_name> WHERE CONTAINS ((<column1>, <column2>, <column3>), <search_string>)
According to our research, the LIKE
clause might cause a high CPU load on SAP HANA servers in some scenarios because the database optimizer can’t push down the LIKE
expression to the column storage layer.
As a result, search queries over large data set are slower than search queries based on CONTAINS
.
To further improve the performance of search queries, we decided to base the $search
implementation on CONTAINS
. Unfortunately, search queries based on CONTAINS
aren’t strictly comparable to those based on LIKE
(see CONTAINS
limitations). So, we implemented a branch prediction algorithm to determine whether to use CONTAINS
at runtime for search queries instead of LIKE
.
Note: This is an experimental feature toggle that can be enabled by setting the environment variable cds.env.features.optimized_search
to true
.
Key Takeaways
In SAP Cloud Application Programming Model CDS version 5, we improved the performance of search queries (some optimizations are enabled by default, others can be enabled by setting the environment variable cds.env.features.optimized_search
to true
) and provided support for the @cds.search
entity-level annotation.
To ask questions about the SAP Cloud Application Programming Model in SAP Community, you can use this Q&A tag link.