Skip to Content
Technical Articles
Author's profile photo Arley Triana Morin

Improved Search Performance in SAP BTP Core, SAP Cloud Application Programming Model, Node.js Runtime

Introduction

 

In SAP Cloud Application Programming Model CDS version 5, we’ve enabled a new search capability in the Node.js Runtime that:

  • Improve the performance of search queries
  • Allows you to tailor the search behavior to your needs via the new supported @cds.search annotation
  • Layout the foundation for deep search (search on entity level and substructures/elements) and fuzzy search

 

 

The new search capability supports the @cds.search entity-level annotation to allow you to include or exclude elements from the set of searchable elements in search queries. The annotation can contain paths to elements, including elements of an associated entity using the path expression syntax. Just annotate those elements, which shall be searched, and which not.

In contrast to cds version 4 — to favor performance over convenience — only elements typed as String are searchable by default unless you decide to override this default. The reasoning behind this design decision is that most of the time, you’re searching for elements typed as String rather than for elements typed as UUID, Date, Time, DateTime, or Timestamp.

The new mantra is that you’ve to be explicit about what is searchable so that we can simplify and optimize the generated SQL queries for the underlying database, thus improving performance.

Common Usages

 

The @cds.search annotation looks as follows:

@cds.search: {
    element1,
    element2: true,
    element3: false,
    element4.elementA
}
entity MyEntity ...

Here element1, …, element4 are the element names of an annotated entity. To exclude an element from the search, annotate it with false.
To make an element (typed other than String) searchable via the $search query option, the element’s entity must be explicitly annotated with the @cds.search annotation, for example:

Search by ID and title:

@cds.search: { ID, title }
entity Books {
    key ID: UUID;
    title: String;
}

Default, no annotation required (search by all elements of type String, title, descr, and isbn):


entity Books {
  key ID: UUID;
  title: String;
  descr: String;
  isbn: String;
  author: Association to Authors;
}

Search by title:


@cds.search: { title }
entity Books { ... }

@cds.search: { title: true }
entity Books { ... }

Search by title and descr:


@cds.search: { title, descr }
entity Books {
  key ID: UUID;
  title: String;
  descr: String;
  isbn: String;
  author: Association to Authors;
}

Search by title, descr, and isbn:


@cds.search: { title, descr, isbn }
entity Books {
  key ID: UUID;
  title: String;
  descr: String;
  isbn: String;
  author: Association to Authors;
}

Search by all elements of type String excluding the element isbn (search by title and descr):


@cds.search: { isbn: false }
entity Books {
  key ID: UUID;
  title: String;
  descr: String;
  isbn: String;
  author: Association to Authors;
}

Search by ID and title:


@cds.search: {
  ID: true,
  title: true,
  descr: false // superfluo (just for demostration purposes)
}
entity Books {
  key ID: UUID;
  title: String;
  descr: String;
  isbn: String;
  author: Association to Authors;
}

Note: the element-level annotation @Search.defaultSearchElement is deprecated and will be removed in future versions. In the meantime, @cds.search supersedes @Search.defaultSearchElement, that is, if both annotations are defined, @cds.search wins.

Caching

 


The metadata information of which elements are searchable is primarily base on Core Data Services data model entity definitions. We use the SAP Cloud Application Programming Model compact representation of data models known as Core Schema Notation to determine which elements are searchable for an entity at runtime.
Based on the premise that Core Schema Notation is a static data structure representation of a Core Data Services data model or service definition that doesn’t change during the lifecycle of an application. We implemented a caching mechanism that lazily (for the first search request only) computes the searchable elements and stores (cache) the result so that future search requests to the same entity are served faster.

 

OData Layer Refactored

 

There are scenarios in which search requests are forwarded to external services and therefore don’t require database-specific precessing. In CDS version 4, some parts of the database handling were tightly coupled to the OData layer.
As an undesired side effect, in the worst-case scenario, the runtime couldn’t fulfill search requests because of unexpected conditions, or the exact database processing was done twice (once for the primary service and once for the external service).

In CDS version 5, we refactored and modularize the $search system query option to decouple database-specific precessing from the OData layer, thus resulting in better performance when forwarding search requests to external services and fewer runtime exceptions.

 

Improved Search Performance for Localized Data

 

In the SAP Cloud Application Programming Model, there are two approaches for resolving translated texts:

  1. Via localized views (former approach)
  2. At runtime

 

Resolving localized texts via localized views

 

In the former approach, when a data model is compiled to SQL format, localized views are generated, and search queries are redirected to those views base on the user locale.
The user locale is advertised by HTTP clients, for example, via the Accept-Language header in HTTP requests or via the sap-language | x-sap-request-language URL’s parameters. Once the Node.js Runtime by content negotiation selects a user locale proposal, it uses the corresponding ahead-of-time compiled localized view with the user locale to resolve the localized texts.

For example, given the following service definition:

service BookshopService {

  // search by title (default)
  @cds.search: { title }
  entity Books {
    key ID : UUID;
    title  : localized String;
  }
}

 

After compiling the BookshopService service definition to SQL format, the BookshopService_Books, BookshopService_Books_texts database tables are generated.
Notice that the generated BookshopService_Books_texts table doesn’t have a counterpart in the service definition. By convention, when an entity contains localized elements, the compiler generates an additional table with the same name and suffixed with _texts, including the following columns:

  • An ID column
  • A locale column
  • A column for every localized element defined in the root entity (Books in this case)

In addition, the localized_BookshopService_Books view is generated in SQL DDL to read localized texts with fallback easily.

╰─$ cds compile srv/books.cds --to hdbtable

COLUMN TABLE BookshopService_Books (
  ID NVARCHAR(36) NOT NULL,
  title NVARCHAR(5000),
  PRIMARY KEY(ID)
) WITH ASSOCIATIONS (
  MANY TO MANY JOIN BookshopService_Books_texts AS texts ON (texts.ID = ID),
  MANY TO ONE JOIN BookshopService_Books_texts AS localized ON (localized.ID = ID
  AND localized.locale = SESSION_CONTEXT('LOCALE'))
)

COLUMN TABLE BookshopService_Books_texts (
  locale NVARCHAR(14) NOT NULL,
  ID NVARCHAR(36) NOT NULL,
  title NVARCHAR(5000),
  PRIMARY KEY(locale, ID)
)

VIEW localized_BookshopService_Books AS SELECT
  L_0.ID,
  coalesce(localized_1.title, L_0.title) AS title
FROM (
  BookshopService_Books AS L_0
  LEFT JOIN BookshopService_Books_texts AS localized_1
  ON localized_1.ID = L_0.ID
  AND localized_1.locale = SESSION_CONTEXT('LOCALE')
)
WITH ASSOCIATIONS (
  MANY TO MANY JOIN BookshopService_Books_texts AS texts ON (texts.ID = ID),
  MANY TO ONE JOIN BookshopService_Books_texts AS localized ON (localized.ID = ID
  AND localized.locale = SESSION_CONTEXT('LOCALE'))
)

Notice that the generated SQL DDL used to read localized texts conveniently uses the coalesce function to filter the localize title field.

According to research made by the colleagues of the SAP Cloud Application Programming Model Java Runtime and Robin de Silva Jayasinghe, the coalesce function prevents push down of filter operations to the SAP HANA database column store layer. As a result, search queries over large data sets show suboptimal performance and might lead to out-of-memory issues in some high-load scenarios.

Although the approach with the localized views is convenient, it’s limited on SQLite, and on SAP HANA database shows suboptimal performance with large data sets and last but not least prevents fuzzy search capabilities.
Thus, the alternative is to resolve localized texts at runtime.

 

Resolve localized texts at runtime

 

To improve the performance of search queries over localized data, we implemented a mechanism to resolve localized data at runtime rather than over the localized view.

Note: This is an experimental feature toggle that can be enabled by setting the environment variable cds.env.features.optimized_search to true.

We use the localized association in schema notation format to generate SQL statements that resolve the localized data in a way that is optimized for the underlying database. This is, however, only possible if the localized association is present, with is always the case when at least one localized element is defined in the target entity.

Compiling a  sample Books model to schema notation:

╰─$ cds compile srv/books.cds

{
  "definitions": {
    "BookshopService": {
      "kind": "service"
    },

    "BookshopService.Books": {
      "kind": "entity",
      "@cds.search.title": true,

      "elements": {
        "ID": {
          "key": true,
          "type": "cds.UUID"
        },
        "title": {
          "localized": true,
          "type": "cds.String"
        },

        "texts": { // points to all translated texts for the given entity
          "type": "cds.Composition",
          "cardinality": {
            "max": "*"
          },
          "target": "BookshopService.Books.texts",
          "on": [
            {
              "ref": [
                "texts",
                "ID"
              ]
            },
            "=",
            {
              "ref": [
                "ID"
              ]
            }
          ]
        },

        "localized": { // association points to the translated texts and is narrowed to the request's locale
          "type": "cds.Association",
          "target": "BookshopService.Books.texts",
          "on": [
            {
              "ref": [
                "localized",
                "ID"
              ]
            },
            "=",
            {
              "ref": [
                "ID"
              ]
            },
            "and",
            {
              "ref": [
                "localized",
                "locale"
              ]
            },
            "=",
            {
              "ref": [
                "$user",
                "locale"
              ]
            }
          ]
        }


      }
    },

    "BookshopService.Books.texts": {
      "kind": "entity",
      "@odata.draft.enabled": false,
      "elements": {
        "locale": {
          "key": true,
          "type": "cds.String",
          "length": 14
        },
        "ID": {
          "@odata.containment.ignore": true,
          "key": true,
          "type": "cds.UUID"
        },
        "title": {
          "localized": false,
          "type": "cds.String"
        }
      }
    }

  },
  "meta": {
    "creator": "CDS Compiler v2.2.0",
    "flavor": "inferred"
  },
  "$version": "2.0"
}

Notice that the localized association points to the translated texts and is additionally narrowed to the request’s locale. In addition, the texts composition points to all translated texts for the Books entity.
Utilizing these schema data structures and the user locale, we join the BookshopService_Books and BookshopService_Books_texts database tables at runtime, avoiding the performance overhead of the coalesce function when filtering localized data.

 

In the SAP HANA database, in contrast to the LIKE predicate, you can use the CONTAINS predicate in the WHERE clause of a SELECT statement to search for one or multiple columns tables.

The syntax of the CONTAINS predicate looks as follows:

CONTAINS (<contains_columns>, <search_string> [, <search_specifier>])

Similar to an Internet search engine, you can search for multiple terms in the same query, for example:

SELECT * FROM <table_name>
WHERE CONTAINS ((<column1>, <column2>, <column3>), <search_string>)

According to our research, the LIKE clause might cause a high CPU load on SAP HANA servers in some scenarios because the database optimizer can’t push down the LIKE expression to the column storage layer.
As a result, search queries over large data set are slower than search queries based on CONTAINS.

To further improve the performance of search queries, we decided to base the $search implementation on CONTAINS. Unfortunately, search queries based on CONTAINS are not strictly comparable to search queries base on LIKE (see CONTAINS limitations). So, we implemented a branch prediction algorithm to decided whether to use CONTAINS at runtime for search queries instead of LIKE.

Note: This is an experimental feature toggle that can be enabled by setting the environment variable cds.env.features.optimized_search to true.

 

Key Takeaways

 

In SAP Cloud Application Programming Model CDS version 5, we improved the performance of search queries (some optimizations are enabled by default, others can be enabled by setting the environment variable cds.env.features.optimized_search to true) and provided support for the @cds.search entity-level annotation.

To ask questions about the SAP Cloud Application Programming Model in SAP Community, you can use this Q&A tag link.

Assigned tags

      Be the first to leave a comment
      You must be Logged on to comment or reply to a post.