
CAP (Node.js) – Performance Testing and Tuning

Background

It all started with a little project I was working on to synchronize thousands of records from SuccessFactors into a HANA database.

(Yes, I know there are more automated ways than writing something yourself, but if you know SuccessFactors, you know that no two SF systems' APIs look the same.)

The structure of my solution is as follows:

  1. Microservice #1 merely exposes an OData API on top of my generic data model. The business layer does nothing but receive data from the POST request and write it to the database.
  2. Microservice #2 is a Node.js application that reads the data from the SuccessFactors OData API and calls microservice #1 to write the data to the database.

Between the two services I want to optimize performance, and possibly run multiple instances of microservice #2 to transfer different entities at the same time.

Options

  1. Sequential
    The obvious option is to post my records sequentially to the target service. This is the most straightforward approach, but it is also the slowest.
  2. Parallel
    The second option is to post my records in parallel. It's not a bad idea, but you quickly have to figure out how many parallel requests you can make before you start to get errors. Errors can materialize in the Express stack, in the BTP (which may conclude you are executing a DoS attack), or in the processing of the database requests, as you might run out of connections in the connection pool. A sketch of this approach follows this list.
  3. Batch
    The third option, which I thought would be my solution, was a batch request: take all my records and post them in one request. This would be the most efficient way to transfer the data but, as the name indicates and the specification states, the server receives the records as one batch yet still processes them sequentially. A sketch of such a request also follows below.
  4. BatchParallel
    A fourth option, which I never implemented, would be a combination of options 2 and 3: you could tune both the batch size and the number of parallel connections.
  5. Custom REST
    The option the CAP development team recommended is without a doubt the best. You add a generic REST endpoint to your service (microservice #1) that receives the data itself plus the name of the entity you want to write to, then call the service with the complete data set. The implementation performs a single, optimized insert into the database with the complete data set; see the last sketch after this list.
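To make option 2 (and the parallel half of option 4) concrete, here is a minimal sketch of a sender that posts records with a bounded number of parallel requests. The service path, entity name, and concurrency value are illustrative assumptions, not taken from the test repositories.

    // Minimal sketch; assumes Node.js 18+ (global fetch) and a hypothetical
    // CAP service exposing POST <BASE_URL>/Records.
    const BASE_URL = 'http://localhost:4004/odata/v4/load'

    async function postRecord(record) {
      const res = await fetch(`${BASE_URL}/Records`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(record)
      })
      if (!res.ok) throw new Error(`POST failed: ${res.status} ${res.statusText}`)
    }

    // Post in waves of `concurrency` parallel requests; tune the value down
    // if you start seeing ECONNRESET, 502, or 503 errors (see The Boundaries).
    async function postAllParallel(records, concurrency = 50) {
      for (let i = 0; i < records.length; i += concurrency) {
        await Promise.all(records.slice(i, i + concurrency).map(postRecord))
      }
    }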
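For option 3, an OData V4 $batch request wraps many sub-requests into a single HTTP call. A hedged sketch of building such a multipart body by hand, reusing BASE_URL from the sketch above (boundary name and paths are again illustrative):

    // Sketch only: the server receives everything at once but still processes
    // the sub-requests one after another.
    const BOUNDARY = 'batch_load'

    function buildBatchBody(records) {
      const parts = records.map((record) => [
        `--${BOUNDARY}`,
        'Content-Type: application/http',
        'Content-Transfer-Encoding: binary',
        '',
        'POST Records HTTP/1.1',
        'Content-Type: application/json',
        '',
        JSON.stringify(record)
      ].join('\r\n'))
      return parts.join('\r\n') + `\r\n--${BOUNDARY}--\r\n`
    }

    async function postBatch(records) {
      const res = await fetch(`${BASE_URL}/$batch`, {
        method: 'POST',
        headers: { 'Content-Type': `multipart/mixed; boundary=${BOUNDARY}` },
        body: buildBatchBody(records)
      })
      if (!res.ok) throw new Error(`$batch failed: ${res.status}`)
    }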
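For option 5, the receiving side in microservice #1 could look roughly like the sketch below. The action name `load` and its signature are hypothetical; the essential part is the single bulk INSERT ... entries(...) that writes the whole data set in one statement. As in my implementation, proper error handling is skipped here.

    // srv/load-service.js — sketch of a generic bulk-insert endpoint.
    // Assumes the CDS model declares a matching unbound action, e.g.:
    //   action load(entity : String, records : many LoadRecord) returns Integer;
    // (action name and signature are hypothetical)
    const cds = require('@sap/cds')
    const { INSERT } = cds.ql

    module.exports = cds.service.impl(function () {
      this.on('load', async (req) => {
        const { entity, records } = req.data
        // One bulk INSERT for the complete data set instead of one per record
        await INSERT.into(entity).entries(records)
        return records.length
      })
    })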

Preliminary Test Results

You can find the details here.

Observations

  • SAP CAP is a great framework for providing easy OData access to your data model.
  • If you use CAP as a mere backend for UI apps, you might never have to worry about any of this.
  • As expected, sequential single-request processing is the slowest approach. The problem is amplified if you have to factor in network latency.
  • Batch processing and parallel processing are good ways to improve performance, but they require additional effort in tuning the connection pool. As our detailed test results show, the default connection pool settings are not optimal for high-volume throughput and lead to various errors (getaddrinfo ENOTFOUND, 502 – Bad Gateway, 503 – Service Unavailable).
  • The custom REST endpoint approach is the fastest and most efficient, but it requires additional effort to implement and maintain. In version 6.4 of CAP you must patch CAP to allow for larger request bodies; you can find details in the description of the Reference Server. In this implementation I skipped over proper error handling.

Test Environment

To test the performance of the different options I created a simple test environment.

  1. A reference service that simulates a CAP service with a single-entity data model.
    https://github.com/RizInno/cds-load-refsrv
  2. A test app that puts load on the reference service.
    https://github.com/RizInno/cds-load-test

The Boundaries

  1. ECONNRESET on too many parallel requests
    When I increase the number of concurrent connections to >= 550, iterating in roughly 5-second cycles, a few calls execute successfully, but after a few iterations I get an 'ECONNRESET' error when establishing the connection to the server. There seems to be a challenge on the Express side when approaching 1000 parallel requests.
    See this Stack Overflow question for additional details: https://stackoverflow.com/questions/53340878/econnreset-in-express-js-node-js-with-multiple-requests

  2. BTP DoS attack prevention
    When you have microservice #2 running locally and #1 deployed to the BTP, you will run into DoS attack prevention: the BTP starts blocking requests from the same IP at about 900 requests. This usually materializes in a client-side error: getaddrinfo ENOTFOUND

  3. 503 – Service Unavailable
    A 503 is usually an indication of the connection pool running out of connections when you rapid-fire parallel requests. You can adjust the pool configuration to give yourself more room to maneuver; details are described in the standard CAP pool configuration documentation, and a configuration sketch follows this list.
  4. 502 – Bad Gateway
    I have a case where I received a Bad Gateway, but I am still investigating the cause and a mitigation.
  5. REST endpoint size limit in CAP (as of version 6.4)
    When I write approximately 200 records, I hit a current limit in the CAP REST endpoint. The request is rejected with the error 'PayloadTooLargeError: request entity too large' (see the second sketch after this list):
    expected: 140910,
    length: 140910,
    limit: 102400,
    type: 'entity.too.large'

    You can find details in the description of the Reference Server.
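To make the pool tuning from point 3 concrete: for the Node.js runtime, pool settings live under cds.requires in package.json (or .cdsrc.json). The values below are illustrative starting points for a HANA binding, not tuned recommendations from the test results:

    {
      "cds": {
        "requires": {
          "db": {
            "kind": "hana",
            "pool": {
              "min": 10,
              "max": 100,
              "acquireTimeoutMillis": 5000
            }
          }
        }
      }
    }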
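Regarding point 5: the limit of 102400 bytes in the error is the default 100 kb limit of Express's JSON body parser, which is what the one-liner patch described in the Reference Server README raises. The idea, sketched here against a plain Express app rather than the actual patched CAP file:

    // The default express.json() limit is '100kb' (102400 bytes), matching the
    // `limit` value in the error above. Raising it looks like this:
    const express = require('express')
    const app = express()
    app.use(express.json({ limit: '10mb' })) // illustrative size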

Comments
      Mustafa Bensan

      Hi Martin,

      Thanks for sharing some very useful insights here.  I have some questions as follows:

      1.  You mention that you were working on a project "to synchronize thousands of records from SuccessFactors into a HANA database".  Given the CAP limitation of 200 records for the recommended Option 5, how did you implement this option to support processing the "thousands of records"?

      2.  In your scenario, was the HANA database on-premise or SAP HANA Cloud?

      3.  Out of curiosity, what was the use case for transferring the data from SuccessFactors into a HANA database?


      Regards,

      Mustafa.

      Martin Stenzig (Blog Post Author)

      Mustafa Bensan, 

      1. I communicated the limitation to the CAP team, who committed to fixing it in one of the next releases.
        In the meantime, I patched the version myself. It is a one-liner; you can find the fix, the file you need to change, and the approach described in the Reference Server (https://github.com/RizInno/cds-load-refsrv) README.md.
      2. SAP HANA Cloud
      3. When I started the project there were several problems with the DWC SuccessFactors synchronization and even with the HANA-based replication tasks, but I needed a data foundation to build SAC dashboards against. Happy to go into more detail in a private chat if you need more.
      Mustafa Bensan

      Thanks so much for clarifying, Martin.  I'll be in touch for point 3.