SAP Cloud Platform Functions: Observations on Application Runtime and Built-in Metering Capabilities
During the recent SAP TechEd event, the beta availability of the SAP Cloud Platform Functions (SCP Functions) service was announced. We can already read blogs that describe the service (for example, the blog written by Joerg Singler and the blog written by Karsten Strothmann), as well as blogs here and here written by Hemchander S that demonstrate how functions can be implemented using the service. We can also explore the service and gain hands-on experience by enabling it in an SCP tenant, following the instructions described by Elisabeth Riemann in her blog.
This blog is not about going into the details of implementing custom functions with the SCP Functions service – as mentioned above, informative blogs on this topic are already available, and SAP Help contains developer documentation on the programming model along with examples of how to use the service. Instead, I would like to share observations and thoughts on what fuels the service: which framework components are used at the application runtime level, how we can gather statistics on function usage, and how we can use built-in metering capabilities. One might question the rationale behind this type of analysis. After all, isn’t one of the fundamental principles of serverless architecture and the Function-as-a-Service (FaaS) model a higher level of developer abstraction – a focus on the specific implemented function rather than on the entire application and the runtime where that application is deployed and executed? That is true: to conform to the ideas and principles of the FaaS model, developer attention and effort should no doubt be focused on developing functions, not on worrying about runtimes and their configuration. My motivation for writing this blog is simple – human curiosity and an attempt to better understand the tools that stand behind the service. The intention is not to break the principles of the FaaS model or deviate from them, but to gain a better understanding of some underlying layers. This can prove useful when analyzing certain types of technical issues (for example, abnormal and unexpected request processing behaviour) and when gathering basic statistics and metering information about a deployed function to get insight into its health and performance.
In this blog, I use an HTTP trigger for a deployed function and implement the function so that it returns collected execution details back to the consumer (I don’t use logging in the function, although that is a reasonable alternative) – this makes it convenient to use an HTTP client (here, Postman) to trigger the function and analyse its output straight away. For a demo case later in the blog, when I need to produce some load and trigger the function several times in a short period of time to illustrate metering capabilities, I use Apache JMeter.
Application runtime components
Currently, the SCP Functions service supports the Node.js runtime, so let’s start by collecting some very general data about the runtime and environment, and check the details of the Node.js process that is started for the deployed function and in whose context the function is executed. In Node.js, we can obtain these details by accessing various properties and methods of the global object ‘process’.
Firstly, a function that returns the entire content of the process object is implemented:
The content of the process object gives us a lot of food for thought and analysis:
While we can get a lot of additional information from the process object, I would like to draw attention to two properties that are important for our further analysis – ‘process.mainModule’ and ‘process.argv’:
The property ‘process.mainModule’ contains information about the entry script started by the Node.js runtime; the script name is stored in the property ‘process.mainModule.filename’. Along with that, it is also useful to check the property ‘process.argv’, which contains the array of arguments passed when the Node.js process was launched. These two provide evidence that the underlying serverless functionality is provisioned using Kubeless – a Kubernetes-native serverless framework. Node.js is one of the runtimes supported by Kubeless – documentation on it can be found on the Kubeless site, and the implementation is available in the Kubeless repository on GitHub.
The Kubeless runtime for Node.js is based on the Express web framework. This can also be concluded by examining the headers returned in the HTTP response more closely – one of the response headers is ‘X-Powered-By’ with the value ‘Express’; creating this header in the response is the default behaviour of the Express framework:
As mentioned in the introduction to this blog, knowledge about the utilized frameworks can become useful when troubleshooting issues related to the specifics of those frameworks or the way they are used – for example, the way Express handles and processes requests, or how Kubeless makes use of Express. Let me provide an example to support this statement and make it more illustrative. For the sake of the demo, I create an echo function – it echoes the body it received by sending it back to the caller in the response message. For simplicity, the function is limited to processing textual content, so we will not examine cases that handle binary content, although they are fully supported by the service (the SCP Functions service is content-payload agnostic) and can be encountered in real-life scenarios:
Now we put this function under test. To start with, let’s send a JSON message (content type ‘application/json’).
Next, let’s send a message of some other content type – say, an XML message (content type ‘application/xml’) – although similar behaviour will be observed for other content types:
It can be noticed that something went wrong, and the response body no longer conforms to our expectations. To figure out the root cause of this behaviour, we go back to the documentation and implementation of the Kubeless Node.js runtime and Express. From there, we can see that Kubeless uses Express’s built-in JSON parser for request bodies of JSON type, but for all other types it uses the generic built-in parser for raw body data (which treats the request body as if it were binary – content type ‘application/octet-stream’ – and creates an object of type Buffer for it):
Hence, what we got in the response body when we sent the XML message to the function was a hexadecimal representation of the XML content – this can be verified by decoding the data contained in the Buffer object back from hexadecimal to text. With these findings, we can now adjust the code of the echo function so that it properly handles the type of the body data and decodes the Buffer object if required. In the adjusted example of the function implementation, the type of the body data is checked – alternatively, the content type of the request can be checked, and for any content type other than ‘application/json’, the textual data should be decoded from the Buffer object, unless we want to process it in binary mode:
With this enhancement in place, processing of XML messages (or, generally speaking, of messages of any textual type) by the echo function is now in line with expectations:
Built-in metering capabilities
Let’s extend our knowledge of the components and frameworks used by the SCP Functions service. Since Kubeless is used as the serverless framework, we can further check its Node.js implementation or its documentation – one way or another, this brings us to another feature that can become relevant for those looking for additional information about the health of a function and statistics on its usage: Kubeless uses the Prometheus client for Node.js (module ‘prom-client’) to collect statistics about invoked functions. This can be verified from the imports section of the Kubeless module that we have already looked into above, and from the helper module that supports it. In particular, the client collects metrics on the total number of calls to a function and the number of calls that ended with exceptions (this ratio is useful for calculating the function’s failure rate), as well as the number of calls to a function split by duration intervals (useful for the function’s performance analysis).
The Prometheus client exposes a couple of web APIs that can be consumed by external tools to collect this information:
- /healthz – health check API,
- /metrics – metrics API.
Although it is possible to use custom tools to query the APIs of the Prometheus client and to parse and process the returned metering information, it is worth mentioning that the Prometheus infrastructure goes beyond clients: it consists not only of clients that collect data, but also of the Prometheus server, which can query the corresponding endpoints of Prometheus clients in running Node.js applications, persist the collected data, visualize it, and enable the construction and execution of custom queries against it. More detailed information about the Prometheus infrastructure, as well as its client and server implementations, can be found in the official documentation. Here, to provide a basic illustration of Prometheus usage, I run a Prometheus server locally and collect data from the Node.js application that hosts the deployed function in the SCP Functions service:
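For such a local setup, a minimal scrape configuration in the Prometheus server’s prometheus.yml might look like the following sketch – the target host is a placeholder for the actual endpoint of the deployed function:

```yaml
# Minimal illustrative scrape configuration; replace the target with the
# real host of the deployed function.
scrape_configs:
  - job_name: 'scp-function'
    scheme: https
    metrics_path: /metrics
    static_configs:
      - targets: ['my-function.example.com']
```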
It is important to note that the collection and aggregation of metering information performed by the Prometheus client is initialized at the startup of the Node.js application that is instrumented with the client. As a result, the values of the metrics collected and presented by the metrics API are reset on every restart of the Node.js application, unless they are persisted elsewhere. Given that one of the fundamental principles of serverless architecture is statelessness, persistence of metrics data should be handled on the consumer side or on the infrastructure side (the Prometheus server or its equivalent), and not by the SCP Functions’ Node.js applications.
Resource provisioning and application runtime startup strategy
Here we come to the consequential question of how (or, to be more precise, when) the Node.js runtime is provisioned (and, correspondingly, deprovisioned, or revoked) for a deployed function, and which startup strategy is utilized. Generally speaking, there are two conceptual strategies for resource provisioning and startup in serverless infrastructures: cold start and warm start.
Cold start is a strategy of provisioning resources to a function at the time of its invocation. This includes allocating the required infrastructure resources, preparing the environment, starting the required underlying runtimes and components, deploying the function code and bringing it to a state of readiness, and concludes with the actual function execution. After the function has completed its execution or terminated, the corresponding application instance is gracefully shut down and disposed of, followed by resource deprovisioning and revocation. This approach has certain advantages – one of the major ones is more efficient resource utilization: resources are only provisioned when they are needed and can be re-allocated to other functions or tasks when the function is not in use (for example, by scaling a deployed function down to zero resources). On the other hand, there is a performance implication – a negative impact on the overall function execution experience: when a call to a function arrives, the total response time on the server side includes not only the pure function execution time, but also the overhead of resource provisioning and application runtime startup. Serverless architectures commonly benefit from containerization techniques for application runtime provisioning, and the overhead caused by application startup time heavily depends on the specifics of the runtime and the frameworks used. As a reference, Node.js runtime startup benchmarks are run on a regular basis and published here. It is worth noting that Kubeless doesn’t use many dependencies and modules that have to be loaded before the deployed function executes, which helps the underlying function-agnostic application components remain lightweight.
Although there are benchmark tests demonstrating that some alternative Node.js web frameworks – such as Koa – have a lighter footprint in a Node.js application and outperform Express, Express remains one of the most popular web frameworks and the de facto web framework for Node.js applications, thanks to its maturity and the large community that keeps developing new middleware components for it.
The warm start strategy, in contrast to cold start, is based on the idea of preserving the resources and runtime required for function execution even when the function is not actively called. This approach helps reduce total execution time, as the application is already up and running by the time a call to the function arrives. The downside of this approach is the need to reserve resources for deployed functions even when they are not in use.
There is also room for variations based on the two basic strategies described above. For example, a variation of cold start can imply delayed resource deprovisioning: to achieve better performance for highly loaded functions that receive a large number of calls in a short period of time, deprovisioning can be triggered not immediately after every call to the function completes, but after some idle time. One variation of warm start is resource provisioning on the first call to a function, as opposed to instant startup upon function deployment.
Now, let’s come to the practical part of this section and determine the strategy used by the SCP Functions service. For this, I make use of yet another piece of information that the process object can provide – process uptime. It can be retrieved using the method process.uptime(), which returns the uptime of the Node.js application in seconds, including fractions of a second:
Next, we deploy the function and trigger it – as can be seen, the application’s uptime is approximately 5 seconds:
Now we leave the deployed function alone for a while, and after some time has elapsed (around half an hour), trigger it again – this time, the application’s uptime is already almost 1812 seconds:
From these observations, it can be concluded that:
- Resources are provisioned, and a Node.js application that handles the function is started, immediately after the function is deployed – not upon the first call. Otherwise, the uptime indicated in the first call would be significantly smaller and would only reflect resource provisioning and the corresponding Node.js application startup time.
- Resources are not deprovisioned upon completion of the function execution; the Node.js application remains up and running. Otherwise, the uptime indicated in the second call would be significantly smaller and would not exceed the uptime of the first call by half an hour – the interval between the two calls. The observed difference between the uptimes of the first and second calls indicates that the Node.js application was running the whole time.
Another indicator supporting the same finding and conclusion is the host on which the application is running. This can be checked from the properties of the process object – particularly from the environment property ‘process.env.HOSTNAME’. On one of the earlier screenshots, when we looked into the arguments passed when the Node.js process was launched (property ‘process.argv’), ‘process.env.HOSTNAME’ could also be seen – the host name starts with the function name, followed by a generated character sequence.
It should be noted here that any redeployment of a function causes a restart of the corresponding Node.js application. This can be illustrated by triggering deployment of the function above (even without making any changes to its code or properties) and checking the application’s uptime – as can be seen, after the redeployment, the application uptime dropped back down to a few seconds:
When a function is redeployed, the host name is also likely to change, as the corresponding Node.js application that handles the function is going to be started in another container – this can be illustrated by bringing back the version of the function that displayed the content of the process object, then deploying and triggering it:
To summarize the main application runtime components used by the SCP Functions service when provisioning runtime and service capabilities to deployed functions, let me illustrate them in the diagram below:
Given that SCP Functions are currently provisioned as a service in the Cloud Foundry environment of SAP Cloud Platform, I would like to mention here that Cloud Foundry Foundation members and the entire Cloud Foundry ecosystem have committed to the further development and maturing of the Cloud Foundry Container Runtime (CFCR), which is based on provisioning a containerized runtime powered by Kubernetes, as an alternative to the existing Cloud Foundry Application Runtime (CFAR). This was one of the key topics for sessions and discussions at the recent Cloud Foundry Summit Europe event, and new projects are expected to appear in the near future. As indicated above, the SCP Functions service is based on Kubeless, and given the fast pace of Kubernetes adoption in Cloud Foundry in general and in SAP Cloud Platform in particular, we might expect further enhancements of the SCP Functions service, so stay tuned.