Efficient Workload management on SAP Cloud Foundry using Application Autoscaler
Introduction
In cloud applications, workload is a very important factor. Workload can be measured as CPU usage, memory utilized, response times, network traffic etc. There are mainly two types of workloads as follows:
- Static Workload: It’s the normal workload of an application. When workload is static then application behavior is always the same and stable. To manage this workload, proper resources need to be assigned to the application.
- Dynamic Workload: It’s the workload that keeps changing over time. In this case, workload can grow or shrink randomly or at specific times. There are two options for dealing with this kind of workload. The first option is to assign a lot of resources to the application. The problem is that, most of the time, the application won’t be using or needing that many resources. The second option is to use the Application Autoscaler service provided by Cloud Foundry.
To demonstrate the service's functionality, we will use the RandomWords application, and JMeter to hit the application endpoint many times in parallel and cause it to autoscale. The JMeter script is available in the same repository.
Exploration
In this blog post we go over the two types of scaling configuration provided by the Application Autoscaler service: dynamic scaling and schedule-based scaling. But first, let’s create an instance of the Application Autoscaler service and bind it to the RandomWords application. First, deploy the RandomWords application on CF. Then, to create and bind the instance of Application Autoscaler, follow this guide. Next, go to the service instance and click on “Open Dashboard”, which opens the Application Autoscaler dashboard. Click on “Referencing Apps” and edit the scaling policy for the application. Once this is done, we can get started with each of the configurations.
Dynamic Scaling
In this kind of scaling, application instances are scaled up and down based on the values of parameters such as memory consumed, response time, throughput and so on. The scaling rules that we will write are triggered based on CPU usage. We also have to define the minimum and maximum instance count of the application. The configuration is as follows.
{
"instance_min_count": 1,
"instance_max_count": 5,
"scaling_rules": [
{
"metric_type": "cpu",
"threshold": 70,
"stat_window_secs": 60,
"breach_duration_secs": 60,
"cool_down_secs": 60,
"operator": ">",
"adjustment": "+1"
},
{
"metric_type": "cpu",
"threshold": 70,
"stat_window_secs": 60,
"breach_duration_secs": 60,
"cool_down_secs": 60,
"operator": "<=",
"adjustment": "-1"
}
]
}
This policy keeps at least one instance of the application running and lets the Application Autoscaler scale it up to a maximum of 5 instances. The scaling rules apply to the CPU metric and state the following:
- Threshold of 70 with the operator ">" means that the application is scaled up whenever CPU usage rises above 70; the second rule scales it down when CPU usage is at or below 70.
- Stat window seconds (stat_window_secs) of 60 means that the average value of the CPU metric is calculated over a window of 60 seconds.
- Breach duration seconds (breach_duration_secs) of 60 means that the threshold has to stay breached for 60 seconds before a scaling action is triggered.
- Cool down seconds (cool_down_secs) of 60 means that at least 60 seconds must pass between two successive scaling actions.
- Adjustment defines what to do when the rule conditions are satisfied. There are two rules in this case: one adds an application instance (+1) and the other removes one (-1).
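To make the interplay of threshold, operator, adjustment and the instance limits concrete, here is a minimal Python sketch of the decision logic only. It is not the Autoscaler's implementation: the helper names `next_instance_count` and `simulate` are hypothetical, and the timing fields (stat window, breach duration, cool down) are deliberately ignored, so each call stands for one already-averaged CPU reading.

```python
import operator

# Comparison operators as they appear in the policy's "operator" field.
OPERATORS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}

# The CPU policy from above, reduced to the fields this sketch uses.
POLICY = {
    "instance_min_count": 1,
    "instance_max_count": 5,
    "scaling_rules": [
        {"metric_type": "cpu", "threshold": 70, "operator": ">", "adjustment": "+1"},
        {"metric_type": "cpu", "threshold": 70, "operator": "<=", "adjustment": "-1"},
    ],
}


def next_instance_count(current, metric_type, average, policy):
    """Apply the first matching rule, then clamp to the min/max instance count.

    Hypothetical helper: breach duration and cool down are not modelled.
    """
    for rule in policy["scaling_rules"]:
        if rule["metric_type"] != metric_type:
            continue
        if OPERATORS[rule["operator"]](average, rule["threshold"]):
            current += int(rule["adjustment"])  # int("+1") is 1, int("-1") is -1
            break
    return max(policy["instance_min_count"], min(policy["instance_max_count"], current))


def simulate(averages, start=1):
    """Feed a series of averaged CPU readings and record the instance count after each."""
    counts = []
    current = start
    for average in averages:
        current = next_instance_count(current, "cpu", average, POLICY)
        counts.append(current)
    return counts


print(simulate([85] * 5))           # sustained load, starting from 1 instance
print(simulate([20] * 5, start=5))  # idle application, starting from 5 instances
```

Under sustained load the count climbs one step at a time and stops at instance_max_count; once the load drops, it falls back step by step to instance_min_count.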
Many other rules can be added to this configuration in the same way. These rules can be found here. Once done, save the configuration. Now we have to push the CPU usage of the application above 70 and keep it there for more than 1 minute for the scaling to trigger. To do so, we will use JMeter. The corresponding JMX file is in the repository. Import it into JMeter, modify the URLs and run the test. Within a few seconds the CPU usage will increase, and after 1 minute the application will be scaled up to 2 instances. If we keep the test running, the number of instances will keep increasing up to the maximum of 5.
Now stop the test, and within a few minutes the application will scale down until just 1 instance is running. In the same way, rules can be provided for other metrics as well. It is always recommended to have a scale-down rule along with a scale-up rule for any metric, as we have for the metric “CPU” in this case.
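The recommendation to pair every scale-up rule with a scale-down rule can be checked before saving a policy. The following is a small sketch, assuming the policy has been loaded into a Python dictionary; the function name `metrics_missing_a_direction` is made up for this example and is not part of any Autoscaler tooling.

```python
def metrics_missing_a_direction(policy):
    """Return the metric types that have only scale-up or only scale-down rules."""
    directions = {}
    for rule in policy.get("scaling_rules", []):
        # An adjustment starting with "-" removes instances; anything else adds them.
        direction = "down" if str(rule["adjustment"]).startswith("-") else "up"
        directions.setdefault(rule["metric_type"], set()).add(direction)
    return sorted(
        metric for metric, seen in directions.items() if seen != {"up", "down"}
    )


complete_policy = {
    "scaling_rules": [
        {"metric_type": "cpu", "operator": ">", "threshold": 70, "adjustment": "+1"},
        {"metric_type": "cpu", "operator": "<=", "threshold": 70, "adjustment": "-1"},
    ]
}
one_sided_policy = {
    "scaling_rules": [
        {"metric_type": "cpu", "operator": ">", "threshold": 70, "adjustment": "+1"},
    ]
}

print(metrics_missing_a_direction(complete_policy))   # nothing to report
print(metrics_missing_a_direction(one_sided_policy))  # "cpu" would never scale down
```

A policy with only a scale-up rule would leave the application at its peak instance count after the load is gone, which is exactly the waste autoscaling is meant to avoid.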
Scheduled Scaling
An application may get a lot of hits on a specific date or at a specific time of day, so more instances need to be running at those times. This is what Scheduled Scaling is for. In this kind of scaling, application instances are scaled up or down based on preset schedules. These schedules can be for a specific date and time or can be recurring. Let’s look at two examples demonstrating fixed-date schedules and recurring schedules.
Date Specific Schedule
In this schedule, a list of dates and times is specified at which scaling will take place. The configuration is as follows.
{
"instance_min_count": 1,
"instance_max_count": 5,
"schedules": {
"timezone": "CET",
"specific_date": [
{
"start_date": "2019-09-18",
"end_date": "2019-09-21",
"start_time": "14:00",
"end_time": "16:00",
"instance_min_count": 1,
"instance_max_count": 5,
"initial_min_instance_count": 2
}
]
}
}
This configuration states that the schedule is active in a specific time frame, between the 18th and the 21st of September, from 14:00 hours to 16:00 hours. Instance minimum and maximum counts are also provided, which override the ones defined globally. An initial min instance count of 2 specifies that the number of instances becomes 2 at the start of the schedule; until 16:00 hours the count can then move within the schedule's minimum and maximum. The timezone used by the schedule is Central European Time (CET). Other parameters can be configured as well but are not mentioned here for simplicity.
Recurring Schedule
In this schedule, a list of recurring rules is specified, and each of those rules contains a specific time at which the scaling should be performed. Here’s the configuration for the recurring schedule.
{
"instance_min_count": 1,
"instance_max_count": 5,
"schedules": {
"timezone": "CET",
"recurring_schedule": [
{
"start_date": "2019-09-18",
"end_date": "2019-09-21",
"start_time": "14:00",
"end_time": "16:00",
"days_of_week": [
2,
5,
6
],
"instance_min_count": 1,
"instance_max_count": 5,
"initial_min_instance_count": 2
}
]
}
}
The configuration above is very similar to the one for the date-specific schedule. The only addition is the days of the week. The configuration states that the application instances will be scaled between the 18th and the 21st of September, on the 2nd, 5th and 6th days of the week, from 14:00 hours to 16:00 hours.
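To see which moments such a recurring schedule actually covers, the date range, weekday list and time window can be combined in a few lines of Python. This is only a sketch under stated assumptions: day 1 is taken to be Monday, the end time is treated as exclusive, and the datetime passed in is assumed to be already expressed in the policy's timezone. The helper name `is_schedule_active` is hypothetical.

```python
from datetime import date, datetime, time

# The scheduling fields of the recurring schedule shown above.
SCHEDULE = {
    "start_date": "2019-09-18",
    "end_date": "2019-09-21",
    "start_time": "14:00",
    "end_time": "16:00",
    "days_of_week": [2, 5, 6],
}


def is_schedule_active(schedule, moment):
    """Return True if a naive datetime (in the policy's timezone) falls inside the schedule."""
    start_date = date.fromisoformat(schedule["start_date"])
    end_date = date.fromisoformat(schedule["end_date"])
    start_time = time.fromisoformat(schedule["start_time"])
    end_time = time.fromisoformat(schedule["end_time"])
    return (
        start_date <= moment.date() <= end_date
        and moment.isoweekday() in schedule["days_of_week"]  # assumption: 1 = Monday
        and start_time <= moment.time() < end_time           # assumption: end is exclusive
    )


# 20 September 2019 was a Friday (day 5), 19 September a Thursday (day 4).
print(is_schedule_active(SCHEDULE, datetime(2019, 9, 20, 15, 0)))
print(is_schedule_active(SCHEDULE, datetime(2019, 9, 19, 15, 0)))
```

With the Monday-based numbering assumed here, the only days that satisfy both the date range and the weekday list are Friday the 20th and Saturday the 21st, since the range starts on a Wednesday.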
Custom Metrics
Application Autoscaler service also provides the option of defining custom metrics based rules along with the standard metric rules already available. To use the custom metrics, first we have to define the custom metric based policy and then use the metrics API to change the value of those metrics. First, we edit the configuration as follows.
{
"instance_min_count": 1,
"instance_max_count": 5,
"scaling_rules": [
{
"metric_type": "testmetric",
"threshold": 60,
"stat_window_secs": 60,
"breach_duration_secs": 60,
"cool_down_secs": 60,
"operator": ">=",
"adjustment": "+1"
},
{
"metric_type": "testmetric",
"threshold": 60,
"stat_window_secs": 60,
"breach_duration_secs": 60,
"cool_down_secs": 60,
"operator": "<",
"adjustment": "-1"
}
]
}
In this configuration we have defined a rule that checks the value of the custom metric “testmetric” and scales the application up by one instance if the value is 60 or more. Similarly, we have a scale-down rule for values below 60. Next, we have to use the metrics API to push values for the “testmetric” metric. For this, we need the custom metrics API URL and the authentication parameters, which can be found by running “cf env randomwords” or by opening the Application Autoscaler instance and selecting the “randomwords” application. The following should be available.
"custom_metrics": {
"username": "<username>",
"password": "<password>",
"url": "https://autoscaler-metrics.<url>"
}
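If the application itself needs these values at runtime, they can be read from the environment instead of being copied by hand. The sketch below assumes that the "custom_metrics" object sits inside the "credentials" of one of the bound services in VCAP_SERVICES; it searches all bindings rather than assuming a particular service label. The function name and the sample label "autoscaler" are illustrative only.

```python
import json


def find_custom_metrics_credentials(vcap_services_json):
    """Return the first "custom_metrics" credentials object found in VCAP_SERVICES, or None."""
    services = json.loads(vcap_services_json)
    # VCAP_SERVICES maps each service label to a list of bound instances.
    for instances in services.values():
        for instance in instances:
            custom_metrics = instance.get("credentials", {}).get("custom_metrics")
            if custom_metrics:
                return custom_metrics
    return None


# Stand-in for os.environ["VCAP_SERVICES"]; label and values are placeholders.
sample_vcap_services = json.dumps({
    "autoscaler": [
        {
            "credentials": {
                "custom_metrics": {
                    "username": "<username>",
                    "password": "<password>",
                    "url": "https://autoscaler-metrics.<url>",
                }
            }
        }
    ]
})

credentials = find_custom_metrics_credentials(sample_vcap_services)
print(credentials["url"])
```

Inside a running application, the argument would be the content of the VCAP_SERVICES environment variable.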
Now, make a POST call to the endpoint “<URL>/v1/apps/<application_guid>/metrics” with basic authentication, providing the username and password. The application GUID can be found in the VCAP_APPLICATION object when we execute “cf env randomwords”. The body of the POST call contains the value of the custom metric in the following format.
{
"instance_index":0,
"metrics":[
{
"name":"testmetric",
"value":10,
"unit":"%"
}
]
}
This will set the value of the custom metric to 10 percent. Now, if we increase the value of this metric to 60 or more then the application will scale up and vice versa. More info regarding the custom metric API can be found here.
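The POST call described above could be prepared from Python with the standard library alone. This is a hedged sketch, not an official client: `build_metric_request` is a hypothetical helper, and the host, GUID and credentials in the usage lines are placeholders. The function only builds the request so that it can be inspected before anything is sent.

```python
import base64
import json
import urllib.request


def build_metric_request(base_url, app_guid, username, password,
                         name, value, unit, instance_index=0):
    """Build (but do not send) the POST request that submits one custom metric value."""
    url = f"{base_url.rstrip('/')}/v1/apps/{app_guid}/metrics"
    body = {
        "instance_index": instance_index,
        "metrics": [{"name": name, "value": value, "unit": unit}],
    }
    # Basic authentication: base64 of "username:password".
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(
        url, data=json.dumps(body).encode("utf-8"), method="POST"
    )
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Content-Type", "application/json")
    return request


# Placeholder values; replace them with the ones shown by "cf env randomwords".
request = build_metric_request(
    "https://autoscaler-metrics.example.com", "guid-123",
    "user", "pass", "testmetric", 65, "%",
)
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen(request)` would send it. A value of 65 satisfies the ">= 60" rule, so repeated submissions at that level should lead to a scale-up once the breach duration has passed.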
This is how we can use the Application Autoscaler service to perform both dynamic and scheduled scaling. Autoscaling not only helps in managing workload but also supports efficient usage of resources, which is a big factor in a cloud environment. Manual scaling can be a pain, as the load has to be monitored manually as well. Autoscaling is therefore usually the better way to scale application instances.
References
- Application Autoscaler Official Documentation: https://help.sap.com/viewer/7472b7d13d5d4862b2b06a730a2df086/Cloud/en-US/4ad999a0be664160a08514ba4ce6430c.html
- Repository: https://github.wdf.sap.corp/I353397/Cloud-Foundry-Research/tree/master/RandomWords
- JMeter User Manual: https://jmeter.apache.org/usermanual/build-web-test-plan.html
awesome and very helpful blog post, thanks!
Dear Bijoyan,
thank you for this helpful post. Unfortunately the referenced repository is pointing to an SAP internal resource. Is there any external available repository? What I'm looking for is some information how I can include the autoscaler configuration of my app in the mta.yaml. Hope you have a pointer for me. In https://github.com/SAP-samples/cf-mta-examples I haven't found anything.
Best regards
Gregor
Hi Gregor,
Did you have any luck finding an example of how to add the autoscaler configuration on the mta.yaml?
Thank you!
No, would hope that Bijoyan Das could give us some guidance.
Hi Gregor,
in case you are still interested you can either directly specify the scaling policy in YAML format in the mta.yaml, instead of the last line in https://github.com/SAP-samples/cf-mta-examples/blob/e222cebb1a700a7dd753d9d33e1dd17c519e648e/create-managed-services/mta.yaml#L23 you would start with instance_min_count: 1 or similar
or
you can provide the JSON file containing the scaling policy with your MTA and reference it with a path: parameter instead of the config: line.
These MTA features are not specific to the Application Autoscaler and you will find them documented in the MTA documentation.
Best regards,
Silvestre
Hi Silvestre Zabala,
thank you for the tip. I've added the required configuration with the commit 7d47c6e0d8e7353d961b7399b06b312380aa2708 to my bookshop-demo project. The important step is to add the config to the service binding of the srv module in the mta.yaml.
One issue that I've discovered when using the Application Autoscaler Dashboard is that the link to the documentation in the "Useful Links" section is pointing to a non existing page and not to https://help.sap.com/viewer/product/Application_Autoscaler/Cloud/en-US. Maybe you can trigger a correction for this issue.
Best regards
Gregor
Hi Gregor Wolf,
great!
Thank you for reporting the broken link, the fix will be part of the next release which should be deployed end of August.
Best regards,
Silvestre
Hi Bijoyan,
Thanks for the useful blog!
I have implemented the Application Autoscaler in my application. Below is my scaling policy.
When I add load to the instance, it creates multiple instances (when memory > 200).
But what I observe is that if I don't use the instance at all (no call is made to the instance from anywhere), 1 instance gets deleted and, within the next 2 minutes, a new instance gets created again. So every 2 minutes an instance gets deleted and a new one gets created.
May I know why this is happening if the instance is not in use? If I restart the instance, the instance count is reset to 1 and no new instance is created until we call the instance and add load.