Managing and throttling serverless scaling with Azure Functions

Azure Functions’ “serverless” promise of abstracting away the underlying infrastructure can be very compelling, particularly as it comes with automated scaling and no idle capacity.

However, functions do not scale in isolation: you need to ensure that all your supporting services, particularly your databases, can handle the same level of scale. These services generally have throughput limits that a runaway scale-out of Azure Functions can quickly exceed.

When you use a consumption-based pricing plan with Azure Functions, the runtime adds new host instances in response to the workload, e.g. the rate of HTTP requests or number of messages in a queue. You don't have any control over when these hosts are created, though in theory you shouldn't really care. After all, you're being charged for the number of function executions rather than host creation.

This can start to get painful if you are thinking in terms of thousands of transactions per second. For example, you can scale Azure CosmosDB up to 50k request units per second without contacting technical support. This translates to around 5,000 inserts per second for a JSON document with a dozen properties across all your partitions, with a limit of 1,000 inserts for any single partition. A Functions application that has scaled out to more than a hundred concurrent hosts can burn through this very easily, bringing your application down with “Request Rate Too Large” errors.
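
When you do hit the limit, Cosmos DB responds with HTTP 429 and suggests a retry interval. A back-off along the following lines can smooth out short spikes, though it won't save you from sustained over-capacity. This is a minimal sketch assuming the DocumentDB .NET SDK; the InsertWithRetryAsync wrapper is purely illustrative:

using System;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

public static class ThrottledInserts
{
    // Illustrative wrapper: retry an insert whenever Cosmos DB responds
    // with HTTP 429 ("Request Rate Too Large"), waiting for the interval
    // suggested by the service before trying again.
    public static async Task InsertWithRetryAsync(DocumentClient client, Uri collectionUri, object document)
    {
        while (true)
        {
            try
            {
                await client.CreateDocumentAsync(collectionUri, document);
                return;
            }
            catch (DocumentClientException ex) when ((int?)ex.StatusCode == 429)
            {
                // The exception carries the back-off interval suggested by the service
                await Task.Delay(ex.RetryAfter);
            }
        }
    }
}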

Buffering with queues

You would expect to be able to buffer a large workload by splitting it into tasks that sit on a queue, using either Azure Queues or Azure Service Bus. The hosts should then be able to work through the tasks gradually at a sustainable pace, pulling them off the queue when they are ready.
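
In code this is just a queue-triggered function. A minimal sketch, assuming a v1 Service Bus trigger (the queue name and connection setting are placeholders):

using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Host;

public static class TaskProcessor
{
    [FunctionName("ProcessTask")]
    public static void Run(
        [ServiceBusTrigger("tasks", Connection = "ServiceBusConnection")] string message,
        TraceWriter log)
    {
        // Process a single buffered task pulled off the queue
        log.Info($"Processing task: {message}");
    }
}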

The problem with this is that the Function runtime’s scale controller will spin up new host instances in response to the size of a queue. If your functions are taking a while to burn through a large queue then the runtime will continue to spin up new hosts. This gradually increases the rate of processing and undermines any buffering provided by the queue.

This behaviour is difficult to control effectively. For Service Bus you can use the host.json configuration to control the number of concurrent messages processed by each host (the maxConcurrentCalls setting). However, this has little effect on a consumption plan as the runtime will continue to spin up new hosts in response to the size of the queue.
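
For reference, the setting lives in host.json; this fragment would cap each host at sixteen concurrent Service Bus messages (the value itself is arbitrary):

{
  "serviceBus": {
    "maxConcurrentCalls": 16
  }
}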

Under some circumstances you can spread out a workload by setting the time at which individual messages become available to be picked up. This is done via the ScheduleMessageAsync method on the Service Bus client. It can help to stagger message processing, though it takes a little trial and error to get right and is less effective for long-running functions.
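
A rough sketch of this staggering, assuming the Microsoft.Azure.ServiceBus client (the one-second spacing is an arbitrary choice):

using System;
using System.Text;
using System.Threading.Tasks;
using Microsoft.Azure.ServiceBus;

public static class MessageStaggering
{
    // Enqueue a batch of messages so each one becomes visible to
    // consumers a second later than the previous one.
    public static async Task StaggerMessagesAsync(QueueClient client, string[] payloads)
    {
        var visibleAt = DateTimeOffset.UtcNow;
        foreach (var payload in payloads)
        {
            visibleAt = visibleAt.AddSeconds(1);
            var message = new Message(Encoding.UTF8.GetBytes(payload));
            await client.ScheduleMessageAsync(message, visibleAt);
        }
    }
}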

Using an app service plan with automated scale out

The only reliable way to cap the addition of new hosts is to use an app service plan. This gives you direct control over the maximum number of hosts, along with the rules for spinning them up and down (e.g. based on CPU usage or queue size).
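
As an illustration, scale-out rules for the plan can be scripted through the Azure CLI along these lines (the resource names are placeholders, and the exact metric names vary):

# Cap the app service plan at four instances...
az monitor autoscale create \
    --resource-group my-rg \
    --resource my-plan \
    --resource-type Microsoft.Web/serverfarms \
    --name functions-autoscale \
    --min-count 1 --max-count 4 --count 1

# ...and add an instance when average CPU runs hot
az monitor autoscale rule create \
    --resource-group my-rg \
    --autoscale-name functions-autoscale \
    --condition "CpuPercentage > 70 avg 5m" \
    --scale out 1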

The catch here is that you're paying for VMs rather than function executions, so you still incur costs when the application is idle. This does rather undermine the point of using Azure Functions in the first place, as you are back to provisioning and scaling VM instances.

Using host settings

Azure Functions does not directly support the concurrent execution controls available in AWS Lambda, which let you explicitly set a limit on concurrent executions for each individual function and so protect downstream services from a surge in processing.
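
For comparison, in Lambda this is a single CLI call (the function name is a placeholder):

# Reserve (and thereby cap) concurrency for one Lambda function
aws lambda put-function-concurrency \
    --function-name my-function \
    --reserved-concurrent-executions 10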

The only control you have over host creation in Azure Functions is an obscure application setting: WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT. This implies that you can control the number of hosts that are created, though Microsoft warn that “it's not completely foolproof” and “is not fully supported”.

They're not kidding. From my own experience it only throttles host creation effectively if you set the value to something pretty low, i.e. less than 50. At larger values its impact is pretty limited. It's been implied that this feature will be worked on in the future, but the corresponding issue has been open on GitHub with no update since July 2017.
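
If you want to experiment with it anyway, it is applied like any other application setting, e.g. via the Azure CLI (the app and group names are placeholders):

az functionapp config appsettings set \
    --name my-function-app \
    --resource-group my-rg \
    --settings WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT=10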

Using durable functions

A relatively new addition to Azure Functions is the Durable Functions extension, which allows you to build long-running workflows in a serverless environment. It introduces orchestrator functions which are long-running singletons that can call other functions and remember state. This is all handled via a system of control queues and table storage maintained by the runtime host.

Critically, you can run these functions for longer than the ten-minute limit placed by the consumption plan runtime. This allows orchestrator functions to implement a “fan-out” pattern for larger workloads, where a single orchestrator function can call any number of stateless worker functions and collate the results.

A simple example is shown below. The orchestrator function calls a batch of ten functions called “WorkTask”, waits for all the asynchronous tasks to complete and compiles the results at the end:

[FunctionName("FanOut")]
public static async Task RunAsync([OrchestrationTrigger] DurableOrchestrationContext context, TraceWriter log)
{
    // Kick off a set of tasks to run in parallel
    var tasks = new List<Task<string>>();
 
    for (int n = 0; n < 10; n++)
    {
        Task<string> task = context.CallActivityAsync<string>("WorkTask", n);
        tasks.Add(task);
    }
 
    // Wait for all the tasks to complete
    await Task.WhenAll(tasks);
 
    // Collate the results
    IEnumerable<string> results = tasks.Select(t => t.Result);
 
    // Do something with the results, e.g. send to queue
    log.Info(string.Join(", ", results));
}

“WorkTask” is tagged as an activity function – these are stateless functions that are scaled out across multiple host instances:

[FunctionName("WorkTask")]
public static async Task<string> Run([ActivityTrigger] DurableActivityContext context, TraceWriter log)
{
    // Get the input value sent by the orchestrator
    var id = context.GetInput<string>();
 
    // Do some work that takes a while
    Thread.Sleep(TimeSpan.FromSeconds(1));
 
    // Return a result
    return await Task.FromResult<string>($"{id}: {Guid.NewGuid().ToString()}");
}

Recognise that you have the wrong solution…?

Durable functions are very promising, but are very much a work in progress. The NuGet package is only available as a pre-release beta so is beset by the usual problems of clashing dependencies, patchy documentation and a slightly flaky implementation. It's also down to you to write the careful thread management code that would be required to run a robust long-running process.

The problem here is that serverless platforms like Azure Functions are not necessarily designed for managing controlled, large-scale throughput. For large batch processing jobs you are probably better off with a solution that allows you to burn down a workload at a steady pace. If you still want the benefit of only paying for what you use then a container-based solution such as Azure Container Instances might be a better choice as this gives you full control over how processing nodes are created, scaled and destroyed.