You may have used Durable Functions to implement various kinds of workflows or long-running tasks, but have you considered what they do under the hood? I was curious about those details and thought I'd share notes from going through the Durable Functions and DurableTask repositories. To be clear, we will be focusing on the Azure Storage implementation of DurableTask, which is used by default with Durable Functions.
The Functions host will discover available functions during startup. If we generate a Durable Function in Visual Studio, we get an orchestrator function like this:
public static async Task<List<string>> RunOrchestrator(
    [OrchestrationTrigger] IDurableOrchestrationContext context)
Durable Functions provides a "binding provider" for the
There are also similar classes for activity and entity triggers.
The Functions host will call TryCreateAsync on this class to create a binding object.
In this case a
This binding has a CreateListenerAsync method that gets called as well and returns
Tomasz Pęczek has a pretty nice article on how Functions extensions work if you are interested.
Starting the listener
- Create things needed by durability provider (in this case Storage blobs, tables and queues)
- Start listening for orchestration/activity messages that will trigger Function executions
Here is a (hopefully helpful) sequence diagram of what happens:
The only thing StartAsync itself does is call
This method uses a boolean flag combined with an AsyncLock to ensure it only runs once.
CreateIfNotExistsAsync is called on the default durability provider and the task hub worker is started.
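The run-once behavior described above can be sketched roughly as follows. This is a minimal illustration and not the actual Durable Functions code: the class and member names are hypothetical, and `SemaphoreSlim` stands in for the `AsyncLock` that Durable Functions uses.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical class for illustration; not part of Durable Functions.
public sealed class RunOnceStarter
{
    // SemaphoreSlim(1, 1) is an assumption standing in for AsyncLock.
    private readonly SemaphoreSlim startLock = new SemaphoreSlim(1, 1);
    private readonly Func<Task> startWork;
    private bool isStarted;

    public RunOnceStarter(Func<Task> startWork)
    {
        this.startWork = startWork;
    }

    // Returns true if this call ran the start work, false if it had already run.
    public async Task<bool> StartIfNeededAsync()
    {
        await this.startLock.WaitAsync();
        try
        {
            if (this.isStarted)
            {
                return false;
            }

            await this.startWork();

            // The flag is only set after the work succeeds, so a failed start can be retried.
            this.isStarted = true;
            return true;
        }
        finally
        {
            this.startLock.Release();
        }
    }
}
```

The lock makes sure concurrent callers wait for the first start to finish, and the flag makes every later call a no-op.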
The durability provider class calls into
This one service is the core of the DurableTask implementation on Azure Storage. It has around 2000 lines of code though, so maybe it could use some refactoring!
There are also implementations for other providers in the DurableTask repository: Service Fabric, Redis, Service Bus, and SQL Server. There is also an in-process "Emulator" implementation for testing purposes.
CreateIfNotExistsAsync just calls
to create the "task hub" if it does not already exist.
Ensuring the task hub exists
method is designed in such a way that it only runs the initialization once, though it can be reset if the provider runs into connection issues, etc.
You can find the implementation in
The "task hub" refers to the blob containers, tables and queues needed to run orchestrations. It will have the default name "TestHubName" when running locally. I usually change the name for each project through host.json so that each project uses its own queues and doesn't get mixed up with other projects.
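As a rough example of changing the task hub name, a host.json for Durable Functions 2.x could look like this. The name "MyProjectHub" is a made-up placeholder, so pick whatever suits your project.

```json
{
  "version": "2.0",
  "extensions": {
    "durableTask": {
      "hubName": "MyProjectHub"
    }
  }
}
```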
When EnsureTaskHubAsync first runs, it will create (replace "taskhub" with the name of your task hub):
- Blob container taskhub-applease
- Blob taskhub-appleaseinfo in above container
- Blob container taskhub-leases
- Tables taskhubHistory and taskhubInstances
- Work item queue taskhub-workitems
- Control queues (e.g. taskhub-control-partitionnum), one for each partition
- Blobs for each control queue in taskhub-leases container
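To make the naming pattern above a bit more concrete, here is a small sketch that prints the names for a given task hub and partition count. The helper is hypothetical and not part of DurableTask. The two-digit partition suffix is my assumption about how "partitionnum" is formatted, and the real implementation may normalize names (Azure Storage requires lowercase container and queue names). Lease blobs are left out because their names are not listed above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; DurableTask does not expose this.
public static class TaskHubResourceNames
{
    public static IReadOnlyList<string> List(string taskHub, int partitionCount)
    {
        var names = new List<string>
        {
            $"blob container: {taskHub}-applease",
            $"blob: {taskHub}-appleaseinfo",
            $"blob container: {taskHub}-leases",
            $"table: {taskHub}History",
            $"table: {taskHub}Instances",
            $"queue: {taskHub}-workitems",
        };

        // One control queue per partition; the two-digit suffix is an assumption.
        for (int i = 0; i < partitionCount; i++)
        {
            names.Add($"queue: {taskHub}-control-{i:D2}");
        }

        return names;
    }
}
```

Calling `TaskHubResourceNames.List("taskhub", 4)` returns the six fixed resources plus four control queues.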
The app lease blob is utilized by the
class to "ensure a single app's partition manager is started at a time".
The lease blobs in the leases container are used by DurableTask to figure out which instance currently has control of which partition. So if you have, for example, two instances in Azure Functions and there are four partitions, each instance could have control of two partitions to split the work.
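The two-instances, four-partitions example can be pictured with a simple round-robin split. To be clear, this is only an illustration of the end result. It is not how DurableTask balances partitions (that happens through the blob leases), and all names here are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Illustrative only; DurableTask uses blob leases, not a central assignment like this.
public static class PartitionSplitSketch
{
    public static Dictionary<string, List<int>> Assign(IReadOnlyList<string> instances, int partitionCount)
    {
        if (instances.Count == 0)
        {
            throw new ArgumentException("At least one instance is required.", nameof(instances));
        }

        var result = new Dictionary<string, List<int>>();
        foreach (string instance in instances)
        {
            result[instance] = new List<int>();
        }

        // Hand out partitions in turn so the counts differ by at most one.
        for (int partition = 0; partition < partitionCount; partition++)
        {
            result[instances[partition % instances.Count]].Add(partition);
        }

        return result;
    }
}
```

With two instances and four partitions, each instance ends up with two partitions.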
We will discuss leases and how DurableTask deals with partition ownership in a later part in more detail, so please stay tuned.
The "tracking store" is what DurableTask uses to store information about the orchestrations and their state. In the Azure Storage implementation this is implemented with the history and instances tables. The status of each orchestration is stored in the instances table. The history table is where all events are stored; it is used by Durable Functions for orchestrator replay.
Control queues and the work item queue are discussed in a lot of detail in my previous article on how Durable Functions scale. They are used by the Azure Storage implementation to trigger orchestrator and activity functions. Each of the control queues gets two lease blobs to control which instance is reading messages from the queue.
Starting the task hub worker
Now that the durability provider has finished creating the things it needs to run orchestrations, we can move on to
This creates a
These classes seem to be responsible for getting new messages and triggering execution for them.
AzureStorageOrchestrationService.CreateIfNotExistsAsync gets called again, though this time it won't do anything since the things in Azure Storage were already created earlier.
only disables Nagle's algorithm for the history and instances table URIs. There is an older blog article mentioned in the comments of DurableTask that says disabling Nagle's algorithm can greatly improve throughput for table inserts and updates in Azure Storage.
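For reference, disabling Nagle's algorithm for a specific endpoint in .NET can look roughly like this. It is a sketch and not a copy of the DurableTask code: the helper name and the example URI are placeholders of mine. Also note that `ServicePointManager` is marked obsolete on newer .NET versions (hence the pragma), and as far as I know it does not affect `HttpClient` there.

```csharp
using System;
using System.Net;

// Hypothetical helper for illustration.
public static class NagleSettings
{
    public static ServicePoint DisableNagle(Uri endpoint)
    {
#pragma warning disable SYSLIB0014 // ServicePointManager is obsolete on newer .NET versions.
        ServicePoint servicePoint = ServicePointManager.FindServicePoint(endpoint);
#pragma warning restore SYSLIB0014

        // Send small writes immediately instead of waiting to batch them.
        servicePoint.UseNagleAlgorithm = false;
        return servicePoint;
    }
}
```

You would call it with the table endpoint, for example `NagleSettings.DisableNagle(new Uri("https://myaccount.table.core.windows.net/"))`, where "myaccount" is a made-up storage account name.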
The last thing that the orchestration service does is call
We will discuss leases in more detail in a future part, but essentially this will start attempting to acquire the app lease. If it is acquired, the partition manager is started.
The resources required by Durable Functions/DurableTask now exist and the listener has been started. When a message is received in a control queue, an orchestration will be started. But let us discuss that in more detail the next time in part 2.
When a Functions app with Durable Functions starts up, two operations are done at a high level:
- Create required things in Azure Storage (blobs, tables, queues)
- Start listening for messages
There was quite a bit of code to go through in the two repositories, but the flow here is quite straightforward to follow.
Next time we will look at what happens when an orchestration is started:
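public static async Task<HttpResponseMessage> HttpStart(
    [HttpTrigger(AuthorizationLevel.Anonymous, "get", "post")] HttpRequestMessage req,
    [DurableClient] IDurableClient starter)
{
    // Start a new instance of the "Function1" orchestrator with no input.
    string instanceId = await starter.StartNewAsync("Function1", null);
    return starter.CreateCheckStatusResponse(req, instanceId);
}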