Durable Functions are durable because they keep their state in a data store, usually Azure Storage. Over time that data can pile up, especially if your orchestrations run often or perform many operations. Storing it costs money, although Table Storage is fairly cheap.

In this article I'll show you some approaches for removing orchestration instance history in Durable Functions.

Targeted cleanup

The most performant way to clean up instance history is to target a specific orchestration instance. If you store the instance ids in a database, this approach works really well. A single instance's history can be purged in two ways:

  1. Use the durable orchestration client object
  2. Use the admin API

The first is quite simple:

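[FunctionName(nameof(CleanupOrchestration))]
public async Task<IActionResult> CleanupOrchestration(
    [HttpTrigger(AuthorizationLevel.Function, "post")] HttpRequest req,
    [DurableClient] IDurableOrchestrationClient orchestrationClient)
{
    // StringValues converts implicitly to string; the result is null when "id" is missing.
    string instanceId = req.Query["id"];
    if (string.IsNullOrEmpty(instanceId))
    {
        return new BadRequestObjectResult("Query parameter 'id' is required.");
    }

    var purgeResult = await orchestrationClient.PurgeInstanceHistoryAsync(instanceId);
    // Report how many instances were actually purged (0 if nothing matched the id).
    return new OkObjectResult(new { purgeResult.InstancesDeleted });
}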

You pass the id of the instance you want to remove to PurgeInstanceHistoryAsync, and that's it. The result object tells you how many instances were deleted, so you can verify that something was actually removed.

The other method uses the "admin API" of Durable Functions. When you start an orchestration and use the template approach of returning a "check status response", the returned JSON includes a purge history URL (the purgeHistoryDeleteUri property). You can make an HTTP DELETE call to that URL to purge the instance's history. See the docs for more info: https://docs.microsoft.com/en-us/azure/azure-functions/durable/durable-functions-http-api#purge-single-instance-history.
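As a rough illustration of the admin API route, here is a minimal sketch that sends the DELETE request with HttpClient. The PurgeHistoryClient class and its method names are made up for this example, and the code assumes you have kept the purge URL from the check status response somewhere. It also assumes the API answers 404 when the instance does not exist, which is how the linked docs describe it.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper, not part of the Durable Functions SDK.
public static class PurgeHistoryClient
{
    // Builds the DELETE request for a purge URL taken from a check status response.
    public static HttpRequestMessage CreatePurgeRequest(string purgeHistoryDeleteUri)
        => new HttpRequestMessage(HttpMethod.Delete, new Uri(purgeHistoryDeleteUri));

    // Returns true if the history was purged, false if the instance was not found.
    // Any other non-success status code results in an HttpRequestException.
    public static async Task<bool> PurgeAsync(HttpClient httpClient, string purgeHistoryDeleteUri)
    {
        using var request = CreatePurgeRequest(purgeHistoryDeleteUri);
        using var response = await httpClient.SendAsync(request);

        if (response.StatusCode == HttpStatusCode.NotFound)
        {
            return false;
        }

        response.EnsureSuccessStatusCode();
        return true;
    }
}
```

The purge URL already contains the key needed to call the API, so treat it as a secret if you store it.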

A critical thing to note about these methods is that they do not purge sub-orchestrator history. You need to also store the sub-orchestrator instance ids if you want to clean them up.
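One way to know the sub-orchestrator ids up front is to assign them yourself when starting each sub-orchestration, using the CallSubOrchestratorAsync overload that takes an instance id. The sketch below is only an illustration: the orchestrator name, the "WaitForSignature" function name, the input type, and the id format are all assumptions, not the sample app's actual code.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

public static class SigningOrchestrator
{
    // Hypothetical id scheme: derive the sub-orchestrator id from the parent id
    // and the signer id, so it can be stored or recomputed at cleanup time.
    public static string SignerInstanceId(string parentInstanceId, int signerId)
        => $"{parentInstanceId}-signer-{signerId}";

    [FunctionName(nameof(RunSigningWorkflow))]
    public static async Task RunSigningWorkflow(
        [OrchestrationTrigger] IDurableOrchestrationContext context)
    {
        // Assumption: the orchestration input is the list of signer ids.
        var signerIds = context.GetInput<List<int>>();

        // Start one sub-orchestrator per signer with an explicit instance id.
        var tasks = signerIds
            .Select(signerId => context.CallSubOrchestratorAsync(
                "WaitForSignature", // hypothetical sub-orchestrator function name
                SignerInstanceId(context.InstanceId, signerId),
                signerId))
            .ToList();

        await Task.WhenAll(tasks);
    }
}
```

Because the ids are derived only from the parent instance id and the signer id, the scheme stays deterministic across orchestrator replays, and a cleanup function can pass the same ids to PurgeInstanceHistoryAsync.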

In the sample electronic signing app, the instance ids for the main orchestrator and all of the sub-orchestrators are stored in the database. This allows us to write a cleanup function like this:

[FunctionName(nameof(CleanupSigningWorkflows))]
public async Task CleanupSigningWorkflows(
    [TimerTrigger("0 0 0 * * *")] TimerInfo timerInfo,
    [DurableClient] IDurableOrchestrationClient orchestrationClient,
    ILogger log)
{
    var signingRequests = await _db.Requests
        .Include(r => r.Signers)
        .Where(r => r.WorkflowCompletedAt != null
            && EF.Functions.DateDiffDay(r.WorkflowCompletedAt, DateTimeOffset.UtcNow) > 365)
        .ToListAsync();
    foreach (var signingRequest in signingRequests)
    {
        foreach (var signer in signingRequest.Signers)
        {
            var signerPurgeResult = await orchestrationClient.PurgeInstanceHistoryAsync(signer.WaitForSignatureInstanceId);
            log.LogInformation(
                "Purged instance history for signer {SignerId}, {InstancesDeleted} instances deleted",
                signer.Id, signerPurgeResult.InstancesDeleted);
        }

        var requestPurgeResult = await orchestrationClient.PurgeInstanceHistoryAsync(signingRequest.Workflow.Id);
        log.LogInformation(
            "Purged instance history for signing request {RequestId}, {InstancesDeleted} instances deleted",
            signingRequest.Id, requestPurgeResult.InstancesDeleted);
    }
}

This function runs at midnight every day. It finds signing requests from the database that were completed over a year ago and removes the instance history for the main orchestrator as well as the sub-orchestrators for each signer.

Time range-based cleanup

Instead of targeting a specific orchestration instance, we can also target a time range. This is less performant, as it requires scanning data in Table Storage. The documentation mentions this for the API operation:

"This operation can be very expensive in terms of Azure Storage I/O if there are a lot of rows in the Instances and/or History tables."

It does, however, work on any instance, even if you haven't stored its id. Again you can use either:

  1. The durable orchestration client object
  2. The admin API

To purge instances in a time range, we can again use the orchestration client's PurgeInstanceHistoryAsync function. This time we use the overload that accepts two DateTimes and a collection of statuses:

[FunctionName(nameof(CleanupOldWorkflows))]
public async Task CleanupOldWorkflows(
    [TimerTrigger("0 0 0 * * *")] TimerInfo timerInfo,
    [DurableClient] IDurableOrchestrationClient orchestrationClient,
    ILogger log)
{
    var createdTimeFrom = DateTime.UtcNow.Subtract(TimeSpan.FromDays(365 + 30));
    var createdTimeTo = createdTimeFrom.AddDays(30);
    var runtimeStatus = new List<OrchestrationStatus>
    {
        OrchestrationStatus.Completed
    };
    var result = await orchestrationClient.PurgeInstanceHistoryAsync(createdTimeFrom, createdTimeTo, runtimeStatus);
    log.LogInformation("Scheduled cleanup done, {InstancesDeleted} instances deleted", result.InstancesDeleted);
}

Here we tell it to find all instances that were created between 365 and 395 days ago. Note that the filter applies to the creation time, not the completion time. In addition, we only delete instances that have completed. You can add other statuses to the list to also delete, for example, failed instances. This function runs on a timer trigger and executes daily at midnight.

From outside the function app, you can also use the admin API's "Purge multiple instance histories" operation. It accepts the same parameters as the orchestration client method: a created-time range and a list of runtime statuses.
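To make the admin API variant a bit more concrete, here is a hedged sketch that builds the request URL for it. The path and query parameter names (taskHub, connection, code, createdTimeFrom, createdTimeTo, runtimeStatus) follow the HTTP API documentation for Functions 2.x and later; the PurgeUrlBuilder class itself, and all of the values you pass in, are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical helper, not part of the Durable Functions SDK.
public static class PurgeUrlBuilder
{
    // Formats a timestamp as ISO 8601 in UTC, e.g. 2020-01-01T00:00:00Z.
    private static string Iso(DateTime value) =>
        value.ToUniversalTime().ToString("yyyy-MM-dd'T'HH:mm:ss'Z'", CultureInfo.InvariantCulture);

    // Builds the URL for the "purge multiple instance histories" operation.
    public static Uri BuildPurgeUri(
        Uri functionAppBaseUri,
        string taskHub,
        string connectionName,
        string systemKey,
        DateTime createdTimeFrom,
        DateTime createdTimeTo,
        IEnumerable<string> runtimeStatuses)
    {
        // Statuses are sent as a comma-delimited list, e.g. Completed,Failed.
        var statuses = string.Join(",", runtimeStatuses.Select(s => Uri.EscapeDataString(s)));

        var query = string.Join("&",
            "taskHub=" + Uri.EscapeDataString(taskHub),
            "connection=" + Uri.EscapeDataString(connectionName),
            "code=" + Uri.EscapeDataString(systemKey),
            "createdTimeFrom=" + Uri.EscapeDataString(Iso(createdTimeFrom)),
            "createdTimeTo=" + Uri.EscapeDataString(Iso(createdTimeTo)),
            "runtimeStatus=" + statuses);

        return new Uri(functionAppBaseUri, "/runtime/webhooks/durabletask/instances?" + query);
    }
}
```

You would then send an HTTP DELETE request to the resulting URL, for example with HttpClient.DeleteAsync. The same warning about Azure Storage I/O applies here as with the orchestration client.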

Summary

There are multiple ways to clean up instance history in Durable Functions. Personally, I think a scheduled function that looks up the instances to purge from your own database will work best most of the time, as you know exactly what you are deleting. It also performs better than the time range-based lookup. I recommend storing orchestration instance ids in your database to enable cleanup and queries later.

Links