You may have used or heard of .NET Fiddle. It's an app that can run C# / F# / VB .NET code and return the console output. I was wondering how they sandbox the code in their back-end so that a malicious user can't just upload a program that downloads malware and executes it on their server. So I decided to try building something similar, using Durable Functions to orchestrate the process and Container Instances (ACI) to run the code. For now, it only supports C# on .NET 6.

You can find the full source code on GitHub.

Code sandbox at a high level

The process we want to create looks something like this:

  1. HTTP POST request arrives in an HTTP triggered function (C# file in body)
  2. The function uploads the file to Azure Storage, generates a SAS token
  3. Durable Function orchestration is started
  4. Container Instance started with the file SAS URL as input
  5. Container script runs, file downloaded as Program.cs next to a csproj file already in the container
  6. Container runs dotnet build and dotnet run
  7. Orchestrator monitors the progress and waits for the container to terminate
  8. Logs from the container are downloaded after the container has terminated or the process has taken too long
  9. The Container Instance is deleted and the C# file in Storage is deleted
  10. The logs are set as the orchestrator result

While the orchestrator is running, the app that started the process can check for its status and wait for it to finish. Once the process is finished, the app can get the output logs from the orchestrator result.
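The flow above can be sketched roughly as follows. This is an illustrative Python sketch, not the real implementation (which is a C# Durable Functions orchestrator); every injected function here (upload, start_container, and so on) is a hypothetical stand-in for the pieces described in the steps.

```python
def run_sandbox(upload, start_container, get_status, get_logs, cleanup,
                max_checks=20):
    """One sandbox run: upload the code, execute it in a container,
    collect the logs, and always clean up afterwards."""
    sas_url = upload()                  # steps 1-3: store the C# file, get a SAS URL
    start_container(sas_url)            # steps 4-6: start the Container Instance
    try:
        for _ in range(max_checks):     # step 7: monitor until termination
            if get_status() in ("Succeeded", "Failed"):
                break
        return get_logs()               # steps 8 and 10: logs become the result
    finally:
        cleanup()                       # step 9: delete the container and the blob
```

The `try/finally` mirrors an important property of the orchestrator: cleanup happens whether the run succeeds, fails, or times out.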

Since the user-provided code is executed in a Container Instance, it cannot affect other users. The instance is erased after each execution, making permanent modifications to the server impossible. There is also a timeout, preventing infinitely looping programs from taking resources.

Container image

To run code in Container Instances, we'll need a container image. We define one using a Dockerfile:

FROM mcr.microsoft.com/dotnet/sdk:6.0-alpine

WORKDIR /home
COPY ConsoleApp1.csproj .
COPY run.sh .

The container has the .NET 6.0 SDK ready to go. We add a csproj file so that we can use dotnet run:

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net6.0</TargetFramework>
  </PropertyGroup>
</Project>

The run.sh script mentioned in the Dockerfile is responsible for downloading the user-provided code and executing it:

#!/bin/sh
wget "$1" -O ./Program.cs

dotnet build

dotnet run

We will use this script as the command to run when the container is started, and give it a URL with a SAS token as an argument.

You can refer to the Azure Container Registry docs for guidance on how to build and publish an image.

Azure resources

For the sandbox we only need a few resources:

  • Container registry (to hold the container image we use to run code)
  • Container Instances (to run the code, created when needed by orchestrator)
  • Storage account (we upload the code to run here)
  • Function App (runs the orchestrator)

You can also use Azurite/Azure Storage Emulator and run the orchestrator locally, so only the container registry is strictly required. Of course, you could use a different registry as well, as long as Container Instances supports it. I used a Basic SKU registry for testing. Container Instances will be created by the orchestrator as needed.

Starting the orchestrator

We can first look at the HTTP triggered function that starts the orchestration:

[FunctionName(HttpStarterName)]
public async Task<HttpResponseMessage> HttpStart(
    [HttpTrigger(AuthorizationLevel.Function, "post")] HttpRequestMessage req,
    [DurableClient] IDurableOrchestrationClient starter,
    ILogger log)
{
    await using var stream = await req.Content.ReadAsStreamAsync();
    var (containerName, blobName, sasUrl) = await _blobStorageClient.Upload(stream);

    string instanceId = await starter.StartNewAsync(OrchestratorName, new OrchestratorInput
    {
        SasUrl = sasUrl,
        ContainerName = containerName,
        BlobName = blobName
    });

    log.LogInformation($"Started orchestration with ID = '{instanceId}'.");
    return starter.CreateCheckStatusResponse(req, instanceId);
}

It uploads the file from the request body to Storage and then starts the orchestration. The blob upload is done by BlobStorageClient:

public async Task<(string containerName, string blobName, string sasUrl)> Upload(
    Stream stream)
{
    var blobName = $"{Guid.NewGuid()}.cs";
    var container = _blobServiceClient.GetBlobContainerClient(_containerName);
    var blob = container.GetBlobClient(blobName);
    await blob.UploadAsync(stream);

    var sasUrl = blob.GenerateSasUri(BlobSasPermissions.Read, DateTimeOffset.UtcNow.AddMinutes(30));

    return (_containerName, blobName, sasUrl.ToString());
}

The SAS token is valid for the next 30 minutes; the container will use that to download the file.

Starting the container

You can see the orchestrator code in the GitHub repo. The first thing it does is start the container using the blob URL containing a SAS token. The container name is auto-generated, as is the container group name. Container Instances get hypervisor isolation per container group. Here I've implemented the necessary HTTP request using anonymous types:

public async Task StartContainer(
    string containerGroupName,
    string containerName,
    string sasUrl)
{
    // E.g. https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroup}/providers/Microsoft.ContainerInstance/containerGroups/{containerGroupName}?api-version=2019-12-01
    var url = GetUrl(containerGroupName);
    var json = JsonConvert.SerializeObject(new
    {
        location = _location,
        properties = new
        {
            containers = new[]
            {
                new
                {
                    name = containerName,
                    properties = new
                    {
                        image = _imageName,
                        resources = new
                        {
                            requests = new
                            {
                                cpu = 1,
                                memoryInGB = 1.5
                            }
                        },
                        environmentVariables = new[]
                        {
                            new
                            {
                                name = "BLOBURI",
                                secureValue = sasUrl
                            }
                        },
                        command = new[]
                        {
                            "/bin/sh", "-c", "/home/run.sh $BLOBURI"
                        }
                    }
                }
            },
            restartPolicy = "Never",
            osType = "Linux",
            imageRegistryCredentials = new[]
            {
                new
                {
                    server = _imageRegistryServer,
                    username = _imageRegistryUsername,
                    password = _imageRegistryPassword
                }
            }
        }
    });
    var request = new HttpRequestMessage(HttpMethod.Put, url)
    {
        Content = new StringContent(json, Encoding.UTF8, "application/json")
    };
    await AuthenticateRequest(request);
    var response = await _httpClient.SendAsync(request);

    if (response.StatusCode != HttpStatusCode.Created)
    {
        throw new Exception("Something went wrong creating container instance: " + (int)response.StatusCode);
    }
}

This part is especially important:

command = new[]
{
    "/bin/sh", "-c", "/home/run.sh $BLOBURI"
}

This is what tells the container what to do on startup: it runs the shell script contained in the image with the blob URL as a parameter. I set up the blob URL in an environment variable called BLOBURI and use it here.
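The `sh -c` wrapper matters here: the command array itself is executed without a shell, so `$BLOBURI` would not be expanded in it. Wrapping the command in `/bin/sh -c` makes the shell substitute the environment variable at startup. The same behavior can be demonstrated locally (the URL below is a made-up stand-in for the real SAS URL):

```python
import subprocess

# Run the same command shape ACI uses: a shell that expands $BLOBURI.
result = subprocess.run(
    ["/bin/sh", "-c", "echo $BLOBURI"],
    env={"BLOBURI": "https://example.blob.core.windows.net/code/x.cs?sig=abc"},
    capture_output=True, text=True)
print(result.stdout.strip())  # the expanded URL
```

Passing the URL as a `secureValue` environment variable also keeps the SAS token out of the container group's properties when they are read back.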

Waiting for completion

To know when the container has completed execution, we need to check its status on an interval. I used a function like this to get the current status:

public async Task<ContainerStatus> GetContainerStatus(
    string containerGroupName)
{
    // E.g. https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroup}/providers/Microsoft.ContainerInstance/containerGroups/{containerGroupName}?api-version=2019-12-01
    var url = GetUrl(containerGroupName);
    var request = new HttpRequestMessage(HttpMethod.Get, url);
    await AuthenticateRequest(request);

    var response = await _httpClient.SendAsync(request);
    var responseContent = await response.Content.ReadAsStringAsync();

    if (response.StatusCode != HttpStatusCode.OK)
    {
        throw new Exception($"Unexpected response from status: {(int)response.StatusCode} {responseContent}");
    }

    var containerGroup = (JObject)JsonConvert.DeserializeObject(responseContent);
    var properties = (JObject)containerGroup.GetValue("properties");
    var instanceView = (JObject)properties.GetValue("instanceView");
    var state = instanceView.GetValue("state")?.Value<string>() ?? "Pending";

    return state switch
    {
        "Pending" => ContainerStatus.Running,
        "Running" => ContainerStatus.Running,
        "Succeeded" => ContainerStatus.Succeeded,
        _ => ContainerStatus.Failed
    };
}

When the container is starting, its status will be Pending. The status changes to Running while the container executes the code, and once the dotnet run command has finished, it changes to Succeeded. I made the orchestrator check the status a maximum of 20 times, with 5 seconds between checks. This gives the container roughly 95 seconds to start up and run the code. If the container fails to reach the Succeeded status in that time, we give up, get the logs, and delete the container.
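The timing of that monitor loop can be sketched like this. This is a hypothetical Python sketch; the real orchestrator uses durable timers rather than sleeping, but the arithmetic is the same: 20 checks with 19 five-second waits between them, about 95 seconds in total.

```python
import time

def wait_for_container(get_status, max_checks=20, interval=5, sleep=time.sleep):
    """Poll the container status up to max_checks times, interval seconds
    apart. Returns the final status, or 'Timeout' if it never finished."""
    for attempt in range(max_checks):
        status = get_status()
        if status in ("Succeeded", "Failed"):
            return status
        if attempt < max_checks - 1:
            sleep(interval)  # 19 waits of 5 s ~= 95 s before giving up
    return "Timeout"
```

Injecting the `sleep` function keeps the sketch testable; in the orchestrator the equivalent is `context.CreateTimer`.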

As it turned out, starting the container actually takes a while. It takes around a minute to start and run the code.

Getting the logs

We want to get the container logs so we can show the user what the output was. We can do that with another HTTP request:

public async Task<string> GetContainerLogs(
    string containerGroupName,
    string containerName)
{
    // E.g. https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroup}/providers/Microsoft.ContainerInstance/containerGroups/{containerGroupName}/containers/{containerName}/logs?api-version=2019-12-01
    var url = GetUrl(containerGroupName, containerName);
    var request = new HttpRequestMessage(HttpMethod.Get, url);
    await AuthenticateRequest(request);

    var response = await _httpClient.SendAsync(request);
    var responseContent = await response.Content.ReadAsStringAsync();
    var logsResults = (JObject)JsonConvert.DeserializeObject(responseContent);
    return logsResults.GetValue("content").Value<string>();
}

Here are some example logs from a run:

Connecting to redacted.blob.core.windows.net
saving to './Program.cs'
Program.cs    100% |********************************|  181  0:00:00 ETA
'./Program.cs' saved
Microsoft (R) Build Engine version 16.9.0-preview-21103-02+198f3f262 for .NET
Copyright (C) Microsoft Corporation. All rights reserved.
  Determining projects to restore...
  Restored /home/ConsoleApp1.csproj (in 133 ms).
You are using a preview version of .NET. See: https://aka.ms/dotnet-core-preview
  ConsoleApp1 -> /home/bin/Debug/net6.0/ConsoleApp1.dll
Build succeeded.
    0 Warning(s)
    0 Error(s)
Time Elapsed 00:00:04.35
Hello World!

At the end you can see the actual output from the Hello World app.

Summary

The sandbox I made isn't quite as fast as .NET Fiddle! It takes around a minute to run a Hello World application, and the main issue is the start-up time of Container Instances. But I feel it achieves its purpose of completely isolating untrusted code. It's also not a very expensive solution, as the orchestrator function and Container Instances only run when needed.

If you have any questions or comments, feel free to leave a comment below!

Links