Azure AD authentication has been available for Cosmos DB management operations for some time now. So you could, for example, create databases and containers in a Cosmos DB account with the right Azure RBAC roles assigned to your identity. Accessing data, however, has still required the use of access keys or resource tokens. Until now.

Data plane RBAC authorization is now available in preview, which means we can get rid of the access keys and query the database using Azure AD authentication. That also allows us to use Managed Identities! By utilizing Managed Identities, we do not need to store or manage any keys.

This feature has a few limitations currently (this is a preview after all):

  • Only SQL API is supported
  • There is no Azure Portal support
  • Only .NET V3 SDK and Java V4 SDK are supported
  • Access keys cannot be completely disabled

Some of these restrictions may be lifted as the preview progresses.

Defining a role

At the moment at least, we must define custom roles for data authorization. There are no built-in roles. For the sake of an example, let's define a role that allows read/write access to any database within the Cosmos DB account.

For this we will use a Bicep template. The documentation has samples for Azure PowerShell and Az CLI.

First we need some variables:

var cosmosAccountName = 'zureaadtesting'
var readWriteRoleDefinitionId = guid(cosmosAccountName, 'ReadWriteRole')

The second variable is the unique id for the role. It will get the same id as long as the Cosmos DB account name is the same.

Then we also need the Cosmos DB account:

resource cosmosAccount 'Microsoft.DocumentDB/databaseAccounts@2019-12-12' = {
  name: cosmosAccountName
  location: resourceGroup().location
  kind: 'GlobalDocumentDB'
  properties: {
    consistencyPolicy: {
      defaultConsistencyLevel: 'Session'
    }
    locations: [
      {
        locationName: resourceGroup().location
        failoverPriority: 0
      }
    ]
    databaseAccountOfferType: 'Standard'
  }
}

And then we can define our role:

resource cosmosReadWriteRoleDefinition 'Microsoft.DocumentDB/databaseAccounts/sqlRoleDefinitions@2021-03-01-preview' = {
  name: '${cosmosAccount.name}/${readWriteRoleDefinitionId}'
  properties: {
    assignableScopes: [
      cosmosAccount.id
    ]
    permissions: [
      {
        dataActions: [
          'Microsoft.DocumentDB/databaseAccounts/readMetadata'
          'Microsoft.DocumentDB/databaseAccounts/sqlDatabases/containers/items/*'
          'Microsoft.DocumentDB/databaseAccounts/sqlDatabases/containers/*'
        ]
        notDataActions: []
      }
    ]
    roleName: 'Reader Writer'
    type: 'CustomRole'
  }
}

Note that Cosmos DB roles are defined with a model that is very similar to general Azure RBAC roles, but they are not defined in the Microsoft.Authorization resource provider. I'm not quite sure why they wanted to do this differently.

Let's analyze this role definition in a bit more detail. The name and resource type are defined like this:

  • Type: Microsoft.DocumentDB/databaseAccounts/sqlRoleDefinitions
  • Name: zureaadtesting/dd384270-41ad-4071-a387-d03b9779cae5

This results in the template using this resource id: Microsoft.DocumentDB/databaseAccounts/zureaadtesting/sqlRoleDefinitions/dd384270-41ad-4071-a387-d03b9779cae5. I've always found this way of splitting the id across the type and the name a bit awkward in ARM/Bicep templates.

Next we define the assignable scopes:

assignableScopes: [
    cosmosAccount.id
]

This requires fully-qualified ids for where this role can be applied. Since we want to be able to apply it at the database account level, we put in the account id. If you were to put this value in manually, it would be something like /subscriptions/subscription-id/resourceGroups/rg-name/providers/Microsoft.DocumentDB/databaseAccounts/account-name.

If you want to limit the scope of the role to specific databases/containers, that can be done as well. Slightly confusingly, the scope does not use the Bicep/ARM ids for databases/containers, but instead the database account id combined with Cosmos DB's own id scheme. If we wanted to limit the role such that it can only be assigned to a specific database:

assignableScopes: [
    '${cosmosAccount.id}/dbs/DatabaseNameHere'
]

Or a specific container in a database:

assignableScopes: [
    '${cosmosAccount.id}/dbs/DatabaseNameHere/colls/ContainerNameHere'
]

To allow assignment of the role in multiple scopes, you specify multiple assignable scopes.
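As a rough sketch of what that could look like, the fragment below allows the role to be assigned either to one whole database or to a single container in another database. The database and container names are placeholders, and cosmosAccount is the account resource declared earlier in the template.

```bicep
// Fragment of the role definition's properties; names are placeholders.
assignableScopes: [
  // The role can be assigned to this whole database...
  '${cosmosAccount.id}/dbs/DatabaseNameHere'
  // ...or to this one container in another database.
  '${cosmosAccount.id}/dbs/OtherDatabaseNameHere/colls/ContainerNameHere'
]
```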

Next are the permissions granted by the role:

permissions: [
    {
        dataActions: [
            'Microsoft.DocumentDB/databaseAccounts/readMetadata'
            'Microsoft.DocumentDB/databaseAccounts/sqlDatabases/containers/items/*'
            'Microsoft.DocumentDB/databaseAccounts/sqlDatabases/containers/*'
        ]
        notDataActions: []
    }
]

You can refer to the documentation for a full list of possible permissions. The first metadata read permission is needed by the Cosmos SDK. The next two permissions grant all available permissions at container and item level. If this was a more limited role, we could allow only reads etc.
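For comparison, a more limited read-only role might look something like the sketch below. The variable name, resource name and role name are made up for this example, and the template is assumed to contain the cosmosAccountName variable and cosmosAccount resource from earlier. Check the list of data actions against the documentation before relying on it.

```bicep
// Hypothetical read-only role; names are illustrative only.
var readOnlyRoleDefinitionId = guid(cosmosAccountName, 'ReadOnlyRole')

resource cosmosReadOnlyRoleDefinition 'Microsoft.DocumentDB/databaseAccounts/sqlRoleDefinitions@2021-03-01-preview' = {
  name: '${cosmosAccount.name}/${readOnlyRoleDefinitionId}'
  properties: {
    assignableScopes: [
      cosmosAccount.id
    ]
    permissions: [
      {
        dataActions: [
          // Needed by the SDK
          'Microsoft.DocumentDB/databaseAccounts/readMetadata'
          // Point reads, queries and change feed reads, but no writes
          'Microsoft.DocumentDB/databaseAccounts/sqlDatabases/containers/items/read'
          'Microsoft.DocumentDB/databaseAccounts/sqlDatabases/containers/executeQuery'
          'Microsoft.DocumentDB/databaseAccounts/sqlDatabases/containers/readChangeFeed'
        ]
        notDataActions: []
      }
    ]
    roleName: 'Reader'
    type: 'CustomRole'
  }
}
```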

The notDataActions property defines the permissions that should be removed from the dataActions set. So for example if we wanted to allow all operations except deleting items, we could define the permissions as:

permissions: [
    {
        dataActions: [
            'Microsoft.DocumentDB/databaseAccounts/readMetadata'
            'Microsoft.DocumentDB/databaseAccounts/sqlDatabases/containers/items/*'
            'Microsoft.DocumentDB/databaseAccounts/sqlDatabases/containers/*'
        ]
        notDataActions: [
            'Microsoft.DocumentDB/databaseAccounts/sqlDatabases/containers/items/delete'
        ]
    }
]

The roleName and type properties on the definition are pretty self-explanatory. The first defines the display name for the role and the second should be set to "CustomRole" since this is a custom role.

Assigning the role

Now that we have defined a read/write role, we need to assign it to users or apps for it to be useful. We can do this again through a Bicep template. The role assignment needs a GUID as well:

var readWriteRoleAppAssignmentId = guid(cosmosAccountName, 'ReadWriteRole', 'App')

The GUID will identify the assignment of the read/write role to this particular app. Now we can assign the role to a Managed Identity:

resource cosmosReadWriteAppAssignment 'Microsoft.DocumentDB/databaseAccounts/sqlRoleAssignments@2021-03-01-preview' = {
  name: '${cosmosAccount.name}/${readWriteRoleAppAssignmentId}'
  properties: {
    principalId: reference(app.id, '2016-08-01', 'Full').identity.principalId
    roleDefinitionId: cosmosReadWriteRoleDefinition.id
    scope: cosmosAccount.id
  }
}

Here the principalId needs to be the objectId of the user/group/application. Note that in the case of an application, you need to use the objectId of the Enterprise app/service principal. Because we deploy a Managed Identity as part of an App Service, we can reference it for the needed id.
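The app resource itself is not shown in this article, so here is a minimal sketch of how it might be declared with a system-assigned Managed Identity. The appName and appServicePlanId parameters are assumptions for this example; the identity block is the part that matters.

```bicep
// Assumed parameters, not part of the original template.
param appName string
param appServicePlanId string

resource app 'Microsoft.Web/sites@2016-08-01' = {
  name: appName
  location: resourceGroup().location
  // Enables the system-assigned Managed Identity for the app
  identity: {
    type: 'SystemAssigned'
  }
  properties: {
    serverFarmId: appServicePlanId
  }
}
```

When the app is declared in the same template like this, Bicep should also let you read the id as app.identity.principalId instead of going through reference().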

The roleDefinitionId requires that you use the fully-qualified id for it. The GUID is not enough.

Scope then defines where the permissions from this role will be given. Note we defined assignable scopes earlier for the role. If you specify on the role that it can only be assigned to a specific database, then any attempts to assign the role to some other database will fail. Since we defined that this role is assignable for anything in this database account, we can assign it at the account level, or limit to some specific databases/containers.
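As an illustration, assigning the same role to the app for only one database could look roughly like this. The variable and resource names are invented for the example, 'DatabaseNameHere' is a placeholder, and the template is assumed to contain the app, cosmosAccount and cosmosReadWriteRoleDefinition resources from above.

```bicep
// Hypothetical assignment id for the database-scoped assignment
var readWriteRoleAppDbAssignmentId = guid(cosmosAccountName, 'ReadWriteRole', 'App', 'DatabaseNameHere')

resource cosmosReadWriteAppDbAssignment 'Microsoft.DocumentDB/databaseAccounts/sqlRoleAssignments@2021-03-01-preview' = {
  name: '${cosmosAccount.name}/${readWriteRoleAppDbAssignmentId}'
  properties: {
    principalId: reference(app.id, '2016-08-01', 'Full').identity.principalId
    roleDefinitionId: cosmosReadWriteRoleDefinition.id
    // Same scheme as assignableScopes: account id + /dbs/<database>
    scope: '${cosmosAccount.id}/dbs/DatabaseNameHere'
  }
}
```

This works here because the account-level assignable scope on the role covers everything below the account.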

If you assign multiple roles, note that you cannot deploy those in parallel. Trying to do that gives an error:

There is another user operation in progress which requires an exclusive lock on zureaadtesting. Please retry after sometime.

This can be worked around by specifying a dependency to another role assignment such that they deploy in series one by one. For example:

resource cosmosReadWriteAppAssignment2 'Microsoft.DocumentDB/databaseAccounts/sqlRoleAssignments@2021-03-01-preview' = {
  name: '${cosmosAccount.name}/${readWriteRoleAppAssignmentId2}'
  properties: {
    principalId: reference(app2.id, '2016-08-01', 'Full').identity.principalId
    roleDefinitionId: cosmosReadWriteRoleDefinition.id
    scope: cosmosAccount.id
  }
  dependsOn: [
    cosmosReadWriteAppAssignment
  ]
}

The next role assignment can then depend on this one and so on. Bit of a hacky solution that I wish was not necessary.
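An alternative that might be tidier, assuming your Bicep version supports resource loops, is to declare the assignments in a loop and limit the batch size to one so they are deployed one at a time. I have not verified this against the preview API, so treat it as a sketch. The principalIds parameter is an assumption for this example.

```bicep
// Assumed parameter: object ids of the identities that should get the role.
param principalIds array

// batchSize(1) asks for the loop iterations to be deployed serially
@batchSize(1)
resource cosmosReadWriteAssignments 'Microsoft.DocumentDB/databaseAccounts/sqlRoleAssignments@2021-03-01-preview' = [for principalId in principalIds: {
  // One deterministic assignment id per principal
  name: '${cosmosAccount.name}/${guid(cosmosAccountName, 'ReadWriteRole', principalId)}'
  properties: {
    principalId: principalId
    roleDefinitionId: cosmosReadWriteRoleDefinition.id
    scope: cosmosAccount.id
  }
}]
```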

Usage through the Cosmos DB .NET SDK

Once the role assignments have been done, we can use the Azure.Identity package together with the Microsoft.Azure.Cosmos package to access the database. As an example, we can add them to an ASP.NET Core Razor Pages application:

<ItemGroup>
  <PackageReference Include="Azure.Identity" Version="1.4.0-beta.3" />
  <PackageReference Include="Microsoft.Azure.Cosmos" Version="3.17.0-preview" />
</ItemGroup>

To use the Cosmos DB client in our application, we will configure the credentials and add the client to the service collection as a singleton:

var localTenantId = Configuration["Auth:LocalTenantId"];
if (string.IsNullOrEmpty(localTenantId))
{
    localTenantId = null;
}
var credential = new DefaultAzureCredential(new DefaultAzureCredentialOptions
{
    SharedTokenCacheTenantId = localTenantId,
    VisualStudioCodeTenantId = localTenantId,
    VisualStudioTenantId = localTenantId
});

var cosmosClient = new CosmosClient(Configuration["Cosmos:AccountUri"], credential);
services.AddSingleton(cosmosClient);

The local tenant id setting is configured only in appsettings.Development.json here since it is not needed outside local development. The Cosmos DB account URI, database id and container id are specified in appsettings.json since they are the same in every environment in this example.

We are then able to use the client in our Razor Page to query for items and create them:

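using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;
using Microsoft.AspNetCore.Mvc.RazorPages;
using Microsoft.Azure.Cosmos;
using Microsoft.Extensions.Configuration;

public class IndexModel : PageModel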
{
    private readonly CosmosClient _cosmosClient;
    private readonly string _databaseId;
    private readonly string _containerId;

    public IndexModel(CosmosClient cosmosClient, IConfiguration configuration)
    {
        _cosmosClient = cosmosClient;
        _databaseId = configuration["Cosmos:DatabaseId"];
        _containerId = configuration["Cosmos:ContainerId"];
    }

    public List<Item> Items { get; set; }

    public async Task OnGetAsync()
    {
        // Example to list items
        Items = new List<Item>();

        var container = GetContainer();
        // The name after FROM is only an alias, so the container id is not needed in the query
        var iterator = container.GetItemQueryIterator<Item>("SELECT * FROM c");
        while (iterator.HasMoreResults)
        {
            var feedResponse = await iterator.ReadNextAsync();
            Items.AddRange(feedResponse);
        }
    }

    public async Task<IActionResult> OnPostAsync()
    {
        // Example to create an item
        var container = GetContainer();
        var item = new Item
        {
            Id = Guid.NewGuid().ToString(),
            PartitionKey = "a",
            Name = $"Created Item at {DateTime.UtcNow:t}"
        };
        await container.CreateItemAsync(item, new PartitionKey(item.PartitionKey));
        return RedirectToPage();
    }

    private Container GetContainer()
    {
        return _cosmosClient.GetContainer(_databaseId, _containerId);
    }
}

As you can see, the usage of the client is no different from when using an access key, since the SDK handles authentication behind the scenes. With the client configured in this way, it will use the user account set up in Visual Studio when running locally, and the system-assigned Managed Identity while running the app in Azure. If you use the Cosmos DB Emulator locally, you would need to construct the client object differently and use the emulator access key.

Summary and thoughts

Defining the roles and assigning them through Bicep templates is quite doable. The issue with deploying role assignments to multiple apps/users at once is unfortunate, and I hope it gets fixed so workarounds are not required. Usage is quite easy once the role is assigned, at least with the .NET SDK.

A really great part of this is that you can give an application or user a restricted set of permissions. An app with one of the access keys can do anything, which isn't usually the intention. Auditing also becomes easier as Cosmos DB will log which user/app did the operation.

Note that your code will not be able to create databases etc. after switching away from access keys. The data plane does not allow you to do those management operations, after all. It is possible to use Azure AD authentication for that separately as well, but I would recommend that you create the databases and containers through the Bicep/ARM template.
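A minimal sketch of creating a database and a container in the same template could look like this. The databaseName and containerName parameters and the '/partitionKey' path are assumptions for this example, so adjust them to match your data. The cosmosAccount resource is the one declared earlier.

```bicep
// Assumed parameters, not part of the original template.
param databaseName string
param containerName string

resource database 'Microsoft.DocumentDB/databaseAccounts/sqlDatabases@2019-12-12' = {
  name: '${cosmosAccount.name}/${databaseName}'
  properties: {
    resource: {
      id: databaseName
    }
    options: {}
  }
}

resource container 'Microsoft.DocumentDB/databaseAccounts/sqlDatabases/containers@2019-12-12' = {
  // database.name already includes the account name
  name: '${database.name}/${containerName}'
  properties: {
    resource: {
      id: containerName
      partitionKey: {
        // Assumed partition key path
        paths: [
          '/partitionKey'
        ]
        kind: 'Hash'
      }
    }
    options: {}
  }
}
```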

I would really like to see these roles surfaced in the Azure Portal to bring visibility to them. At the moment you need to use tools like Azure PowerShell to know what roles there are and who/what those roles are assigned to.

Overall, this is great progress for removing keys that you have to manage in Azure. Now if only Azure Search could do the same...