Since the new Azure Cosmos DB SDK for .NET is available, I thought I'd look into it and see what's changed. You can check the repository on GitHub; it has a bunch of samples to help you get started.

Here I go through most of the basic operations that you'd normally do when interacting with Cosmos DB.

Creating databases and collections

First we will need a database and a couple collections in our Cosmos DB. Here is how we create them with the new SDK:

var client = new CosmosClient(Endpoint, Authkey);
await client.Databases.CreateDatabaseIfNotExistsAsync("DatabaseId", 400);
await client.Databases["DatabaseId"].Containers.CreateContainerIfNotExistsAsync("ContainerA", "/partitionKey");
await client.Databases["DatabaseId"].Containers.CreateContainerIfNotExistsAsync("ContainerB", "/partitionKey");

DocumentClient has been replaced by CosmosClient. It can take a connection string, but here I used the endpoint URL and a key.

Creating databases and collections is very streamlined now in my opinion. Here I created a database with shared throughput and two collections in it. You can also create collections with dedicated throughput. I used CreateDatabaseIfNotExistsAsync and CreateContainerIfNotExistsAsync so that the database and containers are created if they don't exist already. Normally you'd run code like this on app startup.

To get metadata on an existing database/collection we can do this:

await client.Databases["DatabaseId"].ReadAsync();
await client.Databases["DatabaseId"].Containers["ContainerA"].ReadAsync();

So initializing collections is pretty easy. But querying is the most important part.

Even though I had a chance to try the v3 SDK before, I could not initially find the query function on the client object. Is it on the CosmosClient? How about under the container reference gotten through the indexer? None that I could see.

So I took a look at the samples in the repo: https://github.com/Azure/azure-cosmos-dotnet-v3. It turns out we need a CosmosContainer object. This is the right way:

CosmosContainer container = client.Databases["DatabaseId"].Containers["ContainerA"];
// Run queries on container

Creating entities

Then I tried to create an entity in a collection:

var item = new TestEntity
{
    Name = "Test",
    PartitionKey = "a"
};
CosmosItemResponse<TestEntity> res = await container.Items.CreateItemAsync(item.PartitionKey, item);

It seems reasonable, but I ran into an error.

CosmosException: Response status code does not indicate success: 400 Substatus: 0 Reason: (Message: {"Errors":["The input name 'null' is invalid. Ensure to provide a unique non-empty string less than '255' characters."]}

Umm, okay? That error message is trying to say that I forgot to define an id for the document. My TestEntity class has an Id property, but it was null. I had left it out on purpose, because I recalled the previous SDK having an id generation feature. The new one apparently does not generate ids for us, so we have to set one ourselves. This works:

var item = new TestEntity
{
    Id = Guid.NewGuid().ToString(),
    Name = "Test",
    PartitionKey = "a"
};
CosmosItemResponse<TestEntity> res = await container.Items.CreateItemAsync(item.PartitionKey, item);
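I never showed the TestEntity class itself, so here is a minimal sketch of what it could look like. It assumes the SDK serializes with Json.NET (Newtonsoft.Json) and that the JSON property names are id, name and partitionKey, which matches the /partitionKey path and the c.name filter used in this post. The TestEntityFactory helper is purely hypothetical; it is just one way to make sure an id is always set.

```csharp
using System;
using Newtonsoft.Json;

// Hypothetical entity class; the property names are assumptions for illustration.
public class TestEntity
{
    // Cosmos DB documents need a lowercase "id" property.
    [JsonProperty("id")]
    public string Id { get; set; }

    [JsonProperty("name")]
    public string Name { get; set; }

    // Should match the partition key path used when creating the container ("/partitionKey").
    [JsonProperty("partitionKey")]
    public string PartitionKey { get; set; }
}

// Hypothetical helper, not part of the SDK.
public static class TestEntityFactory
{
    // Assigns an id up front, since creating an item with a null id failed with a 400 above.
    public static TestEntity Create(string name, string partitionKey)
    {
        return new TestEntity
        {
            Id = Guid.NewGuid().ToString(),
            Name = name,
            PartitionKey = partitionKey
        };
    }
}
```

With something like this, TestEntityFactory.Create("Test", "a") gives you an entity that can be passed straight to CreateItemAsync.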

Note also that the partition key is very first class in this API. If your collection does not have partitions, you can leave it as null. Currently there is no overload without it. Oh and also, you don't need to specify a URI for the collection/entity in every query. This is quite nice.

This response type also has an implicit type conversion so you can do this:

CosmosItemResponse<TestEntity> res = await container.Items.CreateItemAsync(item.PartitionKey, item);
TestEntity createdEntity = res;
// OR
TestEntity createdEntity = await container.Items.CreateItemAsync(item.PartitionKey, item);

So you can inspect the RU usage on CosmosItemResponse<T> and then easily convert it to the entity type.
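If you have not seen implicit conversions on a generic wrapper before, here is a rough sketch of how such a thing can be built in C#. ItemResponseSketch<T> and its members are made up for this example; this is not the SDK's actual source, just the language feature in isolation.

```csharp
using System.Globalization;

// Hypothetical stand-in for a response wrapper; not the SDK's real type.
public class ItemResponseSketch<T>
{
    public ItemResponseSketch(T resource, double requestCharge)
    {
        Resource = resource;
        RequestCharge = requestCharge;
    }

    // The deserialized entity.
    public T Resource { get; }

    // Request units consumed by the operation.
    public double RequestCharge { get; }

    // Lets callers assign the response directly to a variable of type T.
    public static implicit operator T(ItemResponseSketch<T> response)
    {
        return response.Resource;
    }
}

public static class ImplicitConversionDemo
{
    public static string Run()
    {
        var response = new ItemResponseSketch<string>("Test", 5.5);
        double charge = response.RequestCharge; // look at the metadata first
        string entity = response;               // the implicit operator unwraps the entity
        return entity + ":" + charge.ToString(CultureInfo.InvariantCulture);
    }
}
```

The trade-off is that assigning the awaited result straight to the entity type throws away the metadata, so keep the response object around if you care about RU usage.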

Reading an entity with its id

Okay so we created an entity, how can we read it back?

CosmosItemResponse<TestEntity> res = await container.Items.ReadItemAsync<TestEntity>(createdEntity.PartitionKey, createdEntity.Id);

This was pretty natural after knowing how entities (or items) are created. Did you notice they are called "items" now? I also really like having the partition key here as a parameter. With the previous SDK you always had to remember to include RequestOptions or something and specify the PartitionKey property on that. Here it is very hard to forget.

Reading many items from container

Okay, the big one. Querying for many items. I took a look at what kind of functions the Items property on the container has. Here are the query functions that are available:

  • CreateItemQuery
  • CreateItemQueryAsStream
  • GetItemIterator
  • GetItemStreamIterator

Why so many? Okay, I assume they do different things.

CreateItemQuery has this in the docs:

This method creates a query for items under a container in an Azure Cosmos database using a SQL statement with parameterized values. It returns a CosmosResultSetIterator.

Okay so you give it an SQL query and some parameters and it gives you an iterator. Let's try that.

var iterator = container.Items.CreateItemQuery<TestEntity>("SELECT * FROM ContainerA", "a");
while (iterator.HasMoreResults)
{
    CosmosQueryResponse<TestEntity> results = await iterator.FetchNextSetAsync();
    foreach (TestEntity result in results)
    {
        // Handling code here
    }
}

That actually wasn't too hard. It is very straightforward. One thing I am disappointed about though, the RU usage was not available on the response object! We are measuring these things very actively and like to know if we start using up too much in some operation.

So how about CreateItemQueryAsStream?

This method creates a query for items under a container in an Azure Cosmos database using a SQL statement with parameterized values. It returns a CosmosResultSetStreamIterator.

So the iterator type is different. Let's try that:

var streamIterator = container.Items.CreateItemQueryAsStream("SELECT * FROM ContainerA", "a");
while (streamIterator.HasMoreResults)
{
    var results = await streamIterator.FetchNextSetAsync();
    Stream stream = results.Content;
    using (var reader = new StreamReader(stream))
    {
        string data = await reader.ReadToEndAsync();
        // {"_rid":"f-puAPcVqDA=","Documents":[{"id":"9a839e5b-...
    }
}

So that's what those stream functions are! They allow you to read large entities in blocks and use memory efficiently. This is something super useful for apps with very large entities! Of course you'll need a JSON parser that supports streams.
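Reading the whole stream into a string like I did above defeats the purpose a bit, so here is a sketch of reading the documents one at a time with Json.NET's JsonTextReader. It assumes the payload has the shape seen in the comment above, with the items in a "Documents" array. StreamedDocuments is a name I made up for this example.

```csharp
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;

// Hypothetical helper, not part of the SDK.
public static class StreamedDocuments
{
    // Lazily yields one document at a time from a payload shaped like
    // {"_rid":"...","Documents":[{...},{...}]} without reading the whole body into a string.
    public static IEnumerable<T> Read<T>(Stream stream)
    {
        var serializer = new JsonSerializer();
        using (var streamReader = new StreamReader(stream))
        using (var jsonReader = new JsonTextReader(streamReader))
        {
            while (jsonReader.Read())
            {
                // Skip tokens until we reach the "Documents" property.
                if (jsonReader.TokenType != JsonToken.PropertyName ||
                    (string)jsonReader.Value != "Documents")
                {
                    continue;
                }

                jsonReader.Read(); // move onto the start of the array
                while (jsonReader.Read() && jsonReader.TokenType != JsonToken.EndArray)
                {
                    // Deserialize consumes exactly one object from the reader.
                    yield return serializer.Deserialize<T>(jsonReader);
                }
                yield break;
            }
        }
    }
}
```

You would call it as StreamedDocuments.Read<TestEntity>(results.Content) inside the loop. Note that it disposes the stream once the enumeration finishes.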

Let's try GetItemIterator.

var itemIterator = container.Items.GetItemIterator<TestEntity>();
while (itemIterator.HasMoreResults)
{
    CosmosQueryResponse<TestEntity> results = await itemIterator.FetchNextSetAsync();
    foreach (TestEntity result in results)
    {
        // Result can be from any partition
        // Handling code here
    }
}

It does a cross-partition query! Frankly I was a bit surprised by that. I had added some items with different partition keys so I could test cross-partition queries later, but I already got them back here.

You can probably guess what GetItemStreamIterator does then. It's the same thing but gives you a Stream instead of the entity objects.

var itemStreamIterator = container.Items.GetItemStreamIterator();
while (itemStreamIterator.HasMoreResults)
{
    var results = await itemStreamIterator.FetchNextSetAsync();
    Stream stream = results.Content;
    using (var reader = new StreamReader(stream))
    {
        string data = await reader.ReadToEndAsync();
        // Contains entities from all partitions
        // {"_rid":"f-puAPcVqDA=","Documents":[{"id":"9a839e5b-...
    }
}

Cross-partition query with Streams, cool :)

As a recap:

  • CreateItemQuery: Query a partition with SQL
  • CreateItemQueryAsStream: Query a partition with SQL, get Streams
  • GetItemIterator: Iterate all entities across partitions
  • GetItemStreamIterator: Iterate all entities across partitions as Streams

A couple questions might arise:

  • Where is IQueryable/LINQ?
  • How can I make a cross-partition SQL query?

The first I can answer. An IQueryable API does not exist at the moment in the new SDK. I assume they will be forced to implement a query provider due to demand. I already feel sorry for the team, having had to parse Expressions before :/

A cross-partition SQL query can be done with a different overload of CreateItemQuery:

var crossIterator = container.Items.CreateItemQuery<TestEntity>("SELECT * FROM ContainerA", maxConcurrency: 2);
while (crossIterator.HasMoreResults)
{
    CosmosQueryResponse<TestEntity> results = await crossIterator.FetchNextSetAsync();
    foreach (TestEntity result in results)
    {
        // Result can be from any partition
        // Handling code here
    }
}

You do have to set the maximum degree of concurrency, or as it is defined for FeedOptions.MaxDegreeOfParallelism in the docs:

Gets or sets the number of concurrent operations run client side during parallel query execution in the Azure Cosmos DB service. A positive property value limits the number of concurrent operations to the set value. If it is set to less than 0, the system automatically decides the number of concurrent operations to run.

Parameterized queries

How about if we want to add a filter to the query? Well, we can add parameters to queries somehow right? After checking the CosmosSqlQueryDefinition class, I noticed it has a UseParameter function. This is what I then came up with:

var query = new CosmosSqlQueryDefinition("SELECT * FROM ContainerA c WHERE c.name = @name")
    .UseParameter("@name", "Test 2");
var paramIterator = container.Items.CreateItemQuery<TestEntity>(query, "a");
while (paramIterator.HasMoreResults)
{
    CosmosQueryResponse<TestEntity> results = await paramIterator.FetchNextSetAsync();
    foreach (TestEntity result in results)
    {
        // Handling code here
    }
}

It's reasonably easy to use in my opinion.

Updating entities

Okay so we created and queried entities, how about updating them? As in the older version, we either replace or upsert entities. Replace replaces the entity with what you send, and upsert does the same except that it creates the entity if it does not exist. For more details on what upsert is, you can refer to this older blog post: https://azure.microsoft.com/en-us/blog/documentdb-adds-upsert/.

TestEntity item = await container.Items.ReadItemAsync<TestEntity>(partitionKey, id);
item.Name = "Test 3";
var replaceRes = await container.Items.ReplaceItemAsync(item.PartitionKey, item.Id, item);
TestEntity updatedEntity = replaceRes;

The main difference between this and UpsertItemAsync is that the upsert does not require you to provide the id as a separate argument.
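If the semantic difference between the two is not obvious, this tiny in-memory sketch shows it. It has nothing to do with the SDK; InMemoryContainerSketch is a made-up class that only mimics the behavior with a dictionary.

```csharp
using System.Collections.Generic;

// In-memory stand-in for a container, only to illustrate the semantics.
public class InMemoryContainerSketch
{
    private readonly Dictionary<string, string> _items = new Dictionary<string, string>();

    // Replace: fails when nothing with that id exists yet.
    public void Replace(string id, string value)
    {
        if (!_items.ContainsKey(id))
        {
            throw new KeyNotFoundException("No item with id '" + id + "' to replace.");
        }
        _items[id] = value;
    }

    // Upsert: overwrites an existing item or creates a new one.
    public void Upsert(string id, string value)
    {
        _items[id] = value;
    }

    public bool TryRead(string id, out string value)
    {
        return _items.TryGetValue(id, out value);
    }
}
```

So Replace is the safer choice when the item is supposed to exist already, and Upsert is handy when you do not care either way.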

Deleting entities

Finally, let's delete an entity.

var deleteRes = await container.Items.DeleteItemAsync<TestEntity>(partitionKey, id);
// deletedEntity is null
TestEntity deletedEntity = deleteRes;

This API is a bit weird. It requires you to provide an entity type as the generic type parameter, but the response object does not contain anything. You can't get the entity back anyway, so what's the point of specifying the entity type?

But other than that, specify partition key + id, and the entity is deleted. Again, if your collection is not partitioned, you can specify the partition key as null.

Summary

Overall, the new Cosmos DB SDK is easy to use and fairly straightforward. The Stream APIs are a great addition for apps with large entities. Personally I don't mind the lack of an IQueryable API, but I know people are going to demand it. Depending on your use case, having those APIs can be a great thing of course, since building filters dynamically also kind of sucks. My problem with IQueryable APIs in general is that you can never really know if the query will work before running it. Some query providers might even silently move some filters client-side, without any errors or log entries, which makes for very poor performance.

Feel free to leave comments if you want to discuss further, spot errors in my examples, or have some questions still.

Links