Slide 1

Slide 1 text

Design and implementation of Cosmos DB Change Feed-centric architecture
Kazuyuki Miyake, Tatsuro Shibamura

Slide 2

Slide 2 text

Agenda
1. Change Feed-centric architecture Design & Strategy
   Kazuyuki Miyake (Microsoft MVP for Azure) @kazuyukimiyake
2. Change Feed-centric architecture Deep Dive
   Tatsuro Shibamura (Microsoft MVP for Azure) @shibayan

Slide 3

Slide 3 text

1. Cosmos DB Change Feed-centric architecture Design & Strategy

Slide 4

Slide 4 text

Massive data processing: needs and challenges
- Balance massive data writes with complex queries
- No performance degradation under massive write loads
- Handle different types of queries
- Pay-as-you-go cost model

Slide 5

Slide 5 text

Limitations of traditional architectures
Trying to handle everything in one big datastore...
- Write-optimized datastores are weak at complex queries
- Query-optimized datastores are weak under massively concurrent writes
-> As a result, teams end up relying on over-provisioned datastores

Slide 6

Slide 6 text

CQRS + Materialized Views
1. Separate writes and reads to absorb the differences between them
2. Deploy query-optimized materialized views
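To make the write/read split concrete, here is a minimal in-memory sketch. It is not the presenters' implementation: the `TodoItem` shape, the `Category` key, and the `MaterializedViewSketch` class are hypothetical names chosen for illustration, and a real deployment would write the view to a separate container rather than a dictionary.

```csharp
using System.Collections.Generic;

// Hypothetical item shape; the slides do not define one.
public sealed record TodoItem(string Id, string UserId, string Category, bool Done);

public static class MaterializedViewSketch
{
    // Write side: items are stored under the key they are written with (assumed: UserId).
    // Read side: the same items re-keyed by Category, so "all items in a category"
    // becomes a single-key lookup instead of a scan over every user.
    public static void Project(
        IEnumerable<TodoItem> changes,
        IDictionary<string, Dictionary<string, TodoItem>> viewByCategory)
    {
        foreach (var item in changes)
        {
            if (!viewByCategory.TryGetValue(item.Category, out var bucket))
            {
                bucket = new Dictionary<string, TodoItem>();
                viewByCategory[item.Category] = bucket;
            }

            // Upsert by id, so replaying the same change leaves the view unchanged.
            // An item whose Category changes would also need removing from its
            // old bucket; that step is omitted here.
            bucket[item.Id] = item;
        }
    }
}
```

The write path never touches `viewByCategory`; only the projection does, which is what lets each side be tuned independently.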

Slide 7

Slide 7 text

Cosmos DB Change Feed + Azure Functions
- No need to implement the CQRS mechanisms yourself
- Synchronized in near real time

Slide 8

Slide 8 text

Scalable Change Feed-centric architecture
- Various services can be added, each starting from the Change Feed

Slide 9

Slide 9 text

Change Feed-centric case study - JFE Engineering

Slide 10

Slide 10 text

2. Cosmos DB Change Feed-centric architecture Deep Dive

Slide 11

Slide 11 text

Two Change Feed usage patterns
1. Push model -> Data Transformation, Stream Processing
2. Pull model -> Batch Processing

Slide 12

Slide 12 text

Data Transformation, Stream Processing
- Used to process streaming data with low latency
- The best fit is CosmosDBTrigger in Azure Functions
- Suited to stores that accept fast writes, such as SQL Database and Redis Cache
- Also used when writing back to Cosmos DB (creating a materialized view)

Slide 13

Slide 13 text

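Sample code - Push model

using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;
// Alias avoids a PartitionKey name clash between the two SDK namespaces
using Document = Microsoft.Azure.Documents.Document;

public class Function1
{
    public Function1(CosmosClient cosmosClient)
    {
        _container = cosmosClient.GetContainer("SampleDB", "MaterializedView");
    }

    private readonly Container _container;

    [FunctionName("Function1")]
    public async Task Run([CosmosDBTrigger(
        databaseName: "SampleDB",
        collectionName: "TodoItems",
        LeaseCollectionName = "leases")] IReadOnlyList<Document> input, ILogger log)
    {
        var tasks = new Task[input.Count];

        for (int i = 0; i < input.Count; i++)
        {
            // Change the partition key and write the item back (in practice, do a more advanced conversion).
            // Assumes "anotherKey" holds a string value.
            var partitionKey = new PartitionKey(input[i].GetPropertyValue<string>("anotherKey"));

            tasks[i] = _container.UpsertItemStreamAsync(new MemoryStream(input[i].ToByteArray()), partitionKey);
        }

        // Wait for all upserts in this batch to finish
        await Task.WhenAll(tasks);
    }
}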

Slide 14

Slide 14 text

Batch Processing
- Use when you need to process a large amount of data at one time
- A practical implementation is TimerTrigger in Azure Functions
- Used for archiving to Blob Storage / Data Lake Storage Gen 2
- Storage GPv2 and Data Lake Storage Gen 2 are billed per write transaction, so writing streaming data item by item increases costs
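The cost point above comes down to turning many small writes into one larger write. As a rough sketch, a batch of changes can be serialized into a single newline-delimited JSON payload and uploaded once per timer run. `ArchiveBatchSketch` and `ToJsonLines` are hypothetical names, and the upload itself is left out.

```csharp
using System.Collections.Generic;
using System.Text;
using System.Text.Json;

public static class ArchiveBatchSketch
{
    // Serializes every item in the batch as one JSON document per line,
    // so the whole batch can be written to storage in a single transaction.
    public static string ToJsonLines<T>(IEnumerable<T> items)
    {
        var builder = new StringBuilder();

        foreach (var item in items)
        {
            builder.Append(JsonSerializer.Serialize(item)).Append('\n');
        }

        return builder.ToString();
    }
}
```

With this shape, a run that reads 1,000 changes produces one payload and therefore one write, where item-by-item archiving would produce 1,000.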

Slide 15

Slide 15 text

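Sample code - Pull model

using System.Net;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public class Function2
{
    public Function2(CosmosClient cosmosClient)
    {
        _container = cosmosClient.GetContainer("SampleDB", "TodoItems");
    }

    private readonly Container _container;

    [FunctionName("Function2")]
    public async Task Run([TimerTrigger("0 */5 * * * *")] TimerInfo myTimer, ILogger log)
    {
        // LoadContinuationTokenAsync / SaveContinuationTokenAsync are application-defined helpers
        var continuationToken = await LoadContinuationTokenAsync();

        // Resume from the saved position, or start from now on the first run
        var changeFeedStartFrom = continuationToken != null
            ? ChangeFeedStartFrom.ContinuationToken(continuationToken)
            : ChangeFeedStartFrom.Now();

        // TodoItem is an assumed document type
        var changeFeedIterator = _container.GetChangeFeedIterator<TodoItem>(changeFeedStartFrom, ChangeFeedMode.Incremental);

        while (changeFeedIterator.HasMoreResults)
        {
            try
            {
                var items = await changeFeedIterator.ReadNextAsync();

                // Newer SDK versions report "no more changes" as a 304 response instead of throwing
                if (items.StatusCode == HttpStatusCode.NotModified)
                {
                    continuationToken = items.ContinuationToken;
                    break;
                }

                // TODO: Implementation

                // Advance the token only after the page has been processed
                continuationToken = items.ContinuationToken;
            }
            catch (CosmosException ex) when (ex.StatusCode == HttpStatusCode.NotModified)
            {
                // No more changes: remember where the next run should resume
                continuationToken = ex.Headers.ContinuationToken;
                break;
            }
        }

        await SaveContinuationTokenAsync(continuationToken);
    }
}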
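The pull-model sample calls `LoadContinuationTokenAsync` and `SaveContinuationTokenAsync` without showing them. One possible shape for these helpers is sketched below. It is file-based purely so that it runs locally; the `ContinuationTokenStore` class and the file location are assumptions, and a deployed function would need durable storage (for example a blob) because local disk does not survive restarts or scale-out.

```csharp
using System.IO;
using System.Threading.Tasks;

public static class ContinuationTokenStore
{
    // Hypothetical location, suitable only for local experiments.
    private static readonly string TokenPath =
        Path.Combine(Path.GetTempPath(), "changefeed-continuation.txt");

    // Returns the saved token, or null when nothing has been saved yet.
    public static async Task<string> LoadContinuationTokenAsync()
    {
        if (!File.Exists(TokenPath))
        {
            return null;
        }

        var token = await File.ReadAllTextAsync(TokenPath);
        return string.IsNullOrWhiteSpace(token) ? null : token;
    }

    // Writes to a temporary file first and then swaps it in, so an interrupted
    // save does not leave a half-written token behind.
    public static async Task SaveContinuationTokenAsync(string token)
    {
        if (token == null)
        {
            return; // nothing to persist; keep the previous token
        }

        var tempPath = TokenPath + ".tmp";
        await File.WriteAllTextAsync(tempPath, token);
        File.Move(tempPath, TokenPath, overwrite: true);
    }
}
```

Saving the token only after a run completes means a failed run is re-read from the previous position, which matches the at-least-once processing discussed later in the deck.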

Slide 16

Slide 16 text

For a reliable Change Feed-centric architecture
- Improving resiliency
- Idempotency and eventual consistency
- Avoid inconsistent states

Slide 17

Slide 17 text

Improving resiliency - Retry policy
- By default, CosmosDBTrigger advances to the next batch of changes even when an execution error occurs.
- Without a retry policy, the changes in a failed execution are lost because they are never processed again.
- Use FixedDelayRetry or ExponentialBackoffRetry with an unlimited (-1) maximum number of retries.
- The Change Feed then does not advance until the execution succeeds, so no data is lost.

Slide 18

Slide 18 text

Sample code - Retry policy

using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.WebJobs;

public class Function1
{
    // Retry indefinitely at 10-second intervals
    [FixedDelayRetry(-1, "00:00:10")]
    [FunctionName("Function1")]
    public async Task Run([CosmosDBTrigger(
        databaseName: "SampleDB",
        collectionName: "TodoItems",
        LeaseCollectionName = "leases")] IReadOnlyList<Document> input)
    {
        // TODO: Implementation
    }
}

Slide 19

Slide 19 text

Focus on idempotency and eventual consistency
- Code for idempotency whenever possible
  - For storage that supports overwrite or delete (Cosmos DB / SQL Database / etc.)
- When idempotency is difficult to ensure, focus on eventual consistency
  - Focus on "at least once" processing
  - For append-only storage (Blob Storage / Data Lake Storage Gen 2)
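The two cases can be illustrated with a small in-memory sketch. Everything here is an assumption for illustration: the `(Id, Version, Payload)` tuple stands in for a changed item, `Version` stands in for any monotonically increasing marker on the item, and de-duplicating at read time is one common way to live with at-least-once delivery on append-only storage, not necessarily the presenters' approach.

```csharp
using System.Collections.Generic;

public static class IdempotencySketch
{
    // Overwritable store: upsert keyed by id, keeping the highest version.
    // Replaying the same batch leaves the store unchanged (idempotent).
    public static void Upsert(
        IEnumerable<(string Id, long Version, string Payload)> batch,
        IDictionary<string, (long Version, string Payload)> store)
    {
        foreach (var (id, version, payload) in batch)
        {
            if (!store.TryGetValue(id, out var current) || version >= current.Version)
            {
                store[id] = (version, payload);
            }
        }
    }

    // Append-only store: a replayed batch is simply appended again,
    // so the log may contain duplicates (at-least-once).
    public static void Append(
        IEnumerable<(string Id, long Version, string Payload)> batch,
        List<(string Id, long Version, string Payload)> log)
    {
        log.AddRange(batch);
    }

    // Readers collapse the log to the latest version per id, which makes
    // the duplicates harmless and the result eventually consistent.
    public static Dictionary<string, (long Version, string Payload)> ReadLatest(
        IEnumerable<(string Id, long Version, string Payload)> log)
    {
        var latest = new Dictionary<string, (long Version, string Payload)>();
        Upsert(log, latest);
        return latest;
    }
}
```

Either way, a retried batch ends in the same visible state as a batch that succeeded the first time, which is what makes unlimited retries safe.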

Slide 20

Slide 20 text

Avoid inconsistent states - Graceful shutdown
- Azure Functions hosts are restarted when a new version is deployed or the platform is updated
- If the host restarts while a function is executing, state may be left inconsistent
- Implement graceful shutdown to avoid inconsistent states
- Combine it with a retry policy to increase resiliency

Slide 21

Slide 21 text

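Sample code - Graceful shutdown

using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.WebJobs;

public class Function1
{
    // Retry indefinitely at 10-second intervals
    [FixedDelayRetry(-1, "00:00:10")]
    [FunctionName("Function1")]
    public async Task Run([CosmosDBTrigger(
        databaseName: "SampleDB",
        collectionName: "TodoItems",
        LeaseCollectionName = "leases")] IReadOnlyList<Document> input, CancellationToken cancellationToken)
    {
        try
        {
            // Pass the cancellation token to every awaited call
            await Task.Delay(TimeSpan.FromSeconds(5), cancellationToken);
        }
        catch (OperationCanceledException)
        {
            // TODO: Implement rollback

            // Rethrow so the execution counts as failed and the retry policy applies
            throw;
        }
    }
}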
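Inside a real handler, the same idea usually means checking the token between items so that each item is either fully processed or not started. A minimal sketch under that assumption follows; `GracefulShutdownSketch`, `ProcessAsync`, and the `handleItem` delegate are hypothetical names, and strings stand in for change feed documents.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class GracefulShutdownSketch
{
    // Processes items one at a time and stops at an item boundary once
    // cancellation has been requested. The resulting OperationCanceledException
    // propagates to the caller, so a retry policy can replay the batch.
    public static async Task ProcessAsync(
        IReadOnlyList<string> batch,
        Func<string, CancellationToken, Task> handleItem,
        CancellationToken cancellationToken)
    {
        foreach (var item in batch)
        {
            // Check before starting an item, never in the middle of one
            cancellationToken.ThrowIfCancellationRequested();

            await handleItem(item, cancellationToken);
        }
    }
}
```

Because a cancelled batch is replayed from the start, this pairs naturally with idempotent writes: items that completed before the shutdown are simply written again with the same result.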

Slide 22

Slide 22 text

References
- Azure Cosmos DB trigger for Functions 2.x and higher | Microsoft Docs
- Azure/azure-cosmos-dotnet-v3: .NET SDK for Azure Cosmos DB for the core SQL API
- Change feed pull model | Microsoft Docs
- Azure Functions error handling and retry guidance | Microsoft Docs
- Cancellation tokens - Develop C# class library functions using Azure Functions | Microsoft Docs