Slide 1

Slide 1 text

Message processing failed… But what’s the root cause?
Laila Bougria @noctovis @lailabougria

Slide 3

Slide 3 text

Once upon a time..
• Retail business
• Decouple the system
• Distributed system
• Team Lorem, Ipsum and Dolor
• Messaging

Slide 4

Slide 4 text

Place order process
1. Store order
2. Charge credit card
3. Package order
4. Ship the order
5. Bill order
6. Adjust stock
7. Order more stock?
8. Verify customer status
Teams: Team Lorem, Team Ipsum, Team Dolor

Slide 5

Slide 5 text

Messages
• Sales: PlaceOrder, OrderPlaced
• Finance: ChargeOrder, OrderPaid
• Billing: OrderBilled
• Inventory: UpdateStock, StockUpdated
• Shipping: OrderPackaged, OrderShipped
• Marketing: VerifyCustomerStatus, CreateDiscountCode

Slide 6

Slide 6 text

Interactions
Components: Sales, Finance, Billing, Marketing, Inventory, Shipping
Messages: PlaceOrder, OrderPlaced, OrderCharged, BillOrder, OrderBilled, VerifyCustomerStatus, UpdateStock, StockUpdated, OrderPackaged, OrderShipped
Triggers: When OrderPlaced (×4), When OrderCharged, When OrderPackaged

Slide 7

Slide 7 text

Autonomous components
• Autonomous teams
• Operate in isolation
• Interactions are based on a contract
• Evolve independently

Slide 8

Slide 8 text

Something fails
Immediate Retry is going to retry message 'abfbbdb5-15fe-4236-a52c-ae92013ec1c9' because of an exception:
System.InvalidOperationException: Sequence contains no matching element
   at System.Linq.ThrowHelper.ThrowNoMatchException()
   at System.Linq.Enumerable.Single[TSource](IEnumerable`1 source, Func`2 predicate)
   at Inventory.UpdateProductStockHandler.Handle(UpdateProductStock message, IMessageHandlerContext context) in \src\Inventory\UpdateProductStockHandler.cs:line 14

Slide 9

Slide 9 text

Something fails
• What happened?
• Where?
• In what context?
• Why?

Slide 10

Slide 10 text

More questions…

Slide 11

Slide 11 text

Debugging a distributed system…
• Call stack
• F11
• Start entire solution
• Start / Attach to process
• Logging
• Is this message mine?
• Forgot a breakpoint

Slide 12

Slide 12 text

Why debug?

Slide 14

Slide 14 text

Rethinking the AAA-syntax
• Arrange: prepare message
• Act: invoke the message handler
• Assert:
  ◦ Desired outcome in any data modifications
  ◦ Outgoing messages
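The Arrange/Act/Assert steps above can be sketched as a plain unit test. This is a minimal, language-neutral illustration in Python, not the talk's own code: the message shapes, `FakeContext`, and `handle_update_product_stock` are hypothetical names invented for the example, and the "stock" is just a dictionary standing in for real storage.

```python
from dataclasses import dataclass, field


@dataclass
class UpdateProductStock:
    """Incoming command (hypothetical shape)."""
    product_id: str
    quantity: int


@dataclass
class StockUpdated:
    """Outgoing event (hypothetical shape)."""
    product_id: str
    new_level: int


@dataclass
class FakeContext:
    """Test double that records outgoing messages instead of sending them."""
    published: list = field(default_factory=list)

    def publish(self, message):
        self.published.append(message)


def handle_update_product_stock(message, context, stock):
    """Hypothetical handler: decrement stock, then announce the new level."""
    stock[message.product_id] -= message.quantity
    context.publish(StockUpdated(message.product_id, stock[message.product_id]))


def test_update_product_stock():
    # Arrange: prepare the message and the state the handler works on
    stock = {"p-1": 10}
    context = FakeContext()
    message = UpdateProductStock(product_id="p-1", quantity=3)

    # Act: invoke the message handler directly, no broker involved
    handle_update_product_stock(message, context, stock)

    # Assert: the data modification and the outgoing messages
    assert stock == {"p-1": 7}
    assert context.published == [StockUpdated("p-1", 7)]
```

Recording outgoing messages in a test double is what makes the second assertion possible without a running transport.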

Slide 15

Slide 15 text

What about the order?

Slide 16

Slide 16 text

An example
Components: Sales, Finance, Inventory, Shipping
Messages: PlaceOrder, OrderPlaced, ChargeOrder, OrderCharged, ReserveStock, UpdateStock, StockUpdated, OrderPackaged
Triggers: When OrderPlaced (×2), When OrderCharged

Slide 17

Slide 17 text

An example
Components: Sales, Finance, Inventory, Shipping
Messages: PlaceOrder, OrderPlaced, ChargeOrder, OrderCharged, ReserveStock, StockReserved, InsufficientStock, UpdateStock, StockUpdated, OrderPackaged, SendPartiallyRefund, RefundPartOfOrder, OrderPartiallyRefunded
Triggers: When OrderCharged, When OrderPackaged, When StockReserved, When InsufficientStock

Slide 18

Slide 18 text

Recap  Test, test, and then test, again  Assert outgoing messages  Be aware of assumptions  Expect out-of-order messages

Slide 19

Slide 19 text

A message just failed processing on production…

Slide 20

Slide 20 text

Example
• UpdateStock: InvalidOperationException, product does not exist
• OrderPackaged:
    foreach (var orderLine in order.orderlines)
        Send( orderLine.ProductId);
• OrderPlaced: product IDs were incorrectly mapped

Slide 21

Slide 21 text

We need the big picture
• Looking at a single step is insufficient
• Follow business flow through:
  ◦ All participating & interacting components
  ◦ Including context propagation

Slide 22

Slide 22 text

Distributed tracing  Early 2000’s  Independent of architectural styles  Decoupled systems  Root cause of problems

Slide 23

Slide 23 text

Distributed tracing
• Many frameworks and tools
• Vendor-specific
• Need for standardization

Slide 24

Slide 24 text

OpenTelemetry
• Observability framework
• Open source
• Tools, APIs and SDKs
• Collect and export telemetry data
• Cross-platform, cross-runtime

Slide 25

Slide 25 text

Observability
• Inner workings
• Expected and unexpected system state
• Without changing any code
• External tools

Slide 26

Slide 26 text

Observability signals
• Traces
• Metrics
• Logs

Slide 27

Slide 27 text

What is a trace?
• Tracks progression of a request
• Tree of spans
• So, what’s a span?
  ◦ A unit of work in the trace
  ◦ Has a context
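To illustrate "tree of spans", the sketch below links a flat list of spans into a tree through a parent reference. It is a simplified model, not OpenTelemetry's data model: the `Span` class, the short IDs, and the parent/child relationships in the usage are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    span_id: str
    name: str
    parent_id: Optional[str] = None   # None marks the root span
    children: list = field(default_factory=list)


def build_trace_tree(spans):
    """Link spans into a tree using parent_id and return the root span."""
    by_id = {s.span_id: s for s in spans}
    root = None
    for s in spans:
        if s.parent_id is None:
            root = s
        else:
            by_id[s.parent_id].children.append(s)
    return root


def render(span, depth=0):
    """Return the tree as indented lines, one line per span."""
    lines = ["  " * depth + span.name]
    for child in span.children:
        lines.extend(render(child, depth + 1))
    return lines


# Hypothetical spans; the nesting is assumed for illustration only
spans = [
    Span("a", "orders/create"),
    Span("b", "PlaceOrder", parent_id="a"),
    Span("c", "ChargeOrder", parent_id="b"),
    Span("d", "OrderPlaced", parent_id="b"),
]
tree_lines = render(build_trace_tree(spans))
```

Every span carries the same trace ID; it is the parent reference in each span's context that gives the trace its shape.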

Slide 28

Slide 28 text

TraceId: 84b2b44f79e4434fab19ea9362f6536c
Services: Client, Sales, Finance, Billing, Shipping
Spans (name, span ID):
  PlaceOrder     9093d82f33eb41d0
  PlaceOrder     b7b82293d1356b8a
  ChargeOrder    8d44929e2cc48fe3
  ChargeOrder    b8dc83c0dee60327
  OrderCharged   921e321d61a5ea0e
  OrderPlaced    a4004dd28ae952af
  orders/create  8a381a70bd9d6d42
  Payment        Bc0ea953da2d03c9
  OrderPlaced    8b63e7af24719e6c
  OrderPlaced    9276846f6f5e0506
  OrderBilled    a74700ae9c1fa359
  OrderPackaged  8d133f57cf7a5475
  OrderCharged   a4004dd28ae952af
  PackageOrder   9f94fdf6c6871368

Slide 30

Slide 30 text

TraceId: 84b2b44f79e4434fab19ea9362f6536c
[Same span diagram as slide 28, with one span selected]
Service: Sales | Duration: 1.69s
Tags:
  messaging.message_id        848af954-cae1-426a-a101-ae92013e1310
  messaging.destination       Finance
  messaging.destination_kind  Queue
  order.id                    f2c0ada9-a148-4390-b488-1c94a5960fc9
  order.amount                35.99 €
  order.paymentmethod         CreditCard
  …

Slide 31

Slide 31 text

TraceId: 84b2b44f79e4434fab19ea9362f6536c
Services: Sales, Finance, Billing, Shipping, Inventory
Spans (name, span ID):
  OrderCharged        921e321d61a5ea0e
  OrderPlaced         a4004dd28ae952af
  OrderPlaced         8b63e7af24719e6c
  OrderPlaced         9276846f6f5e0506
  OrderBilled         a74700ae9c1fa359
  OrderPackaged       8d133f57cf7a5475
  OrderCharged        a4004dd28ae952af
  PackageOrder        9f94fdf6c6871368
  OrderPackaged       8ca6599733263101
  UpdateProductStock  9201210e48d25d39
  UpdateProductStock  9129e854d8597dda
  UpdateProductStock  af46300f5ae8ddad
  UpdateProductStock  9ede452a93abd3e4
Service: Sales | Duration: 1.69s
Tags:
  messaging.conversation_id  fb2c6eba-1d59-4dc6-9304-ae92013e1311
  messaging.message_id       848af954-cae1-426a-a101-ae92013e1310
  messaging.destination      Sales
  order.details
    {
      "OrderId" : "f2c0ada9-a148-4390-b488-1c94a5960fc9",
      "Lines" : [
        { "ProductId" : "f2c0ada9-a148-4390-b488-1c94a5960fc9", "Quantity" : "5" },
        { "ProductId" : "f2c0ada9-a148-4390-b488-1c94a5960fc9", "Quantity" : "1" }
      ]
    }
  …

Slide 32

Slide 32 text

How is a distributed trace created?
• Propagation through standardized protocol
• W3C Trace Context for HTTP headers
00-84b2b44f79e4434fab19ea9362f6536c-8a381a70bd9d6d42-01
  Version:     00
  TraceId:     84b2b44f79e4434fab19ea9362f6536c
  SpanId:      8a381a70bd9d6d42
  Trace flags: 01
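The header layout above can be taken apart with a few lines of code. The following is a minimal sketch of parsing a W3C `traceparent` value using only the Python standard library; `TraceParent` and `parse_traceparent` are names made up for this example, and the validation is simplified (for instance, it does not reject the reserved version `ff`).

```python
import re
from typing import NamedTuple

# version (2 hex) - trace id (32 hex) - span id (16 hex) - trace flags (2 hex)
_TRACEPARENT = re.compile(
    r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})"
)


class TraceParent(NamedTuple):
    version: str
    trace_id: str
    span_id: str
    sampled: bool


def parse_traceparent(header):
    """Split a traceparent header value into its four fields."""
    match = _TRACEPARENT.fullmatch(header.strip())
    if match is None:
        raise ValueError(f"malformed traceparent: {header!r}")
    version, trace_id, span_id, flags = match.groups()
    # All-zero IDs are defined as invalid by the Trace Context specification
    if trace_id == "0" * 32 or span_id == "0" * 16:
        raise ValueError("trace id and span id must not be all zeros")
    # Bit 0 of the trace flags is the "sampled" flag
    return TraceParent(version, trace_id, span_id, bool(int(flags, 16) & 0x01))


parsed = parse_traceparent(
    "00-84b2b44f79e4434fab19ea9362f6536c-8a381a70bd9d6d42-01"
)
```

A receiving service keeps the trace ID and uses the incoming span ID as the parent of the span it starts, which is how the per-service spans end up in one trace.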

Slide 33

Slide 33 text

Getting started
• Built into .NET
  ◦ System.Diagnostics API
  ◦ System.Diagnostics.DiagnosticSource NuGet package
• Start collecting telemetry
  ◦ Select sources
  ◦ Set up an exporter

Slide 34

Slide 34 text

Getting started

Slide 35

Slide 35 text

Exporters & tools

Slide 36

Slide 36 text

OpenTelemetry support
• HTTP Client
• NServiceBus
• Azure SDK
• gRPC
• Redis
• Entity Framework Core
• ASP.NET Core
• SQL Client
• AWS

Slide 37

Slide 37 text

.NET implementation
OpenTelemetry term → .NET type
• Tracer → ActivitySource
• Span → Activity
• SpanContext → ActivityContext

Slide 38

Slide 38 text

What questions should my traces answer?
• Can you connect a user action to a trace?
• Can you easily group similar traces together?
• Can you identify the most load-generating operations?
• Can you identify which users are stressing the system?
• Can you find suspicious events throughout the system?

Slide 39

Slide 39 text

ActivitySource

Slide 41

Slide 41 text

Logging vs tracing
• Should we stop logging altogether?
• Keep trace tags high-level
• Log the details
• Connect traces and logs
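"Connect traces and logs" usually comes down to stamping every log line with the ID of the trace it belongs to, so a log search can start from a trace and vice versa. The sketch below shows the idea with Python's standard `logging` module; `current_trace_id`, `TraceIdFilter`, and `make_logger` are hypothetical names, and a real setup would read the ID from the tracing library's current context rather than from a hand-set variable.

```python
import contextvars
import logging

# Holds the trace ID of the work currently being processed (set by the caller)
current_trace_id = contextvars.ContextVar("current_trace_id", default="-")


class TraceIdFilter(logging.Filter):
    """Attach the current trace ID to every log record passing through."""

    def filter(self, record):
        record.trace_id = current_trace_id.get()
        return True


def make_logger(stream):
    """Build a logger whose output includes the trace ID on every line."""
    logger = logging.getLogger("inventory")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(levelname)s trace_id=%(trace_id)s %(message)s")
    )
    handler.addFilter(TraceIdFilter())
    logger.handlers = [handler]
    return logger
```

With this in place the trace can stay high-level while the log line carries the detail, and the shared ID ties the two together.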

Slide 42

Slide 42 text

Additional benefits
• Failure investigation
• Postmortems
• Feature flagging
• Chaos engineering

Slide 43

Slide 43 text

Recap
• Observability is crucial
• Investigate telemetry emitted by frameworks
• Connect your existing logs
• Enhance & enrich

Slide 44

Slide 44 text

@noctovis @lailabougria