Building APIs for an unreliable world

Networks and computing systems are unreliable. Connections drop or slow to a crawl. Hard drives break and systems turn off without warning. These issues affect public and private APIs alike, but can spell disaster for microservices, as unreliability can lead to inconsistency. In this talk, we will explore strategies for designing APIs that can help mitigate unreliability. We will look at how the details of an implementation can influence resilience. Finally, we will see how design and implementation influence each other for creating an API that is best suited to tackle unreliability.

James Bowes

November 30, 2018

Transcript

  1. James Bowes (@jrbowes) • FIND ME: github.com/jbowes, twitter.com/jrbowes • ABOUT ME: Technical Lead @ www.manifold.co • Loves cats (dogs are great, too!) • Mostly a backend developer
  2. THE SAMPLE PLATTER: Four delicious APIs, each with its own: • Use case • Spectacular failure • Pragmatic solution • Useful lesson
  3. WHAT IS UNRELIABLE? ANIMALS • 114 RACCOON RELATED OUTAGES (https://cybersquirrel1.com/) • Photo by Gary Bendig on Unsplash
  4. File Transfers: What We Built • An on-premises systems management and software distribution product • Let users upload their own Linux packages
  5. File Transfers: How It Failed • People are impatient • Users had very large uploads • Disks filled up
  6. File Transfers: How We Fixed It • Chunked transfers • A new API to query what the server had • Resumable uploads
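
The three fixes on the File Transfers slide fit together: the client sends fixed-size chunks, asks the server how much it already holds, and resumes from that offset after a failure. The following is a minimal in-memory sketch of that idea, not the product's actual API. `UploadServer`, `received`, `put_chunk`, `upload`, and the `fail_after` failure switch are all hypothetical names chosen for illustration.

```python
class UploadServer:
    """Hypothetical in-memory stand-in for the upload service."""

    def __init__(self):
        self._uploads = {}  # upload_id -> bytearray received so far

    def received(self, upload_id):
        """The 'query what the server had' call: bytes stored so far."""
        return len(self._uploads.get(upload_id, b""))

    def put_chunk(self, upload_id, offset, chunk):
        """Accept a chunk only if it starts exactly where stored data ends."""
        buf = self._uploads.setdefault(upload_id, bytearray())
        if offset != len(buf):
            raise ValueError(f"expected offset {len(buf)}, got {offset}")
        buf.extend(chunk)

    def content(self, upload_id):
        return bytes(self._uploads[upload_id])


def upload(server, upload_id, data, chunk_size=4, fail_after=None):
    """Send data in chunks, resuming from whatever the server already holds.

    fail_after simulates a dropped connection after that many chunks.
    """
    offset = server.received(upload_id)  # resume point
    sent = 0
    while offset < len(data):
        if fail_after is not None and sent >= fail_after:
            raise ConnectionError("simulated network failure")
        chunk = data[offset:offset + chunk_size]
        server.put_chunk(upload_id, offset, chunk)
        offset += len(chunk)
        sent += 1


if __name__ == "__main__":
    server = UploadServer()
    payload = b"a large linux package"
    try:
        upload(server, "pkg-1", payload, fail_after=2)
    except ConnectionError:
        pass  # the first two chunks are already stored
    upload(server, "pkg-1", payload)  # picks up where the first attempt stopped
    assert server.content("pkg-1") == payload
```

Because the server rejects a chunk whose offset does not match what it has stored, a retried chunk cannot be appended twice, and an impatient user who restarts the upload only resends the missing tail.
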
  7. Collaborative Editing: What We Built • A Realtime Backend-as-a-Service • Built-in identity with OAuth (GitHub, Twitter) • Shared tree-based data structure
  8. Collaborative Editing: How It Failed • The most recent change always won • ...Even if that change was made against an old state
  9. Operational Transformation • Commutative Replicated Data Types • Convergent Replicated Data Types • Optimistic Locking • Pessimistic Locking
  10. Collaborative Editing: How We Fixed It • Target common use cases • Append on arrays, insert on maps • Optimistic locking (Conditional set)
  11. Collaborative Editing: What We Would Have Done • Implement all those cool data structures! • … As opt-in features
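
The "optimistic locking (conditional set)" fix from the Collaborative Editing slides can be illustrated with a versioned store: a write succeeds only if the writer names the version it last read, so a change made against an old state is rejected instead of silently winning. This is a sketch under assumptions, not the service's real interface. `Store`, `set_if`, and `VersionConflict` are hypothetical names.

```python
class VersionConflict(Exception):
    """Raised when a write was based on a stale version."""


class Store:
    """Hypothetical key-value store with per-key version counters."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def get(self, key):
        """Return (version, value); a missing key reads as version 0."""
        return self._data.get(key, (0, None))

    def set_if(self, key, expected_version, value):
        """Conditional set: write only if the key is still at expected_version."""
        current_version, _ = self.get(key)
        if current_version != expected_version:
            raise VersionConflict(
                f"{key!r} is at version {current_version}, "
                f"write was based on {expected_version}"
            )
        new_version = current_version + 1
        self._data[key] = (new_version, value)
        return new_version


if __name__ == "__main__":
    store = Store()
    alice_version, _ = store.get("title")
    bob_version, _ = store.get("title")

    store.set_if("title", alice_version, "Alice's title")  # succeeds
    try:
        store.set_if("title", bob_version, "Bob's title")  # stale, rejected
    except VersionConflict:
        bob_version, current = store.get("title")  # re-read, then decide
        store.set_if("title", bob_version, current + " (edited by Bob)")
    print(store.get("title"))
```

The losing writer learns about the conflict and can re-read and retry, which is the behavior "most recent change always won" lacked.
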
  12. Purchases: What We Built • An API to buy SaaS products (databases, etc) • Every call ==
  13. Purchases: How It Failed • It didn’t! • ...until it did • A failure state could charge a customer multiple times
  14. Purchases: How We Fixed It • Repeated calls with the same unique key achieve the same result • Idempotency! • PUT vs POST
  15. At Least Once • At Least Once • At Least Once • At Least Once • At Most Once
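
The Purchases fix, "repeated calls with the same unique key achieve the same result", can be sketched as follows. The client generates one key per intended purchase and reuses it on every retry, and the server remembers the outcome per key. With at-least-once delivery a request may arrive several times, but the customer is charged at most once. `PurchaseAPI`, `purchase`, and the receipt fields are hypothetical and stand in for whatever the real API exposed. In HTTP terms this resembles addressing the purchase with PUT, which is defined as idempotent, rather than POST, which is not.

```python
import itertools
import uuid


class PurchaseAPI:
    """Hypothetical purchase endpoint that deduplicates by idempotency key."""

    def __init__(self):
        self._by_key = {}              # idempotency key -> stored receipt
        self._ids = itertools.count(1)
        self.charges = []              # every charge actually made

    def purchase(self, idempotency_key, customer, product, amount_cents):
        # A repeated key returns the original result without charging again.
        if idempotency_key in self._by_key:
            return self._by_key[idempotency_key]
        receipt = {
            "charge_id": next(self._ids),
            "customer": customer,
            "product": product,
            "amount_cents": amount_cents,
        }
        self.charges.append(receipt)
        self._by_key[idempotency_key] = receipt
        return receipt


if __name__ == "__main__":
    api = PurchaseAPI()
    key = str(uuid.uuid4())  # generated once, reused for every retry

    first = api.purchase(key, "cust-1", "postgres-small", 1500)
    retry = api.purchase(key, "cust-1", "postgres-small", 1500)  # e.g. after a timeout

    assert first == retry
    assert len(api.charges) == 1  # charged once despite two calls
```

A production version would also need the key lookup and the charge to happen atomically, which this single-threaded sketch does not address.
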
  16. Coordinating Responses: What We Built • Lots of things! • Any system using (micro)services • Gateway/aggregation layer
  17. Coordinating Responses: How It Failed • Constructing a response required N services • One service fails? Error the whole response
  18. Coordinating Responses: How We Fixed It • Evaluate the API and data • Denormalize where possible • Make fields optional and omit on error • The microservice implementation is leaky
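
"Make fields optional and omit on error" can be sketched as a gateway function that treats some backing services as required and others as optional. A failing optional service drops its field from the response instead of failing the whole call. `build_response` and the field names below are hypothetical. Each service is modelled as a zero-argument callable to keep the example self-contained.

```python
def build_response(required, optional):
    """Assemble a response from per-field fetchers.

    required / optional: dicts mapping field name -> zero-argument callable.
    A failure in a required fetcher propagates to the caller; a failure in
    an optional fetcher only omits that field.
    """
    response = {name: fetch() for name, fetch in required.items()}
    for name, fetch in optional.items():
        try:
            response[name] = fetch()
        except Exception:
            continue  # tolerate the failed service: leave the field out
    return response


if __name__ == "__main__":
    def reviews_service():
        raise TimeoutError("reviews service is down")

    product = build_response(
        required={"id": lambda: 42, "name": lambda: "postgres-small"},
        optional={"reviews": reviews_service, "price_cents": lambda: 1500},
    )
    print(product)
```

Clients then have to handle a missing field, which is one way the microservice implementation leaks into the API contract.
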
  19. • Anticipate The User’s Needs • Contention? Use Conditional Updates • Pick Your Delivery Guarantees • Tolerate Service Failure