ITT 2018 - Randy Shoup - Service Architectures at Scale

Over time, almost all large, well-known web sites have evolved their architectures from an early monolithic application to a loosely-coupled ecosystem of polyglot microservices. This session will discuss modern service architectures at scale, using specific examples from both Google and eBay, and covering lessons learned in building and operating these sites.

We will first talk about designing and building services, using some techniques from domain-driven design to decide the boundaries and interface of our services. We will discuss using event-driven techniques to communicate between services asynchronously. And finally, we will talk about how we put all of this together into a functioning service ecosystem that can scale to hundreds of teams, thousands of developers, and millions of users.

Istanbul Tech Talks

April 17, 2018

  1. Background
    •  VP Engineering at WeWork
        o  Physical space as a service
    •  VP Engineering at Stitch Fix
        o  Revolutionizing retail using data science and machine learning
    •  Director of Engineering for Google App Engine
        o  World’s largest Platform-as-a-Service
    •  Chief Engineer at eBay
        o  Multiple generations of eBay’s infrastructure
  2. Architecture Evolution
    •  eBay
        o  5th generation today
        o  Monolithic Perl → Monolithic C++ → Java → microservices
    •  Twitter
        o  3rd generation today
        o  Monolithic Rails → JS / Rails / Scala → microservices
    •  Amazon
        o  Nth generation today
        o  Monolithic C++ → Java / Scala → microservices
  3. Service as System of Record
    •  Single System of Record
        o  Every piece of data is owned by a single service
        o  That service is the canonical system of record for that data
    •  Every other copy is a read-only, non-authoritative cache
    [Diagram: customer-service, styling-service, customer-search, billing-service]
    @randyshoup linkedin.com/in/randyshoup
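
The single-system-of-record rule above can be sketched roughly as follows. This is only an illustration: the class names echo the diagram labels, but every method, and the in-process callback standing in for an event stream, is an assumption and not something shown in the talk.

```python
from types import MappingProxyType
from typing import Callable, Dict, List


class CustomerService:
    """Hypothetical owner (system of record) for customer data."""

    def __init__(self) -> None:
        self._customers: Dict[str, dict] = {}
        self._subscribers: List[Callable[[str, dict], None]] = []

    def subscribe(self, callback: Callable[[str, dict], None]) -> None:
        self._subscribers.append(callback)

    def update_customer(self, customer_id: str, fields: dict) -> None:
        # Every write goes through the owning service.
        record = {**self._customers.get(customer_id, {}), **fields}
        self._customers[customer_id] = record
        for notify in self._subscribers:
            notify(customer_id, dict(record))  # publish a copy, not the original

    def get_customer(self, customer_id: str) -> dict:
        return dict(self._customers[customer_id])


class CustomerSearch:
    """Hypothetical consumer holding a read-only, non-authoritative cache."""

    def __init__(self, owner: CustomerService) -> None:
        self._cache: Dict[str, dict] = {}
        owner.subscribe(self._on_customer_changed)

    def _on_customer_changed(self, customer_id: str, record: dict) -> None:
        # The cache is only ever refreshed from the owner's notifications.
        self._cache[customer_id] = record

    def lookup(self, customer_id: str) -> MappingProxyType:
        # Callers get an immutable view, so they cannot write to the copy.
        return MappingProxyType(self._cache[customer_id])


owner = CustomerService()
search = CustomerSearch(owner)
owner.update_customer("c1", {"name": "Ada"})
owner.update_customer("c1", {"city": "Istanbul"})
```

In this sketch the only way to change customer data is `CustomerService.update_customer`; the search side can serve reads but has no write path of its own.
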
  4. Service Interface
    •  A service interface includes
        o  Synchronous request-response (REST, gRPC, etc.)
        o  Events the service produces
        o  Events the service consumes
        o  Bulk reads and writes (ETL)
    •  Events are a first-class part of a service interface
    •  The interface includes any mechanism for getting data in or out of the service (!)
  5. Service Interfaces / Prototypes
    •  Define Service Interface (Formally!)
        o  Propose
        o  Discuss with client(s)
        o  Agree
    •  Prototype Implementation
        o  Limited investment of time and people
        o  Simplest thing that could possibly work
        o  Client can integrate with prototype
        o  Implementor can learn what works and what does not
  6. Maintaining Interface Stability
    •  Backward / forward compatibility of interfaces
        o  Can *never* break your clients’ code
    •  Semantic versioning (major.minor.patch)
        o  Often multiple interface versions
        o  Sometimes multiple deployments
    •  Explicit deprecation policy
        o  Strong incentive to wean customers off old versions (!)
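
As a rough illustration of the major.minor.patch scheme above, the sketch below checks whether a client built against one interface version can safely call a provider at another. The `Version` type and `is_backward_compatible` helper are hypothetical names, and the rule encoded here (same major version, provider at least as new) is the common semantic-versioning convention rather than anything specified in the talk.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text: str) -> "Version":
        major, minor, patch = (int(part) for part in text.split("."))
        return cls(major, minor, patch)


def is_backward_compatible(provided: Version, required: Version) -> bool:
    """True if a client built against `required` can use `provided`.

    Assumed convention: a major bump may break clients, while minor and
    patch bumps only add or fix things. Tuples compare field by field.
    """
    return provided.major == required.major and provided >= required
```

For example, a provider at `2.3.1` would satisfy a client built against `2.1.0`, while a provider at `3.0.0` would not, which is one reason several major versions often have to stay deployed side by side.
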
  7. Service Design Anti-Patterns
    •  The “Mega-Service”
        o  Overbroad area of responsibility is difficult to reason about, change
        o  Leads to more upstream / downstream dependencies
    •  “Leaky Abstraction” Service
        o  Interface reflects provider’s implementation, not the consumer’s model
        o  Consumer’s model is typically more aligned with the domain, simpler, more abstract
        o  Leaking provider’s model in the interface constrains evolution of the implementation
  8. Service Design Anti-Patterns
    •  “Client-less” Service
        o  Service built without a specific client in mind
        o  Very difficult to design a minimal, complete interface without a tangible use-case
        o  Opportunity cost and wasted effort
        o  Usage is the true metric of the value of a service
    •  Shared persistence
        o  Breaks encapsulation, encourages “backdoor” interface violations
        o  Unhealthy and near-invisible coupling of services
        o  (-) Initial eBay SOA efforts
  9. Ecosystem of Services
    •  Hundreds to thousands of independent services
    •  Many layers of dependencies, no strict tiers
    •  Graph of relationships, not a hierarchy
    [Diagram: services A, B, C, D, E, F, G]
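
To make "graph, not a hierarchy" concrete, here is a small sketch that stores service dependencies as an adjacency map and walks it. The service names A to G come from the slide, but the edges are invented for illustration because the diagram's actual edges are not recoverable from the text; `transitive_dependencies` is likewise a hypothetical helper.

```python
from typing import Dict, Set

# Hypothetical edges: "A": {"B", "C"} means A calls B and C.
DEPENDS_ON: Dict[str, Set[str]] = {
    "A": {"B", "C"},
    "B": {"D", "E"},
    "C": {"E", "F"},  # E is shared by B and C, so this is not a tree
    "D": {"G"},
    "E": {"G"},
    "F": set(),
    "G": set(),
}


def transitive_dependencies(service: str, graph: Dict[str, Set[str]]) -> Set[str]:
    """Return every service reachable from `service` (iterative DFS, cycle-safe)."""
    seen: Set[str] = set()
    stack = list(graph.get(service, ()))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited via another path
        seen.add(current)
        stack.extend(graph.get(current, ()))
    return seen
```

Because a service such as E can be reached along more than one path, the traversal tracks visited nodes instead of assuming strict tiers.
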
  10. Google Service Layering
    •  Cloud Datastore: NoSQL service
        o  Strong transactional consistency
        o  SQL-like rich query capabilities
    •  Megastore: geo-scale structured database
        o  Multi-row transactions
        o  Synchronous cross-datacenter replication
    •  Bigtable: cluster-level structured storage
        o  (row, column, timestamp) -> cell contents
    •  Colossus: distributed file system
        o  Block distribution and replication
    •  Borg: cluster management infrastructure
        o  Task scheduling, machine assignment
    [Diagram: layered stack of Cloud Datastore, Megastore, Bigtable, Colossus, Cluster manager]
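
The `(row, column, timestamp) -> cell contents` mapping listed for Bigtable can be pictured with a toy in-memory map. This is only a sketch of the data model: `TinyTable` and its methods are hypothetical and bear no relation to the real Bigtable API or its storage layout.

```python
from typing import Dict, Optional, Tuple

Key = Tuple[str, str, int]  # (row, column, timestamp)


class TinyTable:
    """Toy versioned cell store keyed by (row, column, timestamp)."""

    def __init__(self) -> None:
        self._cells: Dict[Key, bytes] = {}

    def put(self, row: str, column: str, timestamp: int, value: bytes) -> None:
        self._cells[(row, column, timestamp)] = value

    def get(self, row: str, column: str, at: Optional[int] = None) -> Optional[bytes]:
        """Return the newest value at or before `at` (newest overall if None)."""
        versions = [
            (ts, value)
            for (r, c, ts), value in self._cells.items()
            if r == row and c == column and (at is None or ts <= at)
        ]
        if not versions:
            return None
        return max(versions, key=lambda version: version[0])[1]


table = TinyTable()
table.put("com.example", "contents:html", 1, b"v1")
table.put("com.example", "contents:html", 5, b"v2")
```

Keeping the timestamp in the key means older versions of a cell stay readable, which is what `get(..., at=3)` relies on here.
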
  11. Evolution, not Central Control
    •  No centralized design or approval
        o  Most technology decisions made locally instead of globally
    •  Build services as needed
        o  Create / extract new services when needed to solve a problem
        o  Services justify their continued existence through usage
        o  Deprecate services when no longer used
    •  Appearance of clean layering is an emergent property
  12. “Every service at Google is either deprecated or not ready yet.”
      -- Google engineering proverb
  13. Standardization
    •  Standardized communication
        o  Network protocols
        o  Data formats
        o  Interface schema / specification
    •  Standardized infrastructure
        o  Source control
        o  Configuration management
        o  Cluster management
        o  Monitoring, alerting, diagnosing, etc.
  14. “Enforcing” Standardization
    •  Encouraged via
        o  Libraries
        o  Support in underlying services
        o  Code reviews
        o  Searchable code
  15. Service Relationships
    •  Vendor – Customer Relationship
        o  Friendly and cooperative, but structured
        o  Clear ownership and division of responsibility
    •  Customer Focus
        o  Value of service comes from its value to its customers
    •  Customer can choose to use service or not (!)
        o  Must be strictly better than the alternatives of build, buy, borrow
  16. Service Relationships
    •  Competition keeps system healthy
        o  If a common service is no longer useful, we can redeploy its resources
    •  Incentives are aligned
        o  Usage is an objective measure of a service’s success
  17. Goals of a Service Owner
    •  Meet the needs of my clients …
        o  Functionality
        o  Quality
        o  Performance
        o  Stability and reliability
        o  Constant improvement over time
    •  … at minimum cost and effort
        o  Leverage common tools and infrastructure
        o  Leverage other services
        o  Automate building, deploying, and operating my service
        o  Optimize for efficient use of resources
  18. Bounded Context
    •  Primary focus on my service
        o  Clients which depend on my service
        o  Services which my service depends on
    •  Very little worry about
        o  The complete ecosystem
        o  The underlying infrastructure
    •  Cognitive load is very bounded
    •  ⇒ Small, nimble service teams
    [Diagram: Service with Client A, Client B, Client C]
  19. Incremental Change
    •  Decompose every change into incremental steps
    •  Each step maintains backward / forward compatibility of data and interfaces
    •  Multiple service versions commonly coexist
        o  Every change is a rolling upgrade
        o  Transitional states are normal, not exceptional
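
One way to keep data backward and forward compatible while two service versions coexist is sketched below. The `Order` payload, the `currency` field, and the `"USD"` default are all invented for illustration; the point is only that the old reader ignores fields it does not know and the new reader tolerates their absence.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class OrderV1:
    order_id: str
    amount_cents: int


def parse_v1(payload: Dict[str, Any]) -> OrderV1:
    # Forward compatible: the old reader picks out the fields it knows
    # and silently ignores anything a newer writer added.
    return OrderV1(order_id=payload["order_id"], amount_cents=payload["amount_cents"])


@dataclass
class OrderV2:
    order_id: str
    amount_cents: int
    currency: str = "USD"  # hypothetical new optional field with a default


def parse_v2(payload: Dict[str, Any]) -> OrderV2:
    # Backward compatible: the new reader supplies a default when an
    # older writer did not send the new field.
    return OrderV2(
        order_id=payload["order_id"],
        amount_cents=payload["amount_cents"],
        currency=payload.get("currency", "USD"),
    )


old_payload = {"order_id": "o-1", "amount_cents": 1250}
new_payload = {"order_id": "o-2", "amount_cents": 990, "currency": "EUR"}
```

With readers written this way, either version can be rolled out first, so the mixed state during a rolling upgrade is a normal operating condition.
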
  20. Example: Data Migration
    •  Requires dual data processing and storage (“dual writes”)
    •  Careful set of transitional steps
        o  A ⇒ A | B ⇒ B | A ⇒ B
    [Diagram: A and B at each transitional step]
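
The transitional steps above can be sketched as a small state machine. The reading of the four steps used here (A only; write both and read A; write both and read B; B only) is an interpretation of the slide, and `MigratingStore`, `Phase`, and `backfill` are hypothetical names with plain dicts standing in for real stores.

```python
from enum import Enum
from typing import Dict, Optional


class Phase(Enum):
    A_ONLY = "A"
    A_PRIMARY = "A | B"  # dual writes, reads still served from A
    B_PRIMARY = "B | A"  # dual writes, reads served from B
    B_ONLY = "B"


class MigratingStore:
    def __init__(self) -> None:
        self.store_a: Dict[str, str] = {}
        self.store_b: Dict[str, str] = {}
        self.phase = Phase.A_ONLY

    def write(self, key: str, value: str) -> None:
        if self.phase is not Phase.B_ONLY:
            self.store_a[key] = value
        if self.phase is not Phase.A_ONLY:
            self.store_b[key] = value

    def read(self, key: str) -> Optional[str]:
        reads_from_a = self.phase in (Phase.A_ONLY, Phase.A_PRIMARY)
        source = self.store_a if reads_from_a else self.store_b
        return source.get(key)

    def backfill(self) -> None:
        """Copy records written before dual writes began, without clobbering B."""
        for key, value in self.store_a.items():
            self.store_b.setdefault(key, value)


store = MigratingStore()
store.write("k1", "old")          # step 1: A only
store.phase = Phase.A_PRIMARY
store.write("k2", "new")          # step 2: dual writes, read A
store.backfill()
store.phase = Phase.B_PRIMARY     # step 3: dual writes, read B
value_after_switch = store.read("k1")
store.phase = Phase.B_ONLY        # step 4: B only
store.write("k3", "late")
```

Each phase change is reversible until the last one, which is the practical benefit of keeping both stores written during the middle steps.
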
  21. Service Deployment
    •  Automated Pipeline
        o  Push-button, repeatable pipeline of build, test, deploy
    •  Independent Deployment
        o  Each service deployed independently, *not* as a group
    •  Incremental Deployment
        o  Canary testing
        o  Staged rollouts by %
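
A staged rollout by percentage is often implemented by hashing a stable identifier into a bucket, roughly as sketched below. The function names, the `salt` parameter, and the choice of SHA-256 are assumptions for illustration; the talk does not describe a specific mechanism.

```python
import hashlib


def rollout_bucket(unit_id: str, salt: str = "rollout-v1") -> int:
    """Map an ID (user, host, request) to a stable bucket in [0, 100)."""
    digest = hashlib.sha256(f"{salt}:{unit_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def in_rollout(unit_id: str, percent: int, salt: str = "rollout-v1") -> bool:
    """True if this ID falls inside the first `percent` buckets."""
    return rollout_bucket(unit_id, salt) < percent
```

Because the bucket depends only on the ID and the salt, raising the percentage from 1 to 10 to 50 keeps everyone who already had the new version on it and only adds new IDs, and a canary is simply the smallest stage.
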
  22. Service Deployment
    •  “Feature Flags”
        o  Decouple code deployment from feature deployment
        o  Rapidly turn on / off features without redeploying code
        o  Typically deploy with feature turned off, then turn on as a separate step
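
A minimal sketch of the feature-flag pattern described above. `FeatureFlags`, the `new_checkout` flag, and the `checkout` function are hypothetical; a real system would typically read flags from a configuration service instead of an in-memory dict.

```python
from typing import Dict, Optional


class FeatureFlags:
    """Hypothetical in-memory flag store."""

    def __init__(self, defaults: Optional[Dict[str, bool]] = None) -> None:
        self._flags: Dict[str, bool] = dict(defaults or {})

    def is_enabled(self, name: str) -> bool:
        return self._flags.get(name, False)  # unknown flags default to off

    def set(self, name: str, enabled: bool) -> None:
        self._flags[name] = enabled


# The new code path ships dark: deployed, but switched off.
flags = FeatureFlags({"new_checkout": False})


def checkout(cart_total: float) -> str:
    if flags.is_enabled("new_checkout"):
        return f"new flow: {cart_total:.2f}"
    return f"old flow: {cart_total:.2f}"


before = checkout(10)
flags.set("new_checkout", True)   # separate step, no redeploy
after = checkout(10)
```

Turning the flag back off restores the old path just as quickly, which is what makes this a cheap rollback mechanism as well as a release mechanism.
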