ITT 2018 - Randy Shoup - Service Architectures at Scale

Over time, almost all large, well-known web sites have evolved their architectures from an early monolithic application to a loosely-coupled ecosystem of polyglot microservices. This session will discuss modern service architectures at scale, using specific examples from both Google and eBay, and covering lessons learned in building and operating these sites.

We will first talk about designing and building services, using some techniques from domain-driven design to decide the boundaries and interface of our services. We will discuss using event-driven techniques to communicate between services asynchronously. And finally, we will talk about how we put all of this together into a functioning service ecosystem that can scale to hundreds of teams, thousands of developers, and millions of users.

Istanbul Tech Talks

April 17, 2018

Transcript

  1. Service Architectures
    at Scale
    Randy Shoup
    @randyshoup
    linkedin.com/in/randyshoup

  2. Background
    •  VP Engineering at WeWork
    o  Physical space as a service
    •  VP Engineering at Stitch Fix
    o  Revolutionizing retail using data science and machine learning
    •  Director of Engineering for Google App Engine
    o  World’s largest Platform-as-a-Service
    •  Chief Engineer at eBay
    o  Multiple generations of eBay’s infrastructure

  3. Architecture Evolution
    •  eBay
    o  5th generation today
    o  Monolithic Perl → Monolithic C++ → Java → microservices
    •  Twitter
    o  3rd generation today
    o  Monolithic Rails → JS / Rails / Scala → microservices
    •  Amazon
    o  Nth generation today
    o  Monolithic C++ → Java / Scala → microservices

  4. No one starts with microservices

    Past a certain scale, everyone
    ends up with microservices

  5. First Law of Distributed Object
    Design:
    Don’t distribute your objects!
    -- Martin Fowler

  6. If you don’t end up regretting
    your early technology
    decisions, you probably
    over-engineered.

  7. Service Architecture
    •  Service Design
    •  Service Ecosystem
    •  Service Operation and Deployment

  8. Service Architecture
    •  Service Design
    •  Service Ecosystem
    •  Service Operation and Deployment

  9. Well-Engineered Microservices
    •  Single-purpose
    •  Simple, well-defined interface
    •  Modular and independent
    •  Isolated persistence (!)

  10. Service as System of Record
    •  Single System of Record
    o  Every piece of data is owned by a single service
    o  That service is the canonical system of record for that data
    •  Every other copy is a read-only, non-authoritative cache
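
A minimal sketch of the single-system-of-record idea, assuming hypothetical `CustomerService` and `CustomerSearchCache` classes (the class and method names are illustrative, not taken from the talk). Only the owning service accepts writes; the other service keeps a copy it can refresh but never modify on its own.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Customer:
    customer_id: str
    email: str


class CustomerService:
    """Hypothetical owner of customer data: the only place writes happen."""

    def __init__(self) -> None:
        self._records: dict[str, Customer] = {}

    def update_email(self, customer_id: str, email: str) -> Customer:
        record = Customer(customer_id, email)
        self._records[customer_id] = record
        return record

    def get(self, customer_id: str) -> Customer:
        return self._records[customer_id]


class CustomerSearchCache:
    """Hypothetical read-only, non-authoritative copy held by another service.

    It exposes no write path of its own; it can only refresh from the owner.
    """

    def __init__(self, owner: CustomerService) -> None:
        self._owner = owner
        self._copy: dict[str, Customer] = {}

    def refresh(self, customer_id: str) -> None:
        self._copy[customer_id] = self._owner.get(customer_id)

    def lookup(self, customer_id: str) -> Optional[Customer]:
        return self._copy.get(customer_id)
```

The cached copy may be stale between refreshes, which is acceptable precisely because it is never treated as authoritative.
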
    [Diagram: customer-service, styling-service, customer-search, billing-service]

  11. Service Interface
    •  A service interface includes
    o  Synchronous request-response (REST, gRPC, etc.)
    o  Events the service produces
    o  Events the service consumes
    o  Bulk reads and writes (ETL)
    •  Events are a first-class part of a service interface
    •  The interface includes any mechanism for getting data
    in or out of the service (!)
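
One way to picture an interface that includes events, sketched with a hypothetical in-process `EventBus` standing in for a real message broker. The topic names (`order.placed`, `invoice.created`), the event fields, and the `BillingService` methods are all assumptions made for illustration.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Callable

Event = dict
Handler = Callable[[Event], None]


class EventBus:
    """Minimal in-process stand-in for a message broker (illustrative only)."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._handlers[topic].append(handler)

    def publish(self, topic: str, event: Event) -> None:
        # Copy the list so handlers may subscribe while we iterate.
        for handler in list(self._handlers[topic]):
            handler(event)


class BillingService:
    """Hypothetical service whose interface has three parts."""

    def __init__(self, bus: EventBus) -> None:
        self._bus = bus
        self._invoices: dict[str, int] = {}
        # Part of the interface: an event this service consumes.
        bus.subscribe("order.placed", self._on_order_placed)

    # Part of the interface: synchronous request-response.
    def get_invoice_total(self, order_id: str) -> int:
        return self._invoices[order_id]

    def _on_order_placed(self, event: Event) -> None:
        self._invoices[event["order_id"]] = event["amount_cents"]
        # Part of the interface: an event this service produces.
        self._bus.publish("invoice.created", {"order_id": event["order_id"]})
```

Changing the shape of `invoice.created` would break consumers just as surely as changing `get_invoice_total`, which is why events belong in the interface definition.
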

  12. Service Interfaces / Prototypes
    •  Define Service Interface (Formally!)
    o  Propose
    o  Discuss with client(s)
    o  Agree
    •  Prototype Implementation
    o  Limited investment of time and people
    o  Simplest thing that could possibly work
    o  Client can integrate with prototype
    o  Implementor can learn what works and what does not

  13. Service Interfaces / Prototypes
    •  Real Implementation
    o  Throw away the prototype (!)
    •  → Rinse and Repeat

  14. Maintaining Interface Stability
    •  Backward / forward compatibility of interfaces
    o  Can *never* break your clients’ code
    •  Semantic versioning (major.minor.patch)
    o  Often multiple interface versions
    o  Sometimes multiple deployments
    •  Explicit deprecation policy
    o  Strong incentive to wean customers off old versions (!)
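
A rough sketch of how a client might check interface compatibility under semantic versioning, assuming the usual convention that only a major-version bump may break clients. The function names `parse_version` and `is_compatible` are hypothetical.

```python
from __future__ import annotations


def parse_version(text: str) -> tuple[int, int, int]:
    """Split 'major.minor.patch' into a tuple of integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def is_compatible(provided: str, required: str) -> bool:
    """True if a service at `provided` can serve a client built against `required`.

    Assumed rule: same major version, and the provided version is at least
    as new as the one the client was built against.
    """
    p = parse_version(provided)
    r = parse_version(required)
    return p[0] == r[0] and p >= r
```

For example, `is_compatible("2.3.1", "2.1.0")` holds, while `is_compatible("3.0.0", "2.9.9")` does not, which is the situation where multiple interface versions need to stay available.
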

  15. Service Design Anti-Patterns
    •  The “Mega-Service”
    o  Overbroad area of responsibility is difficult to reason about, change
    o  Leads to more upstream / downstream dependencies
    •  “Leaky Abstraction” Service
    o  Interface reflects provider’s implementation, not the consumer’s model
    o  Consumer’s model is typically more aligned with the domain, simpler, more abstract
    o  Leaking provider’s model in the interface constrains evolution of the implementation

  16. Service Design Anti-Patterns
    •  “Client-less” Service
    o  Service built without a specific client in mind
    o  Very difficult to design a minimal, complete interface without a tangible use case
    o  Opportunity cost and wasted effort
    o  Usage is the true metric of the value of a service
    •  Shared persistence
    o  Breaks encapsulation, encourages “backdoor” interface violations
    o  Unhealthy and near-invisible coupling of services
    o  Negative example: initial eBay SOA efforts

  17. Service Architecture
    •  Service Design
    •  Service Ecosystem
    •  Service Operation and Deployment

  18. Ecosystem of Services
    •  Hundreds to thousands of independent services
    •  Many layers of dependencies, no strict tiers
    •  Graph of relationships, not a hierarchy

  19. Google Service Layering
    •  Cloud Datastore: NoSQL service
    o  Strong transactional consistency
    o  SQL-like rich query capabilities
    •  Megastore: geo-scale structured database
    o  Multi-row transactions
    o  Synchronous cross-datacenter replication
    •  Bigtable: cluster-level structured storage
    o  (row, column, timestamp) -> cell contents
    •  Colossus: distributed file system
    o  Block distribution and replication
    •  Borg: cluster management infrastructure
    o  Task scheduling, machine assignment
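
The `(row, column, timestamp) -> cell contents` mapping listed for Bigtable can be pictured as a versioned key-value map. The toy `TinyTable` class below is only an in-memory illustration of that data model; it is not Bigtable's API, and the method names are made up.

```python
from __future__ import annotations

from typing import Optional


class TinyTable:
    """Toy model of a (row, column, timestamp) -> cell contents map."""

    def __init__(self) -> None:
        self._cells: dict[tuple[str, str, int], bytes] = {}

    def put(self, row: str, column: str, timestamp: int, value: bytes) -> None:
        # Each write adds a new version instead of overwriting older ones.
        self._cells[(row, column, timestamp)] = value

    def get_latest(self, row: str, column: str) -> Optional[bytes]:
        # Collect every stored version of this cell, then pick the newest.
        versions = [
            (ts, value)
            for (r, c, ts), value in self._cells.items()
            if r == row and c == column
        ]
        if not versions:
            return None
        return max(versions, key=lambda pair: pair[0])[1]
```
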
    [Diagram: layered stack, top to bottom: Cloud Datastore, Megastore, Bigtable, Colossus, Cluster manager]

  20. Evolution, not Central Control
    •  No centralized design or approval
    o  Most technology decisions made locally instead of globally
    •  Build services as needed
    o  Create / extract new services when needed to solve a problem
    o  Services justify their continued existence through usage
    o  Deprecate services when no longer used
    •  Appearance of clean layering is an emergent property

  21. “Every service at Google is
    either deprecated or not ready
    yet.”
    -- Google engineering proverb

  22. Standardization
    •  Standardized communication
    o  Network protocols
    o  Data formats
    o  Interface schema / specification
    •  Standardized infrastructure
    o  Source control
    o  Configuration management
    o  Cluster management
    o  Monitoring, alerting, diagnosing, etc.

  23. “Enforcing” Standardization
    •  Encouraged via
    o  Libraries
    o  Support in underlying services
    o  Code reviews
    o  Searchable code

  24. The easiest way to “enforce” a
    standard practice is with
    working code.

  25. In a healthy service ecosystem,
    standards become standards
    by being better than the
    alternatives.

  26. Service Relationships
    •  Vendor – Customer Relationship
    o  Friendly and cooperative, but structured
    o  Clear ownership and division of responsibility
    •  Customer Focus
    o  Value of service comes from its value to its customers
    •  Customer can choose to use service or not (!)
    o  Must be strictly better than the alternatives of build, buy, borrow

  27. Service Relationships
    •  Competition keeps system healthy
    o  If a common service is no longer useful, we can redeploy its resources
    •  Incentives are aligned
    o  Usage is an objective measure of a service’s success

  28. Service Architecture
    •  Service Design
    •  Service Ecosystem
    •  Service Operation and Deployment

  29. Goals of a Service Owner
    •  Meet the needs of my clients …
    o  Functionality
    o  Quality
    o  Performance
    o  Stability and reliability
    o  Constant improvement over time
    •  … at minimum cost and effort
    o  Leverage common tools and infrastructure
    o  Leverage other services
    o  Automate building, deploying, and operating my service
    o  Optimize for efficient use of resources

  30. Bounded Context
    •  Primary focus on my service
    o  Clients which depend on my service
    o  Services which my service depends on
    •  Very little worry about
    o  The complete ecosystem
    o  The underlying infrastructure
    •  Cognitive load is very bounded
    •  → Small, nimble service teams
    [Diagram: Service with Client A, Client B, Client C]

  31. Incremental Change
    •  Decompose every change into incremental steps
    •  Each step maintains backward / forward compatibility of data and interfaces
    •  Multiple service versions commonly coexist
    o  Every change is a rolling upgrade
    o  Transitional states are normal, not exceptional
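
One common way to keep data compatible while several service versions coexist is a tolerant reader: default the fields an older writer omits and ignore the fields a newer writer adds. The sketch below assumes a hypothetical JSON profile payload; the `locale` field, its default, and `read_profile` are illustrative only.

```python
import json


def read_profile(payload: str) -> dict:
    """Parse a profile written by an older or newer version of the service."""
    raw = json.loads(payload)
    return {
        "name": raw["name"],
        # Backward compatibility: older writers never set "locale".
        "locale": raw.get("locale", "en"),
        # Forward compatibility: fields this version does not know about
        # are simply not copied, so newer writers do not break this reader.
    }
```
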

  32. Example: Data Migration
    •  Requires dual data processing and storage (“dual writes”)
    •  Careful set of transitional steps
    o  A → A | B → B | A → B
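
The transitional steps can be sketched as a small state machine. This reads "A | B" as "A is primary, B receives dual writes" and "B | A" as the reverse, which is an interpretation rather than something the slide states. `Phase` and `MigratingStore` are hypothetical names, and the backfill of data written before dual writes began is deliberately left out.

```python
from enum import Enum


class Phase(Enum):
    A_ONLY = "A"
    A_PRIMARY_DUAL = "A | B"
    B_PRIMARY_DUAL = "B | A"
    B_ONLY = "B"


class MigratingStore:
    """Routes reads and writes between an old store (A) and a new store (B)."""

    def __init__(self, phase: Phase) -> None:
        self.phase = phase
        self.store_a: dict = {}
        self.store_b: dict = {}

    def write(self, key, value) -> None:
        # Write to A in every phase except the final one.
        if self.phase is not Phase.B_ONLY:
            self.store_a[key] = value
        # Write to B in every phase except the initial one.
        if self.phase is not Phase.A_ONLY:
            self.store_b[key] = value

    def read(self, key):
        # Reads always go to whichever store is primary in this phase.
        if self.phase in (Phase.A_ONLY, Phase.A_PRIMARY_DUAL):
            return self.store_a[key]
        return self.store_b[key]
```

Each phase change is a small, reversible step, so the system can be rolled back to the previous phase if the new store misbehaves.
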

  33. Service Deployment
    •  Automated Pipeline
    o  Push-button, repeatable pipeline of build, test, deploy
    •  Independent Deployment
    o  Each service deployed independently, *not* as a group
    •  Incremental Deployment
    o  Canary testing
    o  Staged rollouts by %
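
A staged rollout by percentage is often implemented by hashing a stable identifier into a bucket. The sketch below shows one such approach; the talk does not say how it is done at any particular company, and `in_rollout` and the `user_id` parameter are hypothetical.

```python
import hashlib


def in_rollout(user_id: str, percent: int) -> bool:
    """Return True if this user falls inside the first `percent` of buckets."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Map the hash to a stable bucket in the range 0..99.
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent
```

Because the bucket depends only on the identifier, a user who is included at 25% stays included as the rollout widens to 50% and beyond.
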

  34. Service Deployment
    •  “Feature Flags”
    o  Decouple code deployment from feature deployment
    o  Rapidly turn on / off features without redeploying code
    o  Typically deploy with feature turned off, then turn on as a separate step
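
A minimal feature-flag sketch, assuming a hypothetical in-memory `FeatureFlags` store; a real system would typically read flags from configuration or a flag service. The `free_shipping` flag and the flat 500-cent shipping fee are invented for the example.

```python
from __future__ import annotations


class FeatureFlags:
    """In-memory flag store; flags are off unless explicitly enabled."""

    def __init__(self) -> None:
        self._enabled: set[str] = set()

    def enable(self, name: str) -> None:
        self._enabled.add(name)

    def disable(self, name: str) -> None:
        self._enabled.discard(name)

    def is_enabled(self, name: str) -> bool:
        return name in self._enabled


def checkout_total(subtotal_cents: int, flags: FeatureFlags) -> int:
    # The new code path ships dark and is turned on as a separate step.
    if flags.is_enabled("free_shipping"):
        return subtotal_cents
    return subtotal_cents + 500
```

The code for the new behavior is deployed with the flag off; enabling or disabling it later needs no redeploy.
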

  35. Service Architecture
    •  Service Design
    •  Service Ecosystem
    •  Service Operation and Deployment

  36. Thank You!
    •  @randyshoup
    •  linkedin.com/in/randyshoup
