
H2O Aunt Sally


This was the first deck I put together to socialise the core ideas of building a new version of the Hailo software stack.

Our CTO (Rorie) had asked me to think about this, with the brief: "Allow Hailo to scale the business along three axes: adding features to our current business, adding cities, and brand new stuff."

The last modified date on this is 15th May 2013. I gave an update to the board in November 2013 where I talked about key milestones: the first commit was on May 31st, and by November 8th we had the entire job flow working, integrated with a new driver app and powered by around 60 microservices!

Dave Gardner

May 15, 2013

Transcript

  1. H2O Aunt Sally


  2. Aunt Sally is a traditional throwing game.

    The term is often used metaphorically to mean something that is a target for
    criticism.


  3. 0 / Ambitions
    Provide a simple framework for us to build an efficient, resilient,
    second generation Hailo (aka h20)
    Allow Hailo to scale the business along three axes: adding features
    to our current business, adding cities and brand new stuff
    Solve pain points in our current architecture

    Be productive


  4. 0 / Logical design
    API tier
    Orchestration tier
    Services tier
    Data storage tier
    • Exposed to the outside world
    • All clients talk to this layer, via HTTP
    • Permitted to talk to orchestration and
    services tiers


  5. 0 / Logical design
    API tier
    Orchestration tier
    Services tier
    Data storage tier
    • Services that coordinate flows across
    multiple services, in a reusable way
    • An example would be registration
    where a record is created via one
    service and then a confirmation email
    sent via another
    • Permitted to talk to the service tier
    and able to leverage messaging
    solutions
    • Should never talk to data storage and
    API tiers


  6. 0 / Logical design
    API tier
    Orchestration tier
    Services tier
    Data storage tier
    • Core “leaf” services
    • An example would be a customer
    service that does CRUD on customer
    records
    • Permitted to talk to the data storage
    tier
    • Should never talk to any other services
    in any tier (including other core
    services)


  7. 0 / Logical design
    API tier
    Orchestration tier
    Services tier
    Data storage tier
    • Storage engines such as Cassandra
    • Only accessible from core services
    • Services never share state (database),
    so each core service is solely
    responsible for its own data access


  8. 0 / Physical design
    [Diagram: client ↔ messaging system ↔ service, with protobuf on the wire in both directions]


  9. 1 / Feature
    Feature: retrieve details on customer 12313500143
    name service.customer.retrieve
    request message CustomerId {
    required uint64 id = 1;
    }
    response message Customer {
    required uint64 id = 1;
    required string locale = 2;
    }


  10. 1 / Feature
    import (
        "github.com/hailocab/goservice"
        msg "path/to/compiled/protobuf/retrieve"
    )
    func main() {
        goservice.Register(ns, reqDef, rspDef, func(req *msg.CustomerId) *msg.Customer {
            // do stuff here and build a *msg.Customer to return
        })
    }


  11. 1 / Feature
    • Global “endpoint” namespace in the form /{tier}/{service}/{foo}/{bar}…
    • Action-oriented namespace; there is no REST here thank you
    • Every feature endpoint must define a single request/response pair, in
    Protobuf
    • Interface is a simple function which accepts request object and returns
    response object
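The endpoint contract above — a global namespace mapping to exactly one request/response handler — can be sketched as a toy registry. This is not the real goservice API (which is internal to Hailo); maps stand in for compiled protobuf messages, and all names here are illustrative.

```go
package main

import "fmt"

// Handler is the shape of a feature endpoint: a single request in, a single
// response out. Maps stand in for compiled protobuf messages in this sketch.
type Handler func(req map[string]string) map[string]string

// Registry maps global endpoint namespaces (e.g. "service.customer.retrieve")
// to the one handler that owns each namespace.
type Registry struct {
	handlers map[string]Handler
}

func NewRegistry() *Registry {
	return &Registry{handlers: make(map[string]Handler)}
}

// Register claims a namespace; each endpoint defines a single
// request/response pair, so double registration is an error.
func (r *Registry) Register(ns string, h Handler) error {
	if _, exists := r.handlers[ns]; exists {
		return fmt.Errorf("namespace %q already registered", ns)
	}
	r.handlers[ns] = h
	return nil
}

// Dispatch routes a request to the handler owning the namespace.
func (r *Registry) Dispatch(ns string, req map[string]string) (map[string]string, error) {
	h, ok := r.handlers[ns]
	if !ok {
		return nil, fmt.Errorf("no handler for namespace %q", ns)
	}
	return h(req), nil
}

func main() {
	reg := NewRegistry()
	reg.Register("service.customer.retrieve", func(req map[string]string) map[string]string {
		return map[string]string{"id": req["id"], "locale": "en_GB"}
	})
	rsp, err := reg.Dispatch("service.customer.retrieve", map[string]string{"id": "12313500143"})
	if err != nil {
		panic(err)
	}
	fmt.Println(rsp["locale"]) // en_GB
}
```

The action-oriented namespace keeps routing trivial: the router only needs a string lookup, never any REST-style URL pattern matching.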


  12. 2 / Package
    A small collection of features; a micro-service.
    • create new customer
    • retrieve details on customer
    • update customer
    • delete customer


  13. 3 / Provision
    [Diagram: delivery pipeline: GitHub → build binary → tag → stage → QA → deploy]


  14. 3 / Provision
    • Fully automated from start to finish; “push-button” system
    • Same binary that goes through QA is deployed to production
    • Push-based deployment, uneven distribution (not every package has to
    run on every server)


  15. 4 / Register
    [Diagram: instances of pkg A on Server 1, Server 2 and Server 3, each registering with the message bus: "I am here and ready for messages"]


  16. 4 / Register
    [Diagram: instances of pkg A on Server 1, Server 2 and Server 3, each registering with ZooKeeper by creating an ephemeral node: "I am running on this server"]


  17. 4 / Register
    • Zookeeper keeps track of every package running
    • Registration via ZK ephemeral node, so a failed instance is
    automatically removed
    • Information in ZK includes version number
    • Message bus keeps track of connected “workers” (packages)
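The ephemeral-node mechanism is what makes failure handling automatic: a node lives only as long as the session that created it. A minimal in-memory stand-in for that behaviour (not a real ZooKeeper client — paths, sessions and payloads here are illustrative):

```go
package main

import (
	"fmt"
	"sort"
)

// toyZK is an in-memory stand-in for ZooKeeper's ephemeral nodes: each node
// is tied to the session that created it, and vanishes when that session dies.
type toyZK struct {
	nodes map[string]string // path -> owning session ID
	data  map[string]string // path -> payload (e.g. host + version number)
}

func newToyZK() *toyZK {
	return &toyZK{nodes: make(map[string]string), data: make(map[string]string)}
}

// CreateEphemeral registers a node owned by the given session; a package
// would do this on startup to say "I am running on this server".
func (z *toyZK) CreateEphemeral(session, path, payload string) {
	z.nodes[path] = session
	z.data[path] = payload
}

// SessionExpired simulates a worker crash or disconnect: all of its
// ephemeral nodes disappear, which is how failed packages drop out of
// the registry with no manual cleanup.
func (z *toyZK) SessionExpired(session string) {
	for path, owner := range z.nodes {
		if owner == session {
			delete(z.nodes, path)
			delete(z.data, path)
		}
	}
}

// Children lists the registered paths, e.g. every live instance of a package.
func (z *toyZK) Children() []string {
	var paths []string
	for p := range z.nodes {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	return paths
}

func main() {
	zk := newToyZK()
	zk.CreateEphemeral("sess-1", "/pkgA/server1", "v1.2.0")
	zk.CreateEphemeral("sess-2", "/pkgA/server2", "v1.2.0")
	fmt.Println(zk.Children()) // both instances visible
	zk.SessionExpired("sess-1") // server1 dies
	fmt.Println(zk.Children()) // only server2 remains
}
```

Storing the version number in the payload is what later lets the router apply rules like "send more traffic to version A of this package".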


  18. 5 / Use
    [Diagram: client ↔ local router and broker, exchanging protobuf request and response]
    (Most things are a client)


  19. 5 / Use
    import (
        "github.com/hailocab/goservice"
        msg "path/to/compiled/protobuf/retrieve"
    )
    func main() {
        // blocking call
        rsp := goservice.Sync(ns, req)
        // two requests in flight at once, responses collected from one channel
        rspChan := make(chan *goservice.Response)
        goservice.Async(ns, req, rspChan)
        goservice.Async(otherNs, otherReq, rspChan)
        rsp1 := <-rspChan
        rsp2 := <-rspChan
    }
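Since goservice itself is internal, here is a self-contained sketch of the same Sync/Async pattern using plain goroutines and channels; `Response`, `call` and the namespaces are stand-ins, not the real API.

```go
package main

import "fmt"

// Response mimics a goservice-style response: a namespace plus a payload.
type Response struct {
	NS   string
	Body string
}

// call stands in for one request/response round trip over the message bus.
func call(ns, req string) Response {
	return Response{NS: ns, Body: "handled " + req}
}

// Sync blocks until the response arrives.
func Sync(ns, req string) Response {
	return call(ns, req)
}

// Async fires the request in a goroutine and delivers the response on
// rspChan, so several requests can be in flight at once and collected
// in whatever order they complete.
func Async(ns, req string, rspChan chan<- Response) {
	go func() {
		rspChan <- call(ns, req)
	}()
}

func main() {
	fmt.Println(Sync("service.customer.retrieve", "id=1").Body)

	rspChan := make(chan Response)
	Async("service.customer.retrieve", "id=1", rspChan)
	Async("service.order.retrieve", "id=2", rspChan)
	rsp1 := <-rspChan // responses arrive in completion order,
	rsp2 := <-rspChan // not request order
	fmt.Println(rsp1.NS != rsp2.NS)
}
```

Sharing one response channel across several Async calls is what makes fan-out cheap: the caller blocks once per expected response rather than once per request.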


  20. 5 / Use
    • Local broker/router running on every server
    • Clients (APIs, services) send their requests to the local router; it is then
    up to the router to decide which worker to send it to for processing
    • Router can make decisions based on:
    • properties of the request (mapping request namespace to the right
    service)
    • external configuration such as “send more traffic to version A of
    this package”
    • health and speed of connected workers
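A minimal sketch of that routing decision, assuming the router tracks per-worker health and an externally configured weight (the real router would also factor in observed latency and use weighted randomisation rather than a plain maximum):

```go
package main

import "fmt"

// worker is one connected package instance the local router can send to.
type worker struct {
	addr    string
	healthy bool
	weight  int // from external config, e.g. "send more traffic to version A"
}

// pick chooses the healthy worker with the highest weight.
func pick(workers []worker) (string, error) {
	best := -1
	var addr string
	for _, w := range workers {
		if w.healthy && w.weight > best {
			best = w.weight
			addr = w.addr
		}
	}
	if best < 0 {
		return "", fmt.Errorf("no healthy workers")
	}
	return addr, nil
}

func main() {
	workers := []worker{
		{addr: "server1:9001", healthy: true, weight: 10},
		{addr: "server2:9001", healthy: false, weight: 90}, // down, skipped
		{addr: "server3:9001", healthy: true, weight: 50},
	}
	addr, _ := pick(workers)
	fmt.Println(addr) // server3:9001
}
```

Because the decision lives in the router rather than in clients, traffic shaping (slide 24) becomes a pure configuration change.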


  21. 6 / Monitor
    • Each server exposes a single HTTP web service for querying health of
    packages deployed on that box, plus the health of the router/broker
    • Focus on pulling metrics and graphs together to form actionable
    intelligence, presented in the same place as fully automated
    “actions” (buttons to make changes)
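A sketch of that per-server health endpoint using Go's standard net/http; the package names and statuses are made up for illustration, and a monitoring system would poll this URL on every box.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// healthStatus gathers the status of each package deployed on this box
// plus the local router/broker; names and statuses here are illustrative.
func healthStatus() map[string]string {
	return map[string]string{
		"router":           "ok",
		"service.customer": "ok",
		"service.order":    "degraded",
	}
}

// healthHandler is the single HTTP endpoint each server would expose.
func healthHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(healthStatus())
}

func main() {
	// spin up a throwaway server and query it, as a monitoring poller would
	srv := httptest.NewServer(http.HandlerFunc(healthHandler))
	defer srv.Close()

	rsp, err := http.Get(srv.URL + "/health")
	if err != nil {
		panic(err)
	}
	defer rsp.Body.Close()

	var status map[string]string
	json.NewDecoder(rsp.Body).Decode(&status)
	fmt.Println(status["router"]) // ok
}
```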


  22. 6 / Monitor


  23. 7 / Debug
    • Unified logging format and full log aggregation
    • Inspect metrics and activity on individual deployed binaries via common
    tooling (CLI interface)
    `h20 list foo/bar`: list all active packages matching the name foo/bar
    `h20 stats`: get stats for a single specific package running on a single
    server


  24. 8 / Control
    • Real-time control over the network of routers/brokers
    • Manual traffic shaping to avoid a specific machine
    • A/B test two versions of the same package
    • Gradual rollout of a new version of a package, shifting all traffic to
    the new version and then retiring the old
    • Automatic monitoring of provisioning and launching of binaries
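The gradual-rollout idea reduces to a traffic split the router consults per request. A deterministic sketch, with hypothetical version names (A/B testing is the same mechanism with the split held fixed while results are compared):

```go
package main

import (
	"fmt"
	"math/rand"
)

// rollout splits traffic between two versions of a package; nudging pct
// towards 100 gradually shifts all traffic to the new version, after
// which the old one can be retired.
type rollout struct {
	oldVersion, newVersion string
	pct                    int // percentage of requests sent to newVersion
}

// pickFor chooses a version for a request given a roll in [0, 100).
func (r rollout) pickFor(roll int) string {
	if roll < r.pct {
		return r.newVersion
	}
	return r.oldVersion
}

func main() {
	r := rollout{oldVersion: "pkgA-v1", newVersion: "pkgA-v2", pct: 10}
	counts := map[string]int{}
	for i := 0; i < 1000; i++ {
		counts[r.pickFor(rand.Intn(100))]++
	}
	// roughly 10% of traffic lands on the new version
	fmt.Println(counts["pkgA-v1"] > counts["pkgA-v2"] && counts["pkgA-v2"] > 0)
}
```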


  25. 9 / Summary
    Simple concepts, reused everywhere, all the way down the stack
    • Make my request → protobuf
    • Send it to this thing → apps talk to a single API, everything else to
    the local broker
    • Get back my response → protobuf
    [Diagram: Request → Broker → Response]
