You're Good to Go.

mattheath
September 09, 2014

As Hailo expanded from a European to a global business, we needed to re-evaluate our approach to technology. Moving away from a traditional monolithic PHP/Java application and embracing a cloud-native approach, Hailo transitioned to a new microservice platform built almost entirely in Go.

Transcript

  1. YOU’RE GOOD TO GO.
    Matt Heath - Technical Lead, Platform
    @mattheath
    Go London User Group, Sep 2014

  2. FROM BARCELONA TO BOSTON,
    FROM TOKYO TO TORONTO.

  3. [Diagram: H1 architecture in eu-west-1. Components shown: two load balancers, PHP Driver API (×3), PHP Cust API (×3), Java Hailo Engine, MySQL (×2), Redis.]

  4. [Diagram: H1.5 architecture. Three groups, each showing an ELB, PHP Driver API, Java Hailo Engine and MySQL, with region labels eu-west-1, eu-west-1 and us-east-1. Also shown: PHP Cust API (×3), PHP Cust Service, PHP Credits Service, Java Pay Service, an ELB, and C* (×3).]

  5. DEVELOPING NEW FEATURES
    UNCLEAR RESPONSIBILITIES
    SPOFS WITH SLOW FAILOVER
    LACK OF AUTOMATION

  6. “construct a highly agile and
    highly available service from
    ephemeral and assumed broken
    components”
    Adrian Cockcroft

  7. [Diagram: H2 architecture, shown for both us-east-1 and eu-west-1. Each region shows an ELB, a Go “Thin” API, a RabbitMQ Message Bus (federated clusters per AZ), Go Service (×2), Java Service, and C* (×3).]

  8. H2 Service
    [Diagram: Handler, Logic and Storage, with go-service-layer and go-platform-layer]
    Library for building services that talk Protobuf via RMQ
    Self-configuring external service adapters
    Services get for free:
    • Provisioning
    • Service discovery
    • Configuration
    • Monitoring
    • Authentication/authorisation
    • AB testing
    • Self-configuring connectivity to third-party services

  9. [Diagram: CI Pipeline (Janky/Jenkins), Amazon S3, Provisioning Manager, RabbitMQ, and three Provisioning Service instances]

  10. AUTOMATIC SERVICE DISCOVERY
    [Diagram: two Provisioning Services, RabbitMQ, Discovery Service, Binding Service, and a New Service]

  11. SMALL INDEPENDENT SERVICES
    SINGLE RESPONSIBILITY
    EASE PAIN, SCALE RAPIDLY

  12. CLOUD NATIVE / ANTIFRAGILE
    EXPECT FAILURE
    AUTOMATE EVERYTHING

  13. POINT IN POLYGONS
    IN-MEMORY CACHE, ATOMIC RCU UPDATES
    NON BLOCKING
    <200µs SEARCH TIME, 4ms ROUND TRIP
    ZONING SERVICE
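
A minimal Go sketch of the two ideas named on this slide, a point-in-polygon test and a read-copy-update style cache swap, might look like the following. The `Point`, `Zone` and `ZoneCache` types, the ray-casting test and the use of `sync/atomic.Value` are all assumptions made for illustration; the slides do not show how the zoning service is actually implemented.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Point is a coordinate pair (hypothetical type for this sketch).
type Point struct{ X, Y float64 }

// Zone is a named polygon (hypothetical type for this sketch).
type Zone struct {
	Name    string
	Polygon []Point
}

// contains reports whether p lies inside the polygon, using ray casting:
// count how many edges a horizontal ray from p crosses.
func (z Zone) contains(p Point) bool {
	inside := false
	n := len(z.Polygon)
	for i, j := 0, n-1; i < n; j, i = i, i+1 {
		a, b := z.Polygon[i], z.Polygon[j]
		if (a.Y > p.Y) != (b.Y > p.Y) &&
			p.X < (b.X-a.X)*(p.Y-a.Y)/(b.Y-a.Y)+a.X {
			inside = !inside
		}
	}
	return inside
}

// ZoneCache keeps an immutable snapshot of all zones behind an atomic value.
type ZoneCache struct {
	snapshot atomic.Value // always holds a []Zone
}

// NewZoneCache builds a cache with an initial set of zones.
func NewZoneCache(zones []Zone) *ZoneCache {
	c := &ZoneCache{}
	c.Update(zones)
	return c
}

// Update copies the zones and publishes the copy in one atomic store,
// so readers see either the old snapshot or the new one, never a mix.
func (c *ZoneCache) Update(zones []Zone) {
	fresh := make([]Zone, len(zones))
	copy(fresh, zones)
	c.snapshot.Store(fresh)
}

// Search returns the name of the first zone containing p. It takes no lock.
func (c *ZoneCache) Search(p Point) (string, bool) {
	for _, z := range c.snapshot.Load().([]Zone) {
		if z.contains(p) {
			return z.Name, true
		}
	}
	return "", false
}

func main() {
	square := []Point{{0, 0}, {4, 0}, {4, 4}, {0, 4}}
	cache := NewZoneCache([]Zone{{Name: "central", Polygon: square}})
	fmt.Println(cache.Search(Point{2, 2}))
	fmt.Println(cache.Search(Point{5, 5}))
}
```

Readers never block in this sketch: `Search` works on whichever snapshot is current, while `Update` prepares a new slice off to the side and swaps it in.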

  14. UNIQUE ID GENERATION
    77669839702851584
    41 bits Timestamp, ms precision, bespoke epoch
    10 bits Configured machine ID
    12 bits Sequence number
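
The three fields above add up to 63 bits, so an ID fits in a signed 64-bit integer. The sketch below packs and unpacks IDs with those widths. The epoch value, the field order (timestamp in the high bits), the `Generator` type and the way it handles sequence overflow and clock steps are assumptions; the slide gives only the field widths.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

const (
	machineBits  = 10
	sequenceBits = 12
	maxMachineID = 1<<machineBits - 1
	maxSequence  = 1<<sequenceBits - 1
)

// epoch is a hypothetical value; the slide says only "bespoke epoch".
var epoch = time.Date(2014, time.January, 1, 0, 0, 0, 0, time.UTC)

// Generator hands out increasing IDs for one configured machine ID.
type Generator struct {
	mu        sync.Mutex
	machineID int64
	lastMs    int64
	sequence  int64
}

// NewGenerator validates that the machine ID fits in 10 bits.
func NewGenerator(machineID int64) (*Generator, error) {
	if machineID < 0 || machineID > maxMachineID {
		return nil, errors.New("machine ID out of range")
	}
	return &Generator{machineID: machineID}, nil
}

// Next returns a new ID that is larger than any ID returned before.
func (g *Generator) Next() int64 {
	g.mu.Lock()
	defer g.mu.Unlock()

	ms := int64(time.Since(epoch) / time.Millisecond)
	if ms <= g.lastMs {
		// Same millisecond, or the clock stepped back: bump the sequence.
		ms = g.lastMs
		g.sequence = (g.sequence + 1) & maxSequence
		if g.sequence == 0 {
			ms++ // sequence exhausted: borrow the next millisecond
		}
	} else {
		g.sequence = 0
	}
	g.lastMs = ms
	return Compose(ms, g.machineID, g.sequence)
}

// Compose packs the three fields into one integer.
func Compose(ms, machineID, sequence int64) int64 {
	return ms<<(machineBits+sequenceBits) | machineID<<sequenceBits | sequence
}

// Decompose splits an ID back into its three fields.
func Decompose(id int64) (ms, machineID, sequence int64) {
	return id >> (machineBits + sequenceBits),
		(id >> sequenceBits) & maxMachineID,
		id & maxSequence
}

func main() {
	gen, err := NewGenerator(7)
	if err != nil {
		panic(err)
	}
	id := gen.Next()
	ms, machine, seq := Decompose(id)
	fmt.Println(id, ms, machine, seq)
}
```

Because the timestamp occupies the high bits, IDs from one generator sort roughly by creation time, and no coordination between machines is needed as long as machine IDs are distinct.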

  15. DISTRIBUTED, HA, GEOSPATIAL SEARCH ENGINE
    INGESTS GPS UPDATES - HIGH WRITE VOLUME
    LOW LATENCY READS / SEARCHES (<5ms)
    RAZIEL

  16. INFRASTRUCTURE

  17. AUTOMATION
    GO SERVICES
    GOAMZ

  18. [Diagram: Environment Service and Whisper Service; “Meta”, “Test”, “Staging” and “Boop” environments; Compute, Elastic IP Allocation, Autoscaling, DNS, Configuration Updating, Security Groups, SNS]

  19. LOAD
    FAILURE
    DEGRADATION

  20. 15,000 JOBS/HOUR
    20,000 DRIVERS
    4,000+ REQ/S

  21. CONTINUOUS
    PRODUCTION
    TESTING

  22. DISTRIBUTED

    TRACING

  23. [Diagram: hailo-2-api, api.v1.customer, service.customer]

  24. [Diagram: hailo-2-api, api.v1.customer, service.customer, with the hops between them labelled REQ, REP, IN and OUT]

  25. {
      "timestamp": 1410262798427145176,
      "traceId": "d30479b8-1491-4390-7cf5-4cd14bc4b765",
      "type": "OUT",
      "messageId": "a661f9ef-774c-49b2-6e74-cfed65f7d120",
      "parentMessageId": "",
      "from": "com.hailocab.hshell",
      "to": "com.hailocab.service.nearest-driver.search",
      "hostname": "ip-10-13-2-251",
      "az": "eu-west-1a",
      "handlerInstanceId": "server-com.hailocab.service.nearest-driver-18bd089e-8ef1-4ca1-75cb-8...c",
      "duration": 11222094
    }
    {
      "timestamp": 1410262798416053450,
      "traceId": "d30479b8-1491-4390-7cf5-4cd14bc4b765",
      "type": "REQ",
      "messageId": "6404dd1e-c995-48a9-73dc-9edb1380f0bf",
      "parentMessageId": "a661f9ef-774c-49b2-6e74-cfed65f7d120",
      "from": "com.hailocab.service.nearest-driver",
      "to": "com.hailocab.service.zoning.search",
      "hostname": "ip-10-13-2-251",
      "az": "eu-west-1a"
    }

  26. [Diagram: hailo-2-api, api.v1.customer, service.customer, with labels REQ, RSP, IN, OUT and NSQ]

  27. // Req sends a request, and...
    func (c *Client) Req(req *Request, rsp proto.Message,
        options ...Options) errors.Error {

        go c.traceReq(req)
        responseMsg, err := c.doReq(req, options...)
        go c.traceRsp(req, responseMsg, err)
        if err != nil {
            return err
        }

        // Do other things...

        return nil
    }

  28. // instrumentedHandler wraps the handler to provide instrumentation
    func (ep *Endpoint) instrumentedHandler(req *Request) (proto.Message, errors.Error) {
        start := time.Now()
        var err errors.Error
        var msg proto.Message

        // Defer panic handling
        defer func() {
            stats.Record(ep, err, time.Since(start))
            // oh crap i hope this never happens
        }()

        // Execute handler
        go traceIn(req)
        msg, err = ep.Handler(req)
        go traceOut(req, msg, err, time.Since(start))
        return msg, err
    }

  29. // instrumentedHandler wraps the handler to provide instrumentation
    func (ep *Endpoint) instrumentedHandler(req *Request) (proto.Message, errors.Error) {
        start := time.Now()
        var err errors.Error
        var msg proto.Message

        // Defer panic handling
        defer func() {
            stats.Record(ep, err, time.Since(start))
            // oh crap i hope this never happens
        }()

        // Execute handler
        go traceIn(req) // Don't actually do this
        msg, err = ep.Handler(req)
        go traceOut(req, msg, err, time.Since(start))
        return msg, err
    }

  30. Phosphor
    [Diagram: host instances run Service A and Service B, each with a Trace Library (chan, goroutine) that publishes over UDP. Also shown: Trace Service, in-memory aggregates, optional persistent storage, dashboards, monitoring.]

  31. Phosphor
    [Diagram: host instances run Service A and Service B, each with a Trace Library (chan, goroutine) that publishes over UDP. Also shown: Trace Service, in-memory aggregates, optional persistent storage, dashboards, monitoring.]

  32. var traceChan chan []byte

    func init() {
        // Use a buffered channel
        traceChan = make(chan []byte, 200)

        // Fire off a background worker for this channel
        defaultClient = NewClient(traceChan)
        go defaultClient.publisher()
    }

    // Send, drops trace if the backend is at capacity
    func Send(msg []byte) {
        select {
        case traceChan <- msg:
            // Success
        default:
            // Default case fired if channel is full
            // Ensures this is non blocking
        }
    }

  33. Phosphor
    [Diagram: host instances run Service A and Service B, each with a Trace Library (chan, goroutine) that publishes over UDP. Also shown: Trace Service, in-memory aggregates, optional persistent storage, dashboards, monitoring.]

  34. func (w *worker) loop() {
        var b []byte

        // Spin and forward on traces every time our
        // buffer fills, or when our time window elapses
        for {
            select {
            case b = <-w.ch:
                w.buf = append(w.buf, b)
                if len(w.buf) >= bufferSize {
                    w.send()
                }
            case <-time.Tick(bufferWindow):
                w.send()
            }
        }
    }

  35. func (w *worker) loop() {
        var b []byte

        // Spin and forward on traces every time our
        // buffer fills, or when our time window elapses
        for {
            select {
            case b = <-w.ch:
                w.buf = append(w.buf, b)
                if len(w.buf) >= bufferSize {
                    w.send()
                }
            case <-time.Tick(bufferWindow): // Leaks goroutines
                w.send()
            }
        }
    }

  36. func (w *worker) loop() {
        var b []byte
        timeout := time.Tick(bufferWindow)

        // Spin and forward on traces every time our
        // buffer fills, or when our time window elapses
        for {
            select {
            case b = <-w.ch:
                w.buf = append(w.buf, b)
                if len(w.buf) >= bufferSize {
                    w.send()
                }
            case <-timeout:
                w.send()
            }
        }
    }

  37. func (w *worker) loop() {
        var b []byte
        timeout := time.NewTicker(bufferWindow)
        defer timeout.Stop() // Bonus points

        // Spin and forward on traces every time our
        // buffer fills, or when our time window elapses
        for {
            select {
            case b = <-w.ch:
                w.buf = append(w.buf, b)
                if len(w.buf) >= bufferSize {
                    w.send()
                }
            case <-timeout.C:
                w.send()
            }
        }
    }

  38. Tracing: 33eda743-f124-435c-71fc-3c872bbc98e6
    2014-09-07 02:20:19.867 [/] [START] → -
    2014-09-07 02:20:19.867 [eu-west-1a/ip-10-11-3-51] [REQ] com.hailocab.hailo-2-api → com.hailocab.api.v1.customer.neardrivers -
    2014-09-07 02:20:19.867 [eu-west-1a/ip-10-11-2-203] [IN] com.hailocab.hailo-2-api → com.hailocab.api.v1.customer.neardrivers -
    2014-09-07 02:20:19.868 [eu-west-1a/ip-10-11-2-203] [REQ] com.hailocab.api.v1.customer → com.hailocab.service.feature-flags.features -
    2014-09-07 02:20:19.869 [eu-west-1a/ip-10-11-3-111] [IN] com.hailocab.api.v1.customer → com.hailocab.service.feature-flags.features -
    2014-09-07 02:20:19.876 [eu-west-1a/ip-10-11-3-111] [REQ] com.hailocab.service.feature-flags → com.hailocab.service.hob.list -
    2014-09-07 02:20:19.877 [eu-west-1a/ip-10-11-3-168] [IN] com.hailocab.service.hob → com.hailocab.service.config.compile -
    2014-09-07 02:20:19.877 [eu-west-1a/ip-10-11-3-111] [IN] com.hailocab.service.feature-flags → com.hailocab.service.hob.list -
    2014-09-07 02:20:19.877 [eu-west-1a/ip-10-11-3-111] [REQ] com.hailocab.service.hob → com.hailocab.service.config.compile -
    2014-09-07 02:20:19.883 [eu-west-1a/ip-10-11-3-168] [OUT] com.hailocab.service.hob → com.hailocab.service.config.compile - 5.59 ms
    2014-09-07 02:20:19.886 [eu-west-1a/ip-10-11-3-111] [REP] com.hailocab.service.hob → com.hailocab.service.config.compile - 8.40 ms
    2014-09-07 02:20:19.887 [eu-west-1a/ip-10-11-3-111] [OUT] com.hailocab.service.feature-flags → com.hailocab.service.hob.list - 9.72 ms
    2014-09-07 02:20:19.889 [eu-west-1a/ip-10-11-3-111] [REP] com.hailocab.service.feature-flags → com.hailocab.service.hob.list - 13.23 ms
    2014-09-07 02:20:19.889 [eu-west-1a/ip-10-11-3-111] [OUT] com.hailocab.api.v1.customer → com.hailocab.service.feature-flags.features - 20.58 ms
    2014-09-07 02:20:19.890 [eu-west-1a/ip-10-11-2-203] [REP] com.hailocab.api.v1.customer → com.hailocab.service.feature-flags.features - 22.59 ms
    2014-09-07 02:20:19.902 [eu-west-1a/ip-10-11-2-203] [REQ] com.hailocab.api.v1.customer → com.hailocab.service.fare.basefare -
    2014-09-07 02:20:19.903 [eu-west-1a/ip-10-11-2-203] [REQ] com.hailocab.api.v1.customer → com.hailocab.service.fare.basefare -
    2014-09-07 02:20:19.903 [eu-west-1a/ip-10-11-2-203] [REQ] com.hailocab.api.v1.customer → com.hailocab.service.fare.basefare -
    2014-09-07 02:20:19.904 [eu-west-1a/ip-10-11-3-111] [IN] com.hailocab.api.v1.customer → com.hailocab.service.fare.basefare -
    2014-09-07 02:20:19.904 [eu-west-1a/ip-10-11-3-111] [OUT] com.hailocab.api.v1.customer → com.hailocab.service.fare.basefare - 0.36 ms
    2014-09-07 02:20:19.905 [eu-west-1a/ip-10-11-2-203] [REP] com.hailocab.api.v1.customer → com.hailocab.service.fare.basefare - 1.97 ms
    2014-09-07 02:20:19.905 [eu-west-1a/ip-10-11-2-214] [IN] com.hailocab.api.v1.customer → com.hailocab.service.fare.basefare -
    2014-09-07 02:20:19.905 [eu-west-1a/ip-10-11-2-203] [REQ] com.hailocab.api.v1.customer → com.hailocab.service.nearest-driver.search -
    2014-09-07 02:20:19.905 [eu-west-1a/ip-10-11-2-214] [OUT] com.hailocab.api.v1.customer → com.hailocab.service.fare.basefare - 0.10 ms ERR - com.hailocab.service.fare.basefare: Missing config at xxx
    2014-09-07 02:20:19.906 [eu-west-1a/ip-10-11-2-214] [IN] com.hailocab.api.v1.customer → com.hailocab.service.fare.basefare -
    2014-09-07 02:20:19.906 [eu-west-1a/ip-10-11-2-214] [OUT] com.hailocab.api.v1.customer → com.hailocab.service.fare.basefare - 0.06 ms ERR - com.hailocab.service.fare.basefare: Missing config at xxx
    2014-09-07 02:20:19.907 [eu-west-1a/ip-10-11-3-58] [IN] com.hailocab.api.v1.customer → com.hailocab.service.nearest-driver.search -
    2014-09-07 02:20:19.907 [eu-west-1a/ip-10-11-3-58] [REQ] com.hailocab.service.nearest-driver → com.hailocab.service.zoning.search -
    2014-09-07 02:20:19.908 [eu-west-1a/ip-10-11-3-58] [IN] com.hailocab.service.nearest-driver → com.hailocab.service.zoning.search -
    2014-09-07 02:20:19.908 [eu-west-1a/ip-10-11-3-58] [OUT] com.hailocab.service.nearest-driver → com.hailocab.service.zoning.search - 0.20 ms
    2014-09-07 02:20:19.909 [eu-west-1a/ip-10-11-3-58] [REP] com.hailocab.service.nearest-driver → com.hailocab.service.zoning.search - 2.25 ms
    2014-09-07 02:20:19.909 [eu-west-1a/ip-10-11-3-58] [REQ] com.hailocab.service.nearest-driver → com.hailocab.service.raziel.multisearch -
    2014-09-07 02:20:19.912 [eu-west-1a/ip-10-11-3-227] [IN] com.hailocab.service.nearest-driver → com.hailocab.service.raziel.multisearch -
    2014-09-07 02:20:19.919 [eu-west-1a/ip-10-11-3-58] [REP] com.hailocab.service.nearest-driver → com.hailocab.service.raziel.multisearch - 9.46 ms
    2014-09-07 02:20:19.919 [eu-west-1a/ip-10-11-3-58] [REQ] com.hailocab.service.nearest-driver → com.hailocab.service.eta.multitraveltime -
    2014-09-07 02:20:19.919 [eu-west-1a/ip-10-11-3-227] [OUT] com.hailocab.service.nearest-driver → com.hailocab.service.raziel.multisearch - 7.58 ms
    2014-09-07 02:20:19.920 [eu-west-1a/ip-10-11-3-58] [IN] com.hailocab.service.nearest-driver → com.hailocab.service.eta.multitraveltime -
    2014-09-07 02:20:19.920 [eu-west-1a/ip-10-11-3-58] [OUT] com.hailocab.service.nearest-driver → com.hailocab.service.eta.multitraveltime - 0.06 ms
    2014-09-07 02:20:19.921 [eu-west-1a/ip-10-11-3-58] [REP] com.hailocab.service.nearest-driver → com.hailocab.service.eta.multitraveltime - 1.77 ms
    2014-09-07 02:20:19.921 [eu-west-1a/ip-10-11-3-58] [OUT] com.hailocab.api.v1.customer → com.hailocab.service.nearest-driver.search - 14.02 ms
    2014-09-07 02:20:19.921 [eu-west-1a/ip-10-11-2-203] [REP] com.hailocab.api.v1.customer → com.hailocab.service.nearest-driver.search - 15.48 ms
    2014-09-07 02:20:19.941 [eu-west-1a/ip-10-11-2-203] [REQ] com.hailocab.api.v1.customer → com.hailocab.service.experiment.readlastupdated -
    2014-09-07 02:20:19.945 [eu-west-1a/ip-10-11-2-214] [IN] com.hailocab.api.v1.customer → com.hailocab.service.experiment.readlastupdated -
    2014-09-07 02:20:19.947 [eu-west-1a/ip-10-11-2-214] [OUT] com.hailocab.api.v1.customer → com.hailocab.service.experiment.readlastupdated - 1.82 ms
    2014-09-07 02:20:19.947 [eu-west-1a/ip-10-11-2-203] [REP] com.hailocab.api.v1.customer → com.hailocab.service.experiment.readlastupdated - 6.01 ms
    2014-09-07 02:20:19.948 [eu-west-1a/ip-10-11-2-203] [OUT] com.hailocab.hailo-2-api → com.hailocab.api.v1.customer.neardrivers - 80.46 ms
    2014-09-07 02:20:19.950 [eu-west-1a/ip-10-11-3-51] [REP] com.hailocab.hailo-2-api → com.hailocab.api.v1.customer.neardrivers - 82.71 ms

  39. SMALL, SIMPLE
    EASY TO READ & LEARN
    CONCURRENCY
    DEPLOYMENT

  40. STILL LEARNING
    3RD PARTY LIBRARY SUPPORT
    PACKAGE MANAGEMENT

  41. TOOLING, TESTING,
    AUTOMATION,
    SCHEDULING, DOCKER,
    KUBERNETES

  42. THANKS!
    PS. We’re hiring!
    Go London User Group, Sep 2014

  43. IMAGE CREDITS
    HMS President: Chris Wainwright
    Clouds: sweetydarkdream.deviantart.com
    Orbital Ion Cannon: www.rom.ac
    Go Gophers: Renee French