You're Good to Go.

mattheath
September 09, 2014

As Hailo expanded from a European to a global business, we needed to re-evaluate our approach to technology. Moving away from a traditional monolithic PHP/Java application and embracing a cloud-native approach, Hailo transitioned to a new microservice platform built almost entirely in Go.

Transcript

  1. YOU’RE GOOD TO GO.
    Matt Heath - Technical Lead, Platform
    @mattheath
    Go London User Group, Sep 2014

  3. FROM BARCELONA TO BOSTON,
    FROM TOKYO TO TORONTO.

  5. H1 (architecture diagram)
    Region: eu-west-1
    Load Balancer x2
    PHP Driver API x3, PHP Cust API x3
    Java Hailo Engine
    MySQL x2, Redis

  6. H1.5 (architecture diagram)
    Regions: eu-west-1, eu-west-1, us-east-1
    Stack x3: ELB, PHP Driver API, Java Hailo Engine, MySQL
    PHP Cust API x3
    ELB; PHP Cust Service; PHP Credits Service; Java Pay Service
    C* x3

  7. CHALLENGES

  8. DEVELOPING NEW FEATURES
    UNCLEAR RESPONSIBILITIES
    SPOFS WITH SLOW FAILOVER
    LACK OF AUTOMATION

  10. “construct a highly agile and
    highly available service from
    ephemeral and assumed broken
    components”
    Adrian Cockcroft

  11. H2 (architecture diagram)
    Regions: us-east-1, eu-west-1
    In each region:
    ELB
    Go “Thin” API
    RabbitMQ Message Bus (federated clusters per AZ)
    Go Service x2, Java Service
    C* x3

  13. H2 Service
    Handler, Logic, Storage
    go-service-layer, go-platform-layer
    Library for building services that talk Protobuf via RMQ
    Self-configuring external service adapters
    Services get for free:
    • Provisioning
    • Service discovery
    • Configuration
    • Monitoring
    • Authentication/authorisation
    • AB testing
    • Self-configuring connectivity to third-party services

  14. H2

  17. Provisioning (diagram)
    CI Pipeline (Janky/Jenkins), Amazon S3
    Provisioning Manager
    RabbitMQ
    Provisioning Service x3

  18. AUTOMATIC SERVICE DISCOVERY
    Provisioning Service x2
    RabbitMQ
    Discovery Service, Binding Service
    New Service

  19. SMALL INDEPENDENT SERVICES
    SINGLE RESPONSIBILITY
    EASE PAIN, SCALE RAPIDLY

  20. CLOUD NATIVE / ANTIFRAGILE
    EXPECT FAILURE
    AUTOMATE EVERYTHING

  21. POINT IN POLYGONS
    IN-MEMORY CACHE, ATOMIC RCU UPDATES
    NON BLOCKING
    <200µs SEARCH TIME, 4ms ROUND TRIP
    ZONING SERVICE
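
The bullets above (point-in-polygon search over an in-memory cache with atomic, read-copy-update style swaps) can be sketched roughly as follows. This is a minimal illustration and not Hailo's zoning service: the `Point`, `Zone`, and `ZoneCache` types are hypothetical, the containment test is the textbook ray-casting algorithm, and the snapshot swap uses `sync/atomic.Value` so readers never take a lock.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Point is a coordinate pair (hypothetical type for this sketch).
type Point struct{ X, Y float64 }

// Zone is a named polygon (hypothetical type for this sketch).
type Zone struct {
	Name    string
	Polygon []Point
}

// contains reports whether p lies inside poly, using ray casting:
// count how many polygon edges a horizontal ray from p crosses.
func contains(poly []Point, p Point) bool {
	inside := false
	for i, j := 0, len(poly)-1; i < len(poly); j, i = i, i+1 {
		a, b := poly[i], poly[j]
		if (a.Y > p.Y) != (b.Y > p.Y) &&
			p.X < (b.X-a.X)*(p.Y-a.Y)/(b.Y-a.Y)+a.X {
			inside = !inside
		}
	}
	return inside
}

// ZoneCache holds an immutable snapshot of zones. Writers build a new
// snapshot and swap it in atomically; readers never block.
type ZoneCache struct {
	zones atomic.Value // always holds a []Zone once set
}

// Update copies the given zones and publishes them as the new snapshot.
func (c *ZoneCache) Update(zones []Zone) {
	snapshot := make([]Zone, len(zones))
	copy(snapshot, zones)
	c.zones.Store(snapshot)
}

// Search returns the name of the first zone containing p.
func (c *ZoneCache) Search(p Point) (string, bool) {
	zones, _ := c.zones.Load().([]Zone)
	for _, z := range zones {
		if contains(z.Polygon, p) {
			return z.Name, true
		}
	}
	return "", false
}

func main() {
	cache := &ZoneCache{}
	cache.Update([]Zone{{
		Name:    "square",
		Polygon: []Point{{0, 0}, {4, 0}, {4, 4}, {0, 4}},
	}})

	name, ok := cache.Search(Point{2, 2})
	fmt.Println(name, ok)
	name, ok = cache.Search(Point{5, 2})
	fmt.Println(name, ok)
}
```

A linear scan over zones is the simplest possible lookup; a production service would more likely use a spatial index, which the slide does not describe.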

  22. UNIQUE ID GENERATION
    77669839702851584
    41 bits Timestamp, ms precision, bespoke epoch
    10 bits Configured machine ID
    12 bits Sequence number
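
The 41/10/12-bit layout above can be packed into a single 63-bit integer. The sketch below is one way to do it, not the code from the talk: `Compose`, `Decompose`, `Generator`, and the epoch value are all hypothetical names and values chosen for illustration.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

const (
	machineBits  = 10
	sequenceBits = 12

	maxMachineID = 1<<machineBits - 1
	maxSequence  = 1<<sequenceBits - 1
)

// Compose packs a millisecond timestamp, machine ID and sequence number
// into one ID: timestamp in the high bits, sequence in the low 12 bits.
func Compose(ms, machine, seq int64) int64 {
	return ms<<(machineBits+sequenceBits) | machine<<sequenceBits | seq
}

// Decompose splits an ID back into its three fields.
func Decompose(id int64) (ms, machine, seq int64) {
	seq = id & maxSequence
	machine = (id >> sequenceBits) & maxMachineID
	ms = id >> (machineBits + sequenceBits)
	return
}

// Generator hands out increasing IDs for one machine.
type Generator struct {
	mu      sync.Mutex
	epochMs int64 // bespoke epoch, in Unix milliseconds
	machine int64
	lastMs  int64
	seq     int64
}

// NewGenerator creates a generator for the given epoch and machine ID.
func NewGenerator(epochMs, machine int64) *Generator {
	return &Generator{epochMs: epochMs, machine: machine & maxMachineID, lastMs: -1}
}

// now returns milliseconds elapsed since the bespoke epoch.
func (g *Generator) now() int64 {
	return time.Now().UnixNano()/int64(time.Millisecond) - g.epochMs
}

// Next returns the next ID, bumping the sequence within a millisecond.
func (g *Generator) Next() int64 {
	g.mu.Lock()
	defer g.mu.Unlock()

	ms := g.now()
	if ms < g.lastMs {
		ms = g.lastMs // clock went backwards: keep IDs increasing
	}
	if ms == g.lastMs {
		g.seq = (g.seq + 1) & maxSequence
		if g.seq == 0 {
			// Sequence exhausted for this millisecond: wait for the next.
			for ms <= g.lastMs {
				ms = g.now()
			}
		}
	} else {
		g.seq = 0
	}
	g.lastMs = ms
	return Compose(ms, g.machine, g.seq)
}

func main() {
	// Assumed epoch for illustration: 2014-01-01T00:00:00Z.
	gen := NewGenerator(1388534400000, 7)
	id := gen.Next()
	ms, machine, seq := Decompose(id)
	fmt.Println(id, ms, machine, seq)
}
```

Because the timestamp occupies the high bits, IDs from one generator sort by creation time, which is the usual motivation for this layout.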

  23. RAZIEL

  24. DISTRIBUTED, HA, GEOSPATIAL SEARCH ENGINE
    INGESTS GPS UPDATES - HIGH WRITE VOLUME
    LOW LATENCY READS / SEARCHES (<5ms)
    RAZIEL

  25. RAZIEL

  26. INFRASTRUCTURE

  27. AUTOMATION
    GO SERVICES
    GOAMZ

  28. Environment Service
    Environments: “Meta”, “Test”, “Staging”, “Boop”
    Compute
    Whisper Service
    Elastic IP Allocation, Autoscaling, DNS, Configuration Updating, Security Groups, SNS

  29. TESTING

  30. LOAD
    FAILURE
    DEGRADATION

  31. 15,000 JOBS/HOUR
    20,000 DRIVERS
    4,000+ REQ/S

  32. CONTINUOUS
    PRODUCTION
    TESTING

  33. TOOLING

  34. DISTRIBUTED TRACING

  35. hailo-2-api api.v1.customer service.customer
    hailo-2-api api.v1.customer service.customer

  36. hailo-2-api api.v1.customer service.customer
    hailo-2-api api.v1.customer service.customer
    REQ REP REQ REP
    IN OUT IN OUT

  37. {
      "timestamp": 1410262798427145176,
      "traceId": "d30479b8-1491-4390-7cf5-4cd14bc4b765",
      "type": "OUT",
      "messageId": "a661f9ef-774c-49b2-6e74-cfed65f7d120",
      "parentMessageId": "",
      "from": "com.hailocab.hshell",
      "to": "com.hailocab.service.nearest-driver.search",
      "hostname": "ip-10-13-2-251",
      "az": "eu-west-1a",
      "handlerInstanceId": "server-com.hailocab.service.nearest-driver-18bd089e-8ef1-4ca1-75cb-8...c",
      "duration": 11222094
    }
    {
      "timestamp": 1410262798416053450,
      "traceId": "d30479b8-1491-4390-7cf5-4cd14bc4b765",
      "type": "REQ",
      "messageId": "6404dd1e-c995-48a9-73dc-9edb1380f0bf",
      "parentMessageId": "a661f9ef-774c-49b2-6e74-cfed65f7d120",
      "from": "com.hailocab.service.nearest-driver",
      "to": "com.hailocab.service.zoning.search",
      "hostname": "ip-10-13-2-251",
      "az": "eu-west-1a"
    }
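
Trace events shaped like the two JSON documents above could be decoded in Go along these lines. The `TraceEvent` struct and its helpers are hypothetical; the field names come from the slide, while treating `timestamp` as Unix nanoseconds and `duration` as nanoseconds is an assumption based on the magnitudes shown.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// TraceEvent mirrors the JSON fields shown on the slide (hypothetical type).
type TraceEvent struct {
	Timestamp         int64  `json:"timestamp"` // assumed: Unix nanoseconds
	TraceID           string `json:"traceId"`
	Type              string `json:"type"` // REQ, REP, IN or OUT
	MessageID         string `json:"messageId"`
	ParentMessageID   string `json:"parentMessageId"`
	From              string `json:"from"`
	To                string `json:"to"`
	Hostname          string `json:"hostname"`
	AZ                string `json:"az"`
	HandlerInstanceID string `json:"handlerInstanceId,omitempty"`
	Duration          int64  `json:"duration,omitempty"` // assumed: nanoseconds
}

// ParseTraceEvent decodes a single JSON-encoded trace event.
func ParseTraceEvent(data []byte) (TraceEvent, error) {
	var ev TraceEvent
	err := json.Unmarshal(data, &ev)
	return ev, err
}

// Time converts the raw timestamp to a time.Time.
func (e TraceEvent) Time() time.Time { return time.Unix(0, e.Timestamp) }

// Elapsed converts the raw duration to a time.Duration.
func (e TraceEvent) Elapsed() time.Duration { return time.Duration(e.Duration) }

func main() {
	raw := []byte(`{
		"timestamp": 1410262798416053450,
		"traceId": "d30479b8-1491-4390-7cf5-4cd14bc4b765",
		"type": "REQ",
		"messageId": "6404dd1e-c995-48a9-73dc-9edb1380f0bf",
		"parentMessageId": "a661f9ef-774c-49b2-6e74-cfed65f7d120",
		"from": "com.hailocab.service.nearest-driver",
		"to": "com.hailocab.service.zoning.search",
		"hostname": "ip-10-13-2-251",
		"az": "eu-west-1a"
	}`)

	ev, err := ParseTraceEvent(raw)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(ev.Type, ev.From, "->", ev.To, "at", ev.Time().UTC())
}
```

The `parentMessageId` of the second event matches the `messageId` of the first, which is what lets a collector rebuild the call tree for one `traceId`.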

  38. hailo-2-api api.v1.customer service.customer
    REQ RSP REQ RSP
    IN OUT IN OUT
    NSQ

  39. // Req sends a request, and...
    func (c *Client) Req(req *Request, rsp proto.Message,
        options ...Options) errors.Error {

        go c.traceReq(req)
        responseMsg, err := c.doReq(req, options...)
        go c.traceRsp(req, responseMsg, err)
        if err != nil {
            return err
        }

        // Do other things...

        return nil
    }

  40. // instrumentedHandler wraps the handler to provide instrumentation
    func (ep *Endpoint) instrumentedHandler(req *Request) (proto.Message, errors.Error) {

        start := time.Now()
        var err errors.Error
        var msg proto.Message

        // Defer panic handling
        defer func() {
            stats.Record(ep, err, time.Since(start))
            // oh crap i hope this never happens
        }()

        // Execute handler
        go traceIn(req)
        msg, err = ep.Handler(req)
        go traceOut(req, msg, err, time.Since(start))
        return msg, err
    }

  41. // instrumentedHandler wraps the handler to provide instrumentation
    func (ep *Endpoint) instrumentedHandler(req *Request) (proto.Message, errors.Error) {

        start := time.Now()
        var err errors.Error
        var msg proto.Message

        // Defer panic handling
        defer func() {
            stats.Record(ep, err, time.Since(start))
            // oh crap i hope this never happens
        }()

        // Execute handler
        go traceIn(req) // Don't actually do this
        msg, err = ep.Handler(req)
        go traceOut(req, msg, err, time.Since(start))
        return msg, err
    }

  42. Phosphor
    Host Instances
    Service A: Trace Library (goroutine, chan, UDP)
    Service B: Trace Library (goroutine, chan, UDP)
    Publish
    Trace Service
    In-memory Aggregates
    Optional persistent storage
    Dashboards, Monitoring

  44. var traceChan chan []byte

    func init() {
        // Use a buffered channel
        traceChan = make(chan []byte, 200)

        // Fire off a background worker for this channel
        defaultClient = NewClient(traceChan)
        go defaultClient.publisher()
    }

    // Send enqueues a trace, dropping it if the backend is at capacity
    func Send(msg []byte) {
        select {
        case traceChan <- msg:
            // Success
        default:
            // Default case fired if channel is full
            // Ensures this is non blocking
        }
    }
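
The drop-on-full behaviour of the `select`/`default` pattern above is easy to see in isolation. The following is a self-contained sketch rather than the trace library itself: `Tracer` and `NewTracer` are hypothetical names, and `Send` returns a bool purely so the drop is observable.

```go
package main

import "fmt"

// Tracer owns a buffered channel of pending trace messages
// (hypothetical type for this sketch).
type Tracer struct {
	ch chan []byte
}

// NewTracer creates a Tracer whose buffer holds up to size messages.
func NewTracer(size int) *Tracer {
	return &Tracer{ch: make(chan []byte, size)}
}

// Send enqueues msg without ever blocking the caller. It reports
// false when the buffer is full and the message was dropped.
func (t *Tracer) Send(msg []byte) bool {
	select {
	case t.ch <- msg:
		return true
	default:
		return false
	}
}

func main() {
	tracer := NewTracer(2) // no consumer is running, so the buffer fills up
	for i := 0; i < 3; i++ {
		fmt.Println(tracer.Send([]byte("trace")))
	}
	// Prints: true, true, false
}
```

The trade-off is deliberate: losing a trace under load is preferable to slowing down the request that produced it.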

  45. Phosphor
    Trace Service
    Host Instances
    Service A: Trace Library (goroutine, chan, UDP)
    Service B: Trace Library (goroutine, chan, UDP)
    Publish
    In-memory Aggregates
    Optional persistent storage
    Dashboards, Monitoring

  46. func (w *worker) loop() {
        var b []byte

        // Spin and forward on traces every time our
        // buffer fills, or when our time window elapses
        for {
            select {
            case b = <-w.ch:
                w.buf = append(w.buf, b)
                if len(w.buf) >= bufferSize {
                    w.send()
                }
            case <-time.Tick(bufferWindow):
                w.send()
            }
        }
    }

  47. func (w *worker) loop() {
        var b []byte

        // Spin and forward on traces every time our
        // buffer fills, or when our time window elapses
        for {
            select {
            case b = <-w.ch:
                w.buf = append(w.buf, b)
                if len(w.buf) >= bufferSize {
                    w.send()
                }
            case <-time.Tick(bufferWindow): // Leaks goroutines
                w.send()
            }
        }
    }

  48. func (w *worker) loop() {
        var b []byte
        timeout := time.Tick(bufferWindow)

        // Spin and forward on traces every time our
        // buffer fills, or when our time window elapses
        for {
            select {
            case b = <-w.ch:
                w.buf = append(w.buf, b)
                if len(w.buf) >= bufferSize {
                    w.send()
                }
            case <-timeout:
                w.send()
            }
        }
    }

  49. func (w *worker) loop() {
        var b []byte
        timeout := time.NewTicker(bufferWindow)
        defer timeout.Stop() // Bonus points

        // Spin and forward on traces every time our
        // buffer fills, or when our time window elapses
        for {
            select {
            case b = <-w.ch:
                w.buf = append(w.buf, b)
                if len(w.buf) >= bufferSize {
                    w.send()
                }
            case <-timeout.C:
                w.send()
            }
        }
    }
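
The loop above never returns, so the deferred `Stop` only matters once there is a way to exit. A runnable variation with a shutdown channel might look like the sketch below. The `batcher` type, its fields, and the `done` channel are assumptions added for illustration and are not part of the code on the slides.

```go
package main

import (
	"fmt"
	"time"
)

// batcher collects messages and flushes them in batches
// (hypothetical type for this sketch).
type batcher struct {
	in     chan []byte
	size   int             // flush when this many messages are buffered
	window time.Duration   // flush at least this often
	flush  func([][]byte) // receives each completed batch
}

// run forwards batches whenever the buffer fills or the window elapses,
// and flushes whatever is left when done is closed.
func (b *batcher) run(done <-chan struct{}) {
	ticker := time.NewTicker(b.window) // one ticker for the whole loop
	defer ticker.Stop()

	var buf [][]byte
	send := func() {
		if len(buf) == 0 {
			return
		}
		b.flush(buf)
		buf = nil
	}

	for {
		select {
		case msg := <-b.in:
			buf = append(buf, msg)
			if len(buf) >= b.size {
				send()
			}
		case <-ticker.C:
			send()
		case <-done:
			send()
			return
		}
	}
}

func main() {
	batches := make(chan int, 10)
	b := &batcher{
		in:     make(chan []byte),
		size:   2,
		window: time.Second,
		flush:  func(batch [][]byte) { batches <- len(batch) },
	}

	done := make(chan struct{})
	finished := make(chan struct{})
	go func() {
		b.run(done)
		close(finished)
	}()

	for i := 0; i < 3; i++ {
		b.in <- []byte("trace")
	}
	close(done)
	<-finished
	close(batches)

	for n := range batches {
		fmt.Println("flushed batch of", n)
	}
}
```

Creating the ticker once outside the loop is the same fix the slides arrive at; the `done` case simply gives `defer ticker.Stop()` a chance to run.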

  50. Tracing: 33eda743-f124-435c-71fc-3c872bbc98e6
    2014-09-07 02:20:19.867 [/] [START] → -
    2014-09-07 02:20:19.867 [eu-west-1a/ip-10-11-3-51] [REQ] com.hailocab.hailo-2-api → com.hailocab.api.v1.customer.neardrivers -
    2014-09-07 02:20:19.867 [eu-west-1a/ip-10-11-2-203] [IN] com.hailocab.hailo-2-api → com.hailocab.api.v1.customer.neardrivers -
    2014-09-07 02:20:19.868 [eu-west-1a/ip-10-11-2-203] [REQ] com.hailocab.api.v1.customer → com.hailocab.service.feature-flags.features -
    2014-09-07 02:20:19.869 [eu-west-1a/ip-10-11-3-111] [IN] com.hailocab.api.v1.customer → com.hailocab.service.feature-flags.features -
    2014-09-07 02:20:19.876 [eu-west-1a/ip-10-11-3-111] [REQ] com.hailocab.service.feature-flags → com.hailocab.service.hob.list -
    2014-09-07 02:20:19.877 [eu-west-1a/ip-10-11-3-168] [IN] com.hailocab.service.hob → com.hailocab.service.config.compile -
    2014-09-07 02:20:19.877 [eu-west-1a/ip-10-11-3-111] [IN] com.hailocab.service.feature-flags → com.hailocab.service.hob.list -
    2014-09-07 02:20:19.877 [eu-west-1a/ip-10-11-3-111] [REQ] com.hailocab.service.hob → com.hailocab.service.config.compile -
    2014-09-07 02:20:19.883 [eu-west-1a/ip-10-11-3-168] [OUT] com.hailocab.service.hob → com.hailocab.service.config.compile - 5.59 ms
    2014-09-07 02:20:19.886 [eu-west-1a/ip-10-11-3-111] [REP] com.hailocab.service.hob → com.hailocab.service.config.compile - 8.40 ms
    2014-09-07 02:20:19.887 [eu-west-1a/ip-10-11-3-111] [OUT] com.hailocab.service.feature-flags → com.hailocab.service.hob.list - 9.72 ms
    2014-09-07 02:20:19.889 [eu-west-1a/ip-10-11-3-111] [REP] com.hailocab.service.feature-flags → com.hailocab.service.hob.list - 13.23 ms
    2014-09-07 02:20:19.889 [eu-west-1a/ip-10-11-3-111] [OUT] com.hailocab.api.v1.customer → com.hailocab.service.feature-flags.features - 20.58 ms
    2014-09-07 02:20:19.890 [eu-west-1a/ip-10-11-2-203] [REP] com.hailocab.api.v1.customer → com.hailocab.service.feature-flags.features - 22.59 ms
    2014-09-07 02:20:19.902 [eu-west-1a/ip-10-11-2-203] [REQ] com.hailocab.api.v1.customer → com.hailocab.service.fare.basefare -
    2014-09-07 02:20:19.903 [eu-west-1a/ip-10-11-2-203] [REQ] com.hailocab.api.v1.customer → com.hailocab.service.fare.basefare -
    2014-09-07 02:20:19.903 [eu-west-1a/ip-10-11-2-203] [REQ] com.hailocab.api.v1.customer → com.hailocab.service.fare.basefare -
    2014-09-07 02:20:19.904 [eu-west-1a/ip-10-11-3-111] [IN] com.hailocab.api.v1.customer → com.hailocab.service.fare.basefare -
    2014-09-07 02:20:19.904 [eu-west-1a/ip-10-11-3-111] [OUT] com.hailocab.api.v1.customer → com.hailocab.service.fare.basefare - 0.36 ms
    2014-09-07 02:20:19.905 [eu-west-1a/ip-10-11-2-203] [REP] com.hailocab.api.v1.customer → com.hailocab.service.fare.basefare - 1.97 ms
    2014-09-07 02:20:19.905 [eu-west-1a/ip-10-11-2-214] [IN] com.hailocab.api.v1.customer → com.hailocab.service.fare.basefare -
    2014-09-07 02:20:19.905 [eu-west-1a/ip-10-11-2-203] [REQ] com.hailocab.api.v1.customer → com.hailocab.service.nearest-driver.search -
    2014-09-07 02:20:19.905 [eu-west-1a/ip-10-11-2-214] [OUT] com.hailocab.api.v1.customer → com.hailocab.service.fare.basefare - 0.10 ms ERR - com.hailocab.service.fare.basefare: Missing config at xxx
    2014-09-07 02:20:19.906 [eu-west-1a/ip-10-11-2-214] [IN] com.hailocab.api.v1.customer → com.hailocab.service.fare.basefare -
    2014-09-07 02:20:19.906 [eu-west-1a/ip-10-11-2-214] [OUT] com.hailocab.api.v1.customer → com.hailocab.service.fare.basefare - 0.06 ms ERR - com.hailocab.service.fare.basefare: Missing config at xxx
    2014-09-07 02:20:19.907 [eu-west-1a/ip-10-11-3-58] [IN] com.hailocab.api.v1.customer → com.hailocab.service.nearest-driver.search -
    2014-09-07 02:20:19.907 [eu-west-1a/ip-10-11-3-58] [REQ] com.hailocab.service.nearest-driver → com.hailocab.service.zoning.search -
    2014-09-07 02:20:19.908 [eu-west-1a/ip-10-11-3-58] [IN] com.hailocab.service.nearest-driver → com.hailocab.service.zoning.search -
    2014-09-07 02:20:19.908 [eu-west-1a/ip-10-11-3-58] [OUT] com.hailocab.service.nearest-driver → com.hailocab.service.zoning.search - 0.20 ms
    2014-09-07 02:20:19.909 [eu-west-1a/ip-10-11-3-58] [REP] com.hailocab.service.nearest-driver → com.hailocab.service.zoning.search - 2.25 ms
    2014-09-07 02:20:19.909 [eu-west-1a/ip-10-11-3-58] [REQ] com.hailocab.service.nearest-driver → com.hailocab.service.raziel.multisearch -
    2014-09-07 02:20:19.912 [eu-west-1a/ip-10-11-3-227] [IN] com.hailocab.service.nearest-driver → com.hailocab.service.raziel.multisearch -
    2014-09-07 02:20:19.919 [eu-west-1a/ip-10-11-3-58] [REP] com.hailocab.service.nearest-driver → com.hailocab.service.raziel.multisearch - 9.46 ms
    2014-09-07 02:20:19.919 [eu-west-1a/ip-10-11-3-58] [REQ] com.hailocab.service.nearest-driver → com.hailocab.service.eta.multitraveltime -
    2014-09-07 02:20:19.919 [eu-west-1a/ip-10-11-3-227] [OUT] com.hailocab.service.nearest-driver → com.hailocab.service.raziel.multisearch - 7.58 ms
    2014-09-07 02:20:19.920 [eu-west-1a/ip-10-11-3-58] [IN] com.hailocab.service.nearest-driver → com.hailocab.service.eta.multitraveltime -
    2014-09-07 02:20:19.920 [eu-west-1a/ip-10-11-3-58] [OUT] com.hailocab.service.nearest-driver → com.hailocab.service.eta.multitraveltime - 0.06 ms
    2014-09-07 02:20:19.921 [eu-west-1a/ip-10-11-3-58] [REP] com.hailocab.service.nearest-driver → com.hailocab.service.eta.multitraveltime - 1.77 ms
    2014-09-07 02:20:19.921 [eu-west-1a/ip-10-11-3-58] [OUT] com.hailocab.api.v1.customer → com.hailocab.service.nearest-driver.search - 14.02 ms
    2014-09-07 02:20:19.921 [eu-west-1a/ip-10-11-2-203] [REP] com.hailocab.api.v1.customer → com.hailocab.service.nearest-driver.search - 15.48 ms
    2014-09-07 02:20:19.941 [eu-west-1a/ip-10-11-2-203] [REQ] com.hailocab.api.v1.customer → com.hailocab.service.experiment.readlastupdated -
    2014-09-07 02:20:19.945 [eu-west-1a/ip-10-11-2-214] [IN] com.hailocab.api.v1.customer → com.hailocab.service.experiment.readlastupdated -
    2014-09-07 02:20:19.947 [eu-west-1a/ip-10-11-2-214] [OUT] com.hailocab.api.v1.customer → com.hailocab.service.experiment.readlastupdated - 1.82 ms
    2014-09-07 02:20:19.947 [eu-west-1a/ip-10-11-2-203] [REP] com.hailocab.api.v1.customer → com.hailocab.service.experiment.readlastupdated - 6.01 ms
    2014-09-07 02:20:19.948 [eu-west-1a/ip-10-11-2-203] [OUT] com.hailocab.hailo-2-api → com.hailocab.api.v1.customer.neardrivers - 80.46 ms
    2014-09-07 02:20:19.950 [eu-west-1a/ip-10-11-3-51] [REP] com.hailocab.hailo-2-api → com.hailocab.api.v1.customer.neardrivers - 82.71 ms

  55. SO, GO?

  56. SMALL, SIMPLE
    EASY TO READ & LEARN
    CONCURRENCY
    DEPLOYMENT

  57. STILL LEARNING
    3RD PARTY LIBRARY SUPPORT
    PACKAGE MANAGEMENT

  58. NEXT?

  59. TOOLING, TESTING,
    AUTOMATION,
    SCHEDULING, DOCKER,
    KUBERNETES

  61. THANKS!
    PS. We’re hiring!
    Go London User Group, Sep 2014

  62. IMAGE CREDITS
    HMS President: Chris Wainwright
    Clouds: sweetydarkdream.deviantart.com
    Orbital Ion Cannon: www.rom.ac
    Go Gophers: Renee French