Pro Yearly is on sale from $80 to $50! »

Observability in micro-service architectures

67f4a8f2a209a38d7242829947b26ba3?s=47 mattheath
October 10, 2014

Observability in micro-service architectures

Lightning talk at dotGo 2014.
Video available here: https://www.youtube.com/watch?v=VoFfhEbcEG0

Introduces how we implement distributed request tracing at Hailo within our platform composed from micro-services built almost entirely in Go.

Using Go's Channels and Selects allows us to forward traces when possible, but not at the expense of serving requests to our end users.

67f4a8f2a209a38d7242829947b26ba3?s=128

mattheath

October 10, 2014
Tweet

Transcript

  1. OBSERVABILITY IN MICROSERVICE ARCHITECTURES Matt Heath - Technical Lead, Platform

    @mattheath dotGo, Oct 2014
  2. None
  3. None
  4. Go
 Service Go
 Service Java Service Load Balancer C* C*

    C* API / Routing RabbitMQ Message Bus
 (federated clusters per AZ) us-east-1 eu-west-1 Go
 Service Go
 Service Java Service Load Balancer C* C* C* API / Routing RabbitMQ Message Bus
 (federated clusters per AZ)
  5. hailo~2~api api.v1.customer service.customer hailo~2~api api.v1.customer service.customer

  6. hailo~2~api api.v1.customer service.customer hailo~2~api api.v1.customer service.customer REQ REP REQ REP

    IN OUT IN OUT
  7. // Req sends a request, and... func (c *Client) Req(req

    *Request, rsp proto.Message, options ...Options) errors.Error { ! go c.traceReq(req) responseMsg, err := c.doReq(req, options...) go c.traceRsp(req, responseMsg, err) if err != nil { return err } ! // Do other things... ! return nil }
  8. // instrumentedHandler wraps the handler providing instrumentation func (ep *Endpoint)

    instrumentedHandler(req *Request) (proto.Message, errors.Error) { ! start := time.Now() var err errors.Error var msg proto.Message ! // Defer panic handling defer func() { stats.Record(ep, err, time.Since(start)) // oh crap i hope this never happens }() ! // Execute handler go traceIn(req) msg, err = ep.Handler(req) go traceOut(req, msg, err, time.Since(start)) return msg, err }
  9. hailo~2~api api.v1.customer service.customer REQ RSP REQ RSP IN OUT IN

    OUT NSQ
  10. Phosphor Host Instances Publish Service A Trace Library goroutine chan

    UDP Service B Trace Library goroutine chan UDP Trace Service In-memory Aggregates Optional
 persistant storage Dashboards Monitoring
  11. Phosphor Host Instances Publish Service A Trace Library goroutine chan

    UDP Service B Trace Library goroutine chan UDP Trace Service In-memory Aggregates Optional
 persistant storage Dashboards Monitoring
  12. var traceChan chan []byte ! func init() { // Use

    a buffered channel traceChan = make(chan []byte, 200) ! // Fire off a background worker for this channel defaultClient = NewClient(traceChan) go defaultClient.publisher() } ! // Send, drops trace if the backend is at capacity func Send(msg []byte) { select { case traceChan <- msg: // Success default: // Default case fired if channel is full // Ensures non blocking } }
  13. Phosphor Trace Service Host Instances Publish Service A Trace Library

    goroutine chan UDP Service B Trace Library goroutine chan UDP In-memory Aggregates Optional
 persistant storage Dashboards Monitoring
  14. func (w *worker) loop() { var b []byte timeout :=

    time.NewTicker(bufferWindow) defer timeout.Stop() // Bonus points ! // Spin and forward on traces every time our // buffer fills, or when our time window elapses for { select { case b = <-w.ch: w.buf = append(w.buf, b) if len(w.buf) >= bufferSize { w.send() } case <-timeout.C: w.send() } } }
  15. Tracing: 33eda743-f124-435c-71fc-3c872bbc98e6 ! 2014-09-07 02:20:19.867 [/] [START] → - 2014-09-07

    02:20:19.867 [eu-west-1a/ip-10-11-3-51] [REQ] com.hailocab.hailo-2-api → com.hailocab.api.v1.customer.neardrivers - 2014-09-07 02:20:19.867 [eu-west-1a/ip-10-11-2-203] [IN] com.hailocab.hailo-2-api → com.hailocab.api.v1.customer.neardrivers - 2014-09-07 02:20:19.868 [eu-west-1a/ip-10-11-2-203] [REQ] com.hailocab.api.v1.customer → com.hailocab.service.feature-flags.features - 2014-09-07 02:20:19.869 [eu-west-1a/ip-10-11-3-111] [IN] com.hailocab.api.v1.customer → com.hailocab.service.feature-flags.features - 2014-09-07 02:20:19.876 [eu-west-1a/ip-10-11-3-111] [REQ] com.hailocab.service.feature-flags → com.hailocab.service.hob.list - 2014-09-07 02:20:19.877 [eu-west-1a/ip-10-11-3-168] [IN] com.hailocab.service.hob → com.hailocab.service.config.compile - 2014-09-07 02:20:19.877 [eu-west-1a/ip-10-11-3-111] [IN] com.hailocab.service.feature-flags → com.hailocab.service.hob.list - 2014-09-07 02:20:19.877 [eu-west-1a/ip-10-11-3-111] [REQ] com.hailocab.service.hob → com.hailocab.service.config.compile - 2014-09-07 02:20:19.883 [eu-west-1a/ip-10-11-3-168] [OUT] com.hailocab.service.hob → com.hailocab.service.config.compile - 5.59 ms 2014-09-07 02:20:19.886 [eu-west-1a/ip-10-11-3-111] [REP] com.hailocab.service.hob → com.hailocab.service.config.compile - 8.40 ms 2014-09-07 02:20:19.887 [eu-west-1a/ip-10-11-3-111] [OUT] com.hailocab.service.feature-flags → com.hailocab.service.hob.list - 9.72 ms 2014-09-07 02:20:19.889 [eu-west-1a/ip-10-11-3-111] [REP] com.hailocab.service.feature-flags → com.hailocab.service.hob.list - 13.23 ms 2014-09-07 02:20:19.889 [eu-west-1a/ip-10-11-3-111] [OUT] com.hailocab.api.v1.customer → com.hailocab.service.feature-flags.features - 20.58 ms 2014-09-07 02:20:19.890 [eu-west-1a/ip-10-11-2-203] [REP] com.hailocab.api.v1.customer → com.hailocab.service.feature-flags.features - 22.59 ms 2014-09-07 02:20:19.902 [eu-west-1a/ip-10-11-2-203] [REQ] com.hailocab.api.v1.customer → com.hailocab.service.fare.basefare - 2014-09-07 02:20:19.903 [eu-west-1a/ip-10-11-2-203] [REQ] com.hailocab.api.v1.customer → com.hailocab.service.fare.basefare - 2014-09-07 02:20:19.903 [eu-west-1a/ip-10-11-2-203] [REQ] com.hailocab.api.v1.customer → com.hailocab.service.fare.basefare - 2014-09-07 02:20:19.904 [eu-west-1a/ip-10-11-3-111] [IN] com.hailocab.api.v1.customer → com.hailocab.service.fare.basefare - 2014-09-07 02:20:19.904 [eu-west-1a/ip-10-11-3-111] [OUT] com.hailocab.api.v1.customer → com.hailocab.service.fare.basefare - 0.36 ms 2014-09-07 02:20:19.905 [eu-west-1a/ip-10-11-2-203] [REP] com.hailocab.api.v1.customer → com.hailocab.service.fare.basefare - 1.97 ms 2014-09-07 02:20:19.905 [eu-west-1a/ip-10-11-2-214] [IN] com.hailocab.api.v1.customer → com.hailocab.service.fare.basefare - 2014-09-07 02:20:19.905 [eu-west-1a/ip-10-11-2-203] [REQ] com.hailocab.api.v1.customer → com.hailocab.service.nearest-driver.search - 2014-09-07 02:20:19.905 [eu-west-1a/ip-10-11-2-214] [OUT] com.hailocab.api.v1.customer → com.hailocab.service.fare.basefare - 0.10 ms
 ERR - com.hailocab.service.fare.basefare: Missing config at xxx 2014-09-07 02:20:19.906 [eu-west-1a/ip-10-11-2-214] [IN] com.hailocab.api.v1.customer → com.hailocab.service.fare.basefare - 2014-09-07 02:20:19.906 [eu-west-1a/ip-10-11-2-214] [OUT] com.hailocab.api.v1.customer → com.hailocab.service.fare.basefare - 0.06 ms 
 ERR - com.hailocab.service.fare.basefare: Missing config at xxx 2014-09-07 02:20:19.907 [eu-west-1a/ip-10-11-3-58] [IN] com.hailocab.api.v1.customer → com.hailocab.service.nearest-driver.search - 2014-09-07 02:20:19.907 [eu-west-1a/ip-10-11-3-58] [REQ] com.hailocab.service.nearest-driver → com.hailocab.service.zoning.search - 2014-09-07 02:20:19.908 [eu-west-1a/ip-10-11-3-58] [IN] com.hailocab.service.nearest-driver → com.hailocab.service.zoning.search - 2014-09-07 02:20:19.908 [eu-west-1a/ip-10-11-3-58] [OUT] com.hailocab.service.nearest-driver → com.hailocab.service.zoning.search - 0.20 ms 2014-09-07 02:20:19.909 [eu-west-1a/ip-10-11-3-58] [REP] com.hailocab.service.nearest-driver → com.hailocab.service.zoning.search - 2.25 ms 2014-09-07 02:20:19.909 [eu-west-1a/ip-10-11-3-58] [REQ] com.hailocab.service.nearest-driver → com.hailocab.service.raziel.multisearch - 2014-09-07 02:20:19.912 [eu-west-1a/ip-10-11-3-227] [IN] com.hailocab.service.nearest-driver → com.hailocab.service.raziel.multisearch - 2014-09-07 02:20:19.919 [eu-west-1a/ip-10-11-3-58] [REP] com.hailocab.service.nearest-driver → com.hailocab.service.raziel.multisearch - 9.46 ms 2014-09-07 02:20:19.919 [eu-west-1a/ip-10-11-3-58] [REQ] com.hailocab.service.nearest-driver → com.hailocab.service.eta.multitraveltime - 2014-09-07 02:20:19.919 [eu-west-1a/ip-10-11-3-227] [OUT] com.hailocab.service.nearest-driver → com.hailocab.service.raziel.multisearch - 7.58 ms 2014-09-07 02:20:19.920 [eu-west-1a/ip-10-11-3-58] [IN] com.hailocab.service.nearest-driver → com.hailocab.service.eta.multitraveltime - 2014-09-07 02:20:19.920 [eu-west-1a/ip-10-11-3-58] [OUT] com.hailocab.service.nearest-driver → com.hailocab.service.eta.multitraveltime - 0.06 ms 2014-09-07 02:20:19.921 [eu-west-1a/ip-10-11-3-58] [REP] com.hailocab.service.nearest-driver → com.hailocab.service.eta.multitraveltime - 1.77 ms 2014-09-07 02:20:19.921 [eu-west-1a/ip-10-11-3-58] [OUT] com.hailocab.api.v1.customer → com.hailocab.service.nearest-driver.search - 14.02 ms 2014-09-07 02:20:19.921 [eu-west-1a/ip-10-11-2-203] [REP] com.hailocab.api.v1.customer → com.hailocab.service.nearest-driver.search - 15.48 ms 2014-09-07 02:20:19.941 [eu-west-1a/ip-10-11-2-203] [REQ] com.hailocab.api.v1.customer → com.hailocab.service.experiment.readlastupdated - 2014-09-07 02:20:19.945 [eu-west-1a/ip-10-11-2-214] [IN] com.hailocab.api.v1.customer → com.hailocab.service.experiment.readlastupdated - 2014-09-07 02:20:19.947 [eu-west-1a/ip-10-11-2-214] [OUT] com.hailocab.api.v1.customer → com.hailocab.service.experiment.readlastupdated - 1.82 ms 2014-09-07 02:20:19.947 [eu-west-1a/ip-10-11-2-203] [REP] com.hailocab.api.v1.customer → com.hailocab.service.experiment.readlastupdated - 6.01 ms 2014-09-07 02:20:19.948 [eu-west-1a/ip-10-11-2-203] [OUT] com.hailocab.hailo-2-api → com.hailocab.api.v1.customer.neardrivers - 80.46 ms 2014-09-07 02:20:19.950 [eu-west-1a/ip-10-11-3-51] [REP] com.hailocab.hailo-2-api → com.hailocab.api.v1.customer.neardrivers - 82.71 ms
  16. None
  17. 2013

  18. 2014

  19. 2014

  20. THANKS PS. We’re hiring! dotGo, Oct 2014