Hedged requests in Go

Oleg Kovalov
September 20, 2023

Transcript

  1. Me
     - Open-source addicted gopher
     - Fan of linters (co-author of go-critic)
     - Father of a labrador
     - go-perf.dev
     - ...
     - olegk.dev
  2. Variability
     - Deploy
     - Packet drop
     - Cache eviction
     - Queue overflow
     - Connection pool
     - Background jobs
     - Garbage collection
     - Windows Update
     - …
  3. Tail at scale
     - Paper published by Googlers
     - Jeffrey Dean, Luiz André Barroso
     - 2013
     - More than just hedged requests
     - See: https://www.barroso.org/publications/TheTailAtScale.pdf
     - You are not Google: https://blog.bradfieldcs.com/you-are-not-google-84912cf44afb
     - Tail at scale != Tailscale
  6. Existing implementations
     - GitHub was empty!
     - Well, there were a few proofs of concept
     - But nothing production-ready
     - At least, nothing I was ready to pick up
  8. cristalhq/hedgedhttp
     - Dependency-free
     - Perfectly aligns with net/http
     - Optimized for speed
     - Battle-tested
  9. Example

         delay := 10 * time.Millisecond
         upto := 7
         client := &http.Client{Timeout: time.Second}

         hedged, err := hedgedhttp.NewClient(delay, upto, client)
         if err != nil {
             panic(err)
         }

         // sends up to `upto` requests, waiting `delay` between them
         resp, err := hedged.Do(req)
         if err != nil {
             panic(err)
         }
         defer resp.Body.Close()
  13. Modern example

          cfg := hedgedhttp.Config{
              Transport: http.DefaultTransport,
              Upto:      3,
              Delay:     10 * time.Millisecond,
              Next: func() (upto int, delay time.Duration) {
                  return magic()
              },
          }

          client, err := hedgedhttp.New(cfg)
          if err != nil {
              panic(err)
          }

          resp, err := client.Do(req)
          if err != nil {
              panic(err)
          }
          defer resp.Body.Close()
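
The slide leaves `magic()` undefined. One possible policy, sketched below, is to hedge only after the recently observed 95th-percentile latency, which is the kind of delay The Tail at Scale suggests. Everything here (`latencyWindow`, `Observe`, `percentile`, the fixed `upto` of 3 and the 10ms fallback) is a hypothetical illustration and not part of hedgedhttp; only the `Next` signature is taken from the slide.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
	"time"
)

// latencyWindow remembers the most recent request latencies in a fixed-size
// ring buffer. It is a hypothetical helper, not part of hedgedhttp.
type latencyWindow struct {
	mu      sync.Mutex
	samples []time.Duration
	pos     int  // index of the next slot to overwrite
	filled  bool // true once the ring has wrapped around
}

// newLatencyWindow creates a window of the given size (must be positive).
func newLatencyWindow(size int) *latencyWindow {
	return &latencyWindow{samples: make([]time.Duration, size)}
}

// Observe records the latency of one completed request.
func (w *latencyWindow) Observe(d time.Duration) {
	w.mu.Lock()
	defer w.mu.Unlock()
	w.samples[w.pos] = d
	w.pos = (w.pos + 1) % len(w.samples)
	if w.pos == 0 {
		w.filled = true
	}
}

// percentile returns the pct-th percentile (nearest rank) of the recorded
// samples, or fallback if nothing has been recorded yet.
func (w *latencyWindow) percentile(pct int, fallback time.Duration) time.Duration {
	w.mu.Lock()
	n := w.pos
	if w.filled {
		n = len(w.samples)
	}
	sorted := append([]time.Duration(nil), w.samples[:n]...)
	w.mu.Unlock()

	if n == 0 {
		return fallback
	}
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	rank := (pct*n + 99) / 100 // integer form of ceil(pct/100 * n)
	if rank < 1 {
		rank = 1
	}
	return sorted[rank-1]
}

// Next has the same shape as the Next field on the slide:
// func() (upto int, delay time.Duration).
func (w *latencyWindow) Next() (upto int, delay time.Duration) {
	return 3, w.percentile(95, 10*time.Millisecond)
}

func main() {
	w := newLatencyWindow(100)
	for i := 1; i <= 100; i++ {
		w.Observe(time.Duration(i) * time.Millisecond)
	}
	fmt.Println(w.Next()) // prints: 3 95ms
}
```

Under these assumptions the method value `w.Next` could be assigned to the `Next` field of the config, with `Observe` called from wherever request latencies are already measured.
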
  16. hedgedhttp internals (1)

          func (ht *hedgedTransport) RoundTrip(req *http.Request) (*http.Response, error) {
              mainCtx := req.Context()

              upto, timeout := ht.upto, ht.timeout
              if ht.next != nil {
                  upto, timeout = ht.next()
              }

              // no hedged requests, just a regular one.
              if upto <= 0 {
                  return ht.rt.RoundTrip(req)
              }
              // rollback to default timeout.
              if timeout < 0 {
                  timeout = ht.timeout
              }
  17. hedgedhttp internals (2)

          for sent := 0; len(errOverall.Errors) < upto; sent++ {
              if sent < upto {
                  idx := sent
                  subReq, cancel := reqWithCtx(req, mainCtx, idx != 0)
                  cancels[idx] = cancel

                  runInPool(func() {
                      resp, err := ht.rt.RoundTrip(subReq)
                      if err != nil {
                          ht.metrics.failedRoundTripsInc()
                          errorCh <- err
                      } else {
                          resultCh <- indexedResp{idx, resp}
                      }
                  })
              }
  18. hedgedhttp internals (3)

              // all requests sent - effectively disabling timeout between requests
              if sent == upto {
                  timeout = infiniteTimeout
              }
              resp, err := waitResult(mainCtx, resultCh, errorCh, timeout)

              switch {
              case resp.Resp != nil:
                  resultIdx = resp.Index
                  return resp.Resp, nil
              case mainCtx.Err() != nil:
                  ht.metrics.canceledByUserRoundTripsInc()
                  return nil, mainCtx.Err()
              case err != nil:
                  errOverall.Errors = append(errOverall.Errors, err)
              }
  19. hedgedhttp internals (4)

          func waitResult(ctx context.Context, resultCh <-chan indexedResp,
              errorCh <-chan error, timeout time.Duration) (indexedResp, error) {
              // ...
              select {
              case res := <-resultCh:
                  return res, nil
              case reqErr := <-errorCh:
                  return indexedResp{}, reqErr
              case <-ctx.Done():
                  return indexedResp{}, ctx.Err()
              case <-timer.C:
                  return indexedResp{}, nil // timeout BETWEEN consecutive requests
              }
          }
  20. Cute goroutine pool

          var taskQueue = make(chan func())

          func runInPool(task func()) {
              select {
              case taskQueue <- task:
                  // submitted, everything is ok
              default:
                  go func() {
                      task()

                      cleanupTicker := time.NewTicker(cleanupDuration)
                      for {
                          select {
                          case t := <-taskQueue:
                              t()
                              cleanupTicker.Reset(cleanupDuration)
                          case <-cleanupTicker.C:
                              cleanupTicker.Stop()
                              return
                          }
                      }
                  }()
              }
          }
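
To try the pool from the slide in isolation, here is a minimal runnable sketch. The pool itself follows the slide; the `cleanupDuration` value and the `runAll` helper are assumptions added only for the demonstration. The idea is that a task is handed to an idle worker if one is already waiting on the unbuffered `taskQueue`, otherwise a new worker goroutine is started, and each worker exits after sitting idle for `cleanupDuration`.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// cleanupDuration is how long an idle worker waits for more work before it
// exits. The value is an assumption; the slide does not show it.
const cleanupDuration = 50 * time.Millisecond

// taskQueue is unbuffered: a send succeeds only if a worker is ready to receive.
var taskQueue = make(chan func())

func runInPool(task func()) {
	select {
	case taskQueue <- task:
		// an idle worker picked the task up
	default:
		// nobody is idle: start a new worker that runs the task
		// and then keeps serving the queue until it goes idle.
		go func() {
			task()

			cleanupTicker := time.NewTicker(cleanupDuration)
			for {
				select {
				case t := <-taskQueue:
					t()
					cleanupTicker.Reset(cleanupDuration)
				case <-cleanupTicker.C:
					cleanupTicker.Stop()
					return
				}
			}
		}()
	}
}

// runAll is a hypothetical demo helper: it submits n tasks through the pool,
// waits for all of them and reports how many actually ran.
func runAll(n int) int64 {
	var ran int64
	var wg sync.WaitGroup
	wg.Add(n)
	for i := 0; i < n; i++ {
		runInPool(func() {
			defer wg.Done()
			atomic.AddInt64(&ran, 1)
		})
	}
	wg.Wait()
	return ran
}

func main() {
	fmt.Println(runAll(100)) // prints: 100
}
```

Because `runInPool` never blocks the caller, a burst of submissions simply starts more workers, which then drain away once traffic stops.
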
  21. Bonus
      - Same pattern works for gRPC
      - And any other RPC (if your API is sane)
      - https://github.com/cristalhq/hedgedgrpc (waits for your ⭐)
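
To illustrate why the pattern is not tied to HTTP, here is a minimal transport-agnostic sketch using only the standard library. It is not the hedgedhttp or hedgedgrpc implementation: `hedge`, `result`, `slowThenFast` and `demo` are hypothetical names, and unlike the loop on the internals slides this version waits for the delay even after a failed attempt instead of sending the next one immediately. It assumes the call is safe to issue more than once (idempotent), which is what a "sane" API amounts to here.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// result carries the outcome of one attempt.
type result[T any] struct {
	val T
	err error
}

// hedge runs call up to `upto` times, starting a new attempt every `delay`
// until one succeeds, and returns the first successful value.
func hedge[T any](ctx context.Context, upto int, delay time.Duration,
	call func(ctx context.Context, attempt int) (T, error)) (T, error) {

	var zero T
	if upto < 1 {
		upto = 1
	}
	ctx, cancel := context.WithCancel(ctx)
	defer cancel() // cancels the losing attempts once we return

	// Buffered so that late attempts can always deliver and exit.
	results := make(chan result[T], upto)
	launch := func(attempt int) {
		go func() {
			val, err := call(ctx, attempt)
			results <- result[T]{val, err}
		}()
	}

	launch(0)
	sent, failed := 1, 0
	timer := time.NewTimer(delay)
	defer timer.Stop()

	for {
		// A nil channel blocks forever, which disables the timer
		// once every attempt has been sent.
		var hedgeNow <-chan time.Time
		if sent < upto {
			hedgeNow = timer.C
		}
		select {
		case r := <-results:
			if r.err == nil {
				return r.val, nil
			}
			failed++
			if failed == upto {
				return zero, fmt.Errorf("all %d attempts failed, last error: %w", upto, r.err)
			}
		case <-hedgeNow:
			launch(sent)
			sent++
			timer.Reset(delay)
		case <-ctx.Done():
			return zero, ctx.Err()
		}
	}
}

// slowThenFast simulates a backend whose first attempt hits a slow replica.
func slowThenFast(ctx context.Context, attempt int) (string, error) {
	wait := 5 * time.Millisecond
	if attempt == 0 {
		wait = 2 * time.Second
	}
	select {
	case <-time.After(wait):
		return fmt.Sprintf("attempt %d", attempt), nil
	case <-ctx.Done():
		return "", ctx.Err()
	}
}

func demo() (string, error) {
	return hedge(context.Background(), 2, 10*time.Millisecond, slowThenFast)
}

func main() {
	fmt.Println(demo()) // prints: attempt 1 <nil>
}
```

The first attempt is still in flight when the hedge wins; cancelling the shared context on return is what lets it stop early instead of running for the full two seconds.
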
  23. Google BigTable bench

      > For example, in a Google benchmark that reads the values for 1,000 keys stored in a BigTable table distributed across 100 different servers, sending a hedging request after a 10ms delay reduces the 99.9th-percentile latency for retrieving all 1,000 values from 1,800ms to 74ms while sending just 2% more requests.

      https://www.barroso.org/publications/TheTailAtScale.pdf
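
Part of why the benchmark above is so sensitive to stragglers is fan-out: a request that touches many servers is as slow as its slowest sub-request. The sketch below works through the arithmetic for the example used in the same paper, where each server is slow for 1 request in 100. It assumes sub-requests are slow independently of each other, which is a simplification; `slowFanOutProbability` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"math"
)

// slowFanOutProbability returns the chance that at least one of n parallel
// sub-requests is slow, if each one is independently slow with probability p.
func slowFanOutProbability(p float64, n int) float64 {
	return 1 - math.Pow(1-p, float64(n))
}

func main() {
	for _, n := range []int{1, 10, 100} {
		fmt.Printf("fan-out %3d: %.1f%%\n", n, 100*slowFanOutProbability(0.01, n))
	}
	// Output:
	// fan-out   1: 1.0%
	// fan-out  10: 9.6%
	// fan-out 100: 63.4%
}
```
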
  25. go-distsys
      - Reinventing a wheel
      - We should have better things
      - Everyone is welcome!
      - https://github.com/go-distsys
  26. Go-perf meetup #1
      - Go, performance, optimizations, concurrency
      - Have something to present?
      - See https://go-perf.dev/go-perf-meetup-1
  27. Conclusions
      - Latency is hard but manageable
      - Read papers
      - Do good stuff
      - No silver bullet (sad but true)
      - go-distsys & go-perf 🚀
  28. References
      - The Tail at Scale: https://www.barroso.org/publications/TheTailAtScale.pdf
      - cristalhq/hedgedhttp (⭐ and “go get”): https://github.com/cristalhq/hedgedhttp
      - go-distsys (click “Follow” & submit ideas!): https://github.com/go-distsys