The Once and Future Layer 5:
Twitter-style Microservices at Scale with Finagle and linkerd #🎩💪✨

Oliver Gould

June 30, 2016

Transcript

  1. The Once and Future Layer 5:
     Twitter-style Microservices at Scale with Finagle and linkerd #✨
     oliver gould, cto, buoyant
     SF Microservices @ Google SF, June 30 2016
  2. oliver gould
     • founding cto @ buoyant: open-source microservice infrastructure
     • previously, tech lead @ twitter: observability, traffic
     • core contributor: finagle
     • creator: linkerd
     • likes: dogs
     • dislikes: being woken up by a pager
     @olix0r
     [email protected]
  3. overview
     • 2010: A Failwhale Odyssey
     • Microservices
     • Finagle: The Once and Future Layer 5
     • Introducing linkerd
     • Demo
     • Q&A
  4. Twitter, 2010
     10^7 users, 10^7 tweets/day, 10^2 engineers, 10^1 services,
     10^1 deploys/week, 10^2 hosts, 0 datacenters, 10^1 user-facing outages/week
     https://blog.twitter.com/2010/measuring-tweets
  5. "Resilience is an imperative: our software runs on the truly dismal
     computers we call datacenters. Besides being heinously complex… they are
     unreliable and prone to operator error."
     Marius Eriksen @marius, RPC Redux
  6. resilience in microservices
     software you didn’t write, hardware you can’t touch, network you can’t configure
     break in new and surprising ways
     and your customers shouldn’t notice
  7. the seven layers
     datacenter: [1] physical, [2] link, [3] network, [4] transport
         aws, azure, digitalocean, gce, …
         kubernetes, canal, weave, …
     rpc: [5] session, [6] presentation
         http/2, mux, …
         json, protobuf, thrift, …
     business: [7] application
         languages, libraries
  8. programming finagle

     val users = Thrift.newIface[UserSvc]("/s/users")
     val timelines = Thrift.newIface[TimelineSvc]("/s/timeline")

     Http.serve(":8080", Service.mk[Request, Response] { req =>
       for {
         user <- users.get(userReq(req))
         timeline <- timelines.get(user)
       } yield renderHTML(user, timeline)
     })
  9. your server is a function

     trait Service[Req, Rsp] {
       def apply(req: Req): Future[Rsp]
       def close(deadline: Time): Future[Unit]
     }
  10. your server is a function

      trait ServiceFactory[Req, Rsp] {
        def apply(conn: ClientConnection): Future[Service[Req, Rsp]]
        def close(deadline: Time): Future[Unit]
      }
  11. your server is a function

      trait Filter[InReq, OutRsp, OutReq, InRsp] {
        // Transform a request, call the downstream service, transform its response.
        def apply(req: InReq, service: Service[OutReq, InRsp]): Future[OutRsp]

        // Composition keeps this filter's outer types (InReq, OutRsp).
        def andThen[A, B](f: Filter[OutReq, InRsp, A, B]): Filter[InReq, OutRsp, A, B]
        def andThen(s: Service[OutReq, InRsp]): Service[InReq, OutRsp]
        def andThen(sf: ServiceFactory[OutReq, InRsp]): ServiceFactory[InReq, OutRsp]
      }
  12. your server is a function

      val service: Service[http.Request, http.Response] =
        recordHandletime andThen
        traceRequest andThen
        logRequest andThen
        timeouts andThen
        myService

      val server: ListeningServer = Http.serve(":8080", service)

      val client: ServiceFactory[http.Request, http.Response] =
        retries andThen Http.newClient("127.1:8080")
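
To make the composition on slides 11 and 12 concrete, here is a minimal sketch of how `andThen` can wrap a service. It assumes simplified stand-ins for `Service` and `Filter` built on `scala.concurrent.Future` rather than the Finagle types, and the names `parseInt`, `addOne`, and `double` are hypothetical.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Simplified stand-ins for the shapes on the slides (NOT the Finagle types).
trait Service[Req, Rsp] { def apply(req: Req): Future[Rsp] }

trait Filter[InReq, OutRsp, OutReq, InRsp] { self =>
  def apply(req: InReq, service: Service[OutReq, InRsp]): Future[OutRsp]

  // Filter andThen Filter: still needs a Service[A, B] to terminate the chain.
  def andThen[A, B](next: Filter[OutReq, InRsp, A, B]): Filter[InReq, OutRsp, A, B] =
    new Filter[InReq, OutRsp, A, B] {
      def apply(req: InReq, service: Service[A, B]): Future[OutRsp] =
        self.apply(req, next.andThen(service))
    }

  // Filter andThen Service: the filter wraps the service and yields a new Service.
  def andThen(service: Service[OutReq, InRsp]): Service[InReq, OutRsp] =
    new Service[InReq, OutRsp] {
      def apply(req: InReq): Future[OutRsp] = self.apply(req, service)
    }
}

implicit val ec: ExecutionContext = ExecutionContext.global

// Hypothetical filter that changes both the request and the response type.
val parseInt = new Filter[String, String, Int, Int] {
  def apply(req: String, service: Service[Int, Int]): Future[String] =
    service(req.trim.toInt).map(n => s"result=$n")
}

// Hypothetical filter that decorates the response on the way back.
val addOne = new Filter[Int, Int, Int, Int] {
  def apply(req: Int, service: Service[Int, Int]): Future[Int] =
    service(req).map(_ + 1)
}

// Hypothetical terminal service.
val double = new Service[Int, Int] {
  def apply(req: Int): Future[Int] = Future.successful(req * 2)
}

// Requests flow left to right; responses flow back right to left.
val composed: Service[String, String] = parseInt andThen addOne andThen double
val out = Await.result(composed(" 20 "), 1.second)
```

The composed value is again a plain `Service`, which is why a stack of filters can be handed to a server or client anywhere a bare service is expected.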
  13. operating finagle
      transport security, service discovery, circuit breaking, backpressure,
      deadlines, retries, tracing, metrics, keep-alive, multiplexing,
      load balancing, per-request routing, service-level objectives
      Observe, Session timeout, Retries, Request draining, Load balancer,
      Monitor, Observe, Trace, Failure accrual, Request timeout, Pool,
      Fail fast, Expiration, Dispatcher
  14. layer 5 naming
      applications refer to logical names: /s/users
      requests are bound to concrete names: /#/io.l5d.zk/prod/users/http
      delegations express routing: /s => /#/io.l5d.zk/prod/http
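
The delegation idea on slide 14 can be sketched as prefix rewriting over path segments. This is a simplified model, not linkerd's resolver: `Dentry`, `segments`, `bind`, and the `maxDepth` guard are hypothetical names, and the sketch only does plain prefix substitution, so it does not model how a namer interprets or orders the trailing segments of a concrete name.

```scala
// A delegation rewrites any path under `prefix` so that it sits under `dest`.
final case class Dentry(prefix: String, dest: String)

// Split "/s/users" into Seq("s", "users").
def segments(path: String): Seq[String] =
  path.split('/').filter(_.nonEmpty).toSeq

// Rewrite `name` with the last matching rule, then try again on the result.
// maxDepth guards against cyclic rules (an assumption of this sketch).
def bind(dtab: Seq[Dentry], name: String, maxDepth: Int = 16): String = {
  val path = segments(name)
  val hit = dtab.reverse.find(d => path.startsWith(segments(d.prefix)))
  hit match {
    case Some(d) if maxDepth > 0 =>
      val rest = path.drop(segments(d.prefix).length)
      bind(dtab, (segments(d.dest) ++ rest).mkString("/", "/", ""), maxDepth - 1)
    case _ => name // no rule applies: the name is left as it is
  }
}

val dtab = Seq(Dentry("/s", "/#/io.l5d.zk/prod/http"))
val bound = bind(dtab, "/s/users")
```

Under this model the application only ever names `/s/users`; changing where that traffic goes means changing the delegation, not the application.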
  15. “It’s slow” is the hardest problem you’ll ever debug.
      Jeff Hodges @jmhodges, Notes on Distributed Systems for Young Bloods
  16. load balancing at layer 5
      lb algorithms:
      • round-robin
      • fewest connections
      • queue depth
      • exponentially-weighted moving average (ewma)
      • aperture
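
As a rough illustration of the ewma entry in the list above, the sketch below keeps a moving average of observed latency per endpoint and picks the cheapest one. It assumes a fixed smoothing factor `alpha` and a cost of average latency times outstanding requests plus one; the balancer in Finagle is more involved (for example, it decays by elapsed time), and `Ewma`, `Endpoint`, and `pick` are hypothetical names.

```scala
// Exponentially-weighted moving average of observed latency for one endpoint.
// alpha in (0, 1]: larger values weight recent samples more heavily.
final case class Ewma(value: Double, alpha: Double) {
  def observe(sample: Double): Ewma =
    copy(value = alpha * sample + (1 - alpha) * value)
}

// Cost combines the latency estimate with the current load on the endpoint.
final case class Endpoint(name: String, latency: Ewma, outstanding: Int) {
  def cost: Double = latency.value * (outstanding + 1)
}

// Choose the endpoint with the lowest cost.
def pick(endpoints: Seq[Endpoint]): Endpoint = endpoints.minBy(_.cost)

// Endpoint "a" just saw a slow 30ms response, so its average moves from 10 to 20.
val a = Endpoint("a", Ewma(10.0, 0.5).observe(30.0), outstanding = 0)
val b = Endpoint("b", Ewma(12.0, 0.5), outstanding = 0)
val chosen = pick(Seq(a, b))
```

Because the balancer sits at layer 5 it can measure per-request latency, which a connection-level balancer does not see.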
  17. timeouts & retries
      [diagram: timelines, users, web, db]
      timeout=400ms retries=3
      timeout=400ms retries=2
      timeout=200ms retries=3
  18. timeouts & retries
      [diagram: timelines, users, web, db]
      timeout=400ms retries=3
      timeout=400ms retries=2
      timeout=200ms retries=3
      800ms! 600ms!
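
The 800ms and 600ms figures can be reproduced with a little arithmetic. The sketch below assumes the call chain is web → timelines → users → db, that the three settings apply to those hops in order, and that `retries=N` means N attempts in total; all three are readings of the slide, not stated on it. `Hop` and `overruns` are hypothetical names.

```scala
// One hop in the call chain: the caller's per-attempt timeout and attempt count.
final case class Hop(caller: String, callee: String, timeoutMs: Int, attempts: Int) {
  // Worst case: every attempt runs all the way to its timeout.
  def worstCaseMs: Int = timeoutMs * attempts
}

val hops = Seq(
  Hop("web", "timelines", timeoutMs = 400, attempts = 3),
  Hop("timelines", "users", timeoutMs = 400, attempts = 2),
  Hop("users", "db", timeoutMs = 200, attempts = 3)
)

// For each adjacent pair, flag callers whose own downstream work can outlast
// the timeout that their upstream caller has already given them.
val overruns: Seq[(String, Int, Int)] =
  hops.zip(hops.tail).collect {
    case (up, down) if down.worstCaseMs > up.timeoutMs =>
      (down.caller, down.worstCaseMs, up.timeoutMs)
  }
```

Each hop's settings look reasonable in isolation; the mismatch only shows up when they are multiplied out along the chain, which is the motivation for propagating a deadline with the request instead of configuring independent timeouts.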
  19. magic ops sprinkles
      transport security, service discovery, circuit breaking, backpressure,
      deadlines, retries, tracing, metrics, keep-alive, multiplexing,
      load balancing, per-request routing, service-level objectives
      Observe, Session timeout, Retries, Request draining, Load balancer,
      Monitor, Observe, Trace, Failure accrual, Request timeout, Pool,
      Fail fast, Expiration, Dispatcher
  20. github.com/buoyantio/linkerd
      microservice rpc proxy
      layer-5 router, aka l5d
      built on finagle & netty
      pluggable: http, thrift, …; consul, etcd, k8s, marathon, zk, …
  21. magic operability sprinkles
      transport security, service discovery, circuit breaking, backpressure,
      deadlines, retries, tracing, metrics, keep-alive, multiplexing,
      load balancing, per-request routing, service-level objectives
      [diagram: Service A instance + linkerd, Service B instance + linkerd, Service C instance + linkerd]
  22. namerd
      service discovery service
      delegates logical names to service discovery
      centralized routing policy
      pluggable: consul, etcd, k8s, zk, …
  23. [diagram: hosts running instances labeled app: a and app: b; service-a]
  24. linkerd roadmap
      • Netty4.1 (in upcoming 0.7.1)
      • HTTP/2+gRPC linkerd#174
      • Deadlines (in progress)
      • TLS client certs, SPIFFE
      • Dark Traffic
      • All configurable everything