
CPEN 431 Final Presentation

Tony Li
April 27, 2016

The final summary presentation of the project details for "Design of Distributed Software Applications".

There were lots of challenges and plenty of fun in this course. According to RescueTime (https://goo.gl/keiw1p), I spent over 300 hours programming from January to April.

Code is at: https://github.com/tonglil/CPEN-431-2015.

Transcript

  1. Architecture
     • Leverage Go's concurrency primitives
       ◦ Goroutines (like threads, but lighter, faster, safer), channels (thread-safe FIFO queues), and selects (a switch that waits on n channel operations, blocking until a case can run)
     • Heavy lifting done on initialization
       ◦ Store, cache, and worker creation; cluster initialization & synchronization
       ◦ Send-socket creation (flame-graph-ed optimization)
     • One lightweight goroutine listening for inbound UDP requests
       ◦ Loops, reads bytes, and pushes each buffer into a jobs channel
     • N heavyweight worker goroutines (sketched after this slide)
       ◦ Loop, pulling a request from the jobs channel
       ◦ Handle forwarding, replication, local storage interactions, and caching
         ▪ Responsible for sending any requests they need to
     • Interesting?
       ◦ Cache expiration is done on the GET() call and only iterates every 5 s (flame-graph-ed optimization)
       ◦ Cluster membership is delegated to an open source library (hashicorp/memberlist)
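
     A minimal Go sketch of the listener/worker split described above; the channel capacity, worker count, port, and handle function are illustrative assumptions, not the repository's actual names.

       package main

       import (
           "log"
           "net"
       )

       // job carries one raw request plus the client address to reply to.
       type job struct {
           buf  []byte
           addr *net.UDPAddr
       }

       func handle(j job) { /* forwarding, replication, storage, caching elided */ }

       func main() {
           conn, err := net.ListenUDP("udp", &net.UDPAddr{Port: 8080})
           if err != nil {
               log.Fatal(err)
           }
           jobs := make(chan job, 1024) // buffered jobs channel

           // N heavyweight workers pull requests off the jobs channel.
           for i := 0; i < 16; i++ {
               go func() {
                   for j := range jobs {
                       handle(j)
                   }
               }()
           }

           // One lightweight listener: read bytes, push the buffer into the channel.
           for {
               buf := make([]byte, 16*1024)
               n, addr, err := conn.ReadFromUDP(buf)
               if err != nil {
                   continue
               }
               jobs <- job{buf: buf[:n], addr: addr}
           }
       }
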
  2. Replication Approach
     • PUT/REMOVE: identical to Dynamo
     • GET (fan-out sketched after this slide)
       ◦ The receiving node sends GET_REP to all replicas (three), with responses routed directly to the client
       ◦ The client accepts only the first response (duplicates are dropped by UID)
     • Advantages
       ◦ Superficially lowers GET latency
       ◦ Simple to implement
         ▪ Less state to maintain - no tracking of average node latency, etc.
     • Disadvantages
       ◦ Poor scaling
         ▪ Increases system load by 3x
         ▪ No noticeable drop in throughput until high load
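
     A rough Go sketch of the GET fan-out above; the function name, arguments, and the idea that the message already embeds the client address and request UID are assumptions for illustration, not the course protocol verbatim.

       package node

       import "net"

       // fanOutGetRep sends GET_REP to every replica. Each replica replies
       // straight to the client, which keeps the first response and drops
       // duplicates by UID.
       func fanOutGetRep(conn *net.UDPConn, msg []byte, replicas []*net.UDPAddr) error {
           for _, addr := range replicas {
               // msg is assumed to already carry the client's address and the
               // request UID, so replicas can route responses to the client.
               if _, err := conn.WriteToUDP(msg, addr); err != nil {
                   return err
               }
           }
           return nil
       }
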
  3. Performance and Resilience Factors
     • Memory management (due to Go)
       ◦ No explicit stack limit, and the heap cannot be limited from the command line
         ▪ The Linux kernel was too old to use the ulimit command
       ◦ Required us to 'manually' limit memory usage to stay comparable
         ▪ Recycle memory byte buffers (one option is sketched after this slide)
       ◦ Used a community tool to trace GC collections and memory usage
     • Network performance realizations
       ◦ 'Fire and ForGET'
       ◦ Currently only acknowledging forwards (retried based on A1 behavior)
         ▪ We were too busy during A7 and A8 to change this; the next step is to remove the acknowledgements and try only once
       ◦ Drop incoming requests when the system is full (instead of responding with overload)
         ▪ Timeouts stay low until heavy overload
     • Performance analysis flame graphs
       ◦ Took <5 minutes to set up, thanks to Go's design & community
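
     One way to recycle byte buffers, as mentioned above, is Go's sync.Pool; this is a sketch of that technique under assumed buffer sizes, not necessarily how the project does it (it might, for example, push used buffers back onto a channel instead).

       package node

       import "sync"

       // bufPool recycles request buffers so each UDP read does not allocate a
       // fresh 16 KB slice, keeping heap growth and GC pressure down.
       var bufPool = sync.Pool{
           New: func() interface{} { return make([]byte, 16*1024) },
       }

       // getBuf is called before a read; putBuf returns the buffer once the
       // worker has written its response.
       func getBuf() []byte  { return bufPool.Get().([]byte) }
       func putBuf(b []byte) { bufPool.Put(b) }
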
  4. Flame graphs 1: stop sharing the connection
     • Before: share one connection for both inbound requests and outbound responses (locking of the connection)
     • After: create a new connection for each worker to write outbound responses (no locking of the listen connection; sketched after this slide)
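
     A sketch of the per-worker outbound socket, with assumed names: each worker listens on its own ephemeral port and writes responses there, so it never contends for the shared listen connection's lock. Responses then come from a different source port than the listen port, which is fine when the client matches replies by UID.

       package node

       import "net"

       type worker struct {
           out *net.UDPConn // per-worker socket for outbound responses
       }

       // newWorker opens a dedicated socket on an ephemeral port so this
       // worker's responses never lock the shared inbound listener.
       func newWorker() (*worker, error) {
           out, err := net.ListenUDP("udp", &net.UDPAddr{Port: 0})
           if err != nil {
               return nil, err
           }
           return &worker{out: out}, nil
       }

       func (w *worker) respond(resp []byte, client *net.UDPAddr) error {
           _, err := w.out.WriteToUDP(resp, client)
           return err
       }
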
  5. Flame graphs 2: initialize connections for replication ahead of time
     • Base: create a new connection for each replication request (the socket system call is slow, so it piles up in the flame graph stack)
     • Replication: create the connections during worker initialization instead (reduced stack, leaving time to spend on other work; sketched after this slide)
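
     A sketch of moving socket creation out of the request path, with assumed names: each worker dials its replica peers once at initialization and reuses those connections for every replication request.

       package node

       import "net"

       type repWorker struct {
           replicas []*net.UDPConn // one pre-dialed socket per replica peer
       }

       // newRepWorker dials each replica once, up front, so no socket system
       // call remains on the per-request hot path.
       func newRepWorker(replicaAddrs []*net.UDPAddr) (*repWorker, error) {
           w := &repWorker{}
           for _, addr := range replicaAddrs {
               conn, err := net.DialUDP("udp", nil, addr)
               if err != nil {
                   return nil, err
               }
               w.replicas = append(w.replicas, conn)
           }
           return w, nil
       }

       // replicate fans a message out on the pre-dialed sockets.
       func (w *repWorker) replicate(msg []byte) {
           for _, conn := range w.replicas {
               conn.Write(msg) // best-effort; 'fire and forget'
           }
       }
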
  6. Flame graphs 3: reuse replication connections for forwarding, improves but re-introduces issues
     • Replication: use the replication connection for forwarding requests as well (note: these graphs are zoomed to the workers.worker call level)
     • Forwarding: re-introduces the connection locking, with forwarding and replication contending for the same connection (as seen in flame graph 1)
  7. Performance results from flame graph optimizations (15 nodes, local machine, A7 test client stage 1 results)

     Clients   Base   Replication   Forwarding
     S1         500           820         1190
     S16       1430          1670         3020
     S32       1470          1760         2810
     R1         590           610         1030
     R16       1300          1580         2560
     R32       1230          1560         2470
     R64       1320          1670         2600
     R128      1220          1500         2620
     R256       970          1290            X

     X: out of buffer
  8. What We Learned
     • Log, log, log!
       ◦ A centralized logging service let us collect details on node & cluster state
     • It's about PlanetLab (PL) node selection - pick wisely
       ◦ NA: easy to set up, higher latency, fewer cores per node on average, weird behavior
       ◦ EU: more port availability & connection restrictions, nodes closer together, more cores
     • It's about timing
       ◦ Performance varies throughout the day, but stabilizes during the night (1-3 AM)
       ◦ But there is no support when the submission server goes down late at night and we can't post good results
     • It's about resilience
       ◦ Our server's throughput was crippled if just one node misbehaved
     • It's about dedication
       ◦ Effort tailed off towards the end of class: ~5 hours each for A6, A7, and A8
       ◦ We realized the problem was node selection, not the server, so we skipped many identified optimizations
     • Give Go a try!
       ◦ Light, simple to write, great community - just waiting for the GC to become generational