Stripe CTF3 wrap-up

Stripe
January 30, 2014

CTF3, Stripe's third Capture-the-Flag, focused on distributed systems engineering with a goal of learning to build fault-tolerant, performant software while playing around with a bunch of cool cutting-edge technologies.

More here: https://stripe.com/blog/ctf3-launch.

Transcript

  1. Greg Brockman, Andy Brody, Christian Anderson, Philipp Antoni, Carl Jackson, Jonas Schneider, Siddarth Chandrasekaran, Ludwig Pettersson, Nelson Elhage, Steve Woodrow, Jorge Ortiz

  2. Participation
     •  ~7.5k participants from 97 countries, 6 continents
     •  ~9.5k unique IP logins
     •  216 capturers

  3. Challenges
     •  Scaling up to unknown capacity: 53 instances, 800 ECU at peak
     •  User isolation
     •  Reliability and Availability (*-abilities)

  4. [Architecture diagram] DNS: stripe-ctf.com. IN A → gate (nginx, haproxy).
     Components: ctfweb ×3 (sinatra), submitter ×3 (poseidon), ctfdb ×3 (mongo),
     queue ×2 (RabbitMQ), colossus (LDAP, scoring), build and worker hosts
     (docker) running pools of test case generators, and gitcoin.
     Entry points: git push → [email protected]:level0 and https://stripe-ctf.com/

  5. What Went Wrong
     –  containerization
     –  garbage collection: containers, filesystems, disk space
     –  system stability
     –  bugs, misconfiguration

  6. What Went Right
     –  containerization
     –  service architecture: queueing, separation of roles
     –  load balancing
     –  horizontal scaling

  7. Level 0: mmap
     •  mmap, munmap: map or unmap files or devices into memory
     •  mmap the dictionary into memory
     •  You can actually mmap stdin as well!
     •  Binary search!

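     A minimal Go sketch of the mmap-plus-binary-search idea, assuming a
     sorted, newline-separated dictionary at /usr/share/dict/words (the path
     is illustrative) and a Unix system for syscall.Mmap:

        package main

        import (
            "bytes"
            "fmt"
            "os"
            "sort"
            "syscall"
        )

        // mmapFile maps a file read-only into memory (Unix only).
        func mmapFile(path string) ([]byte, error) {
            f, err := os.Open(path)
            if err != nil {
                return nil, err
            }
            defer f.Close()
            st, err := f.Stat()
            if err != nil {
                return nil, err
            }
            return syscall.Mmap(int(f.Fd()), 0, int(st.Size()),
                syscall.PROT_READ, syscall.MAP_PRIVATE)
        }

        func main() {
            data, err := mmapFile("/usr/share/dict/words")
            if err != nil {
                panic(err)
            }
            // The word slices point straight into the mapping; nothing is copied.
            words := bytes.Split(data, []byte("\n"))
            lookup := func(w []byte) bool {
                // Binary search works because the dictionary is sorted.
                i := sort.Search(len(words), func(i int) bool {
                    return bytes.Compare(words[i], w) >= 0
                })
                return i < len(words) && bytes.Equal(words[i], w)
            }
            fmt.Println(lookup([]byte("zebra")))
        }
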
  8. Level 0: Bloom filters
     •  Hash function: f(str) => int
     •  Look at the result of N hash functions.
     •  Probabilistic.
     •  False positives, but no false negatives.
     •  If you run into false positives, just push again!

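     A toy Bloom filter in Go illustrating the no-false-negatives property;
     the sizes and the double-hashing scheme are arbitrary choices for the
     sketch, not what any particular solution used:

        package main

        import (
            "fmt"
            "hash/fnv"
        )

        // Bloom is an m-bit filter probed by k derived hash functions.
        type Bloom struct {
            bits []uint64
            m, k uint32
        }

        func NewBloom(m, k uint32) *Bloom {
            return &Bloom{bits: make([]uint64, (m+63)/64), m: m, k: k}
        }

        // idx derives the i-th probe position by double hashing: h1 + i*h2.
        func (b *Bloom) idx(s string, i uint32) uint32 {
            h := fnv.New64a()
            h.Write([]byte(s))
            sum := h.Sum64()
            h1, h2 := uint32(sum), uint32(sum>>32)
            return (h1 + i*h2) % b.m
        }

        func (b *Bloom) Add(s string) {
            for i := uint32(0); i < b.k; i++ {
                j := b.idx(s, i)
                b.bits[j/64] |= uint64(1) << (j % 64)
            }
        }

        // MayContain can return false positives, but never false negatives.
        func (b *Bloom) MayContain(s string) bool {
            for i := uint32(0); i < b.k; i++ {
                j := b.idx(s, i)
                if b.bits[j/64]&(uint64(1)<<(j%64)) == 0 {
                    return false
                }
            }
            return true
        }

        func main() {
            b := NewBloom(1<<20, 4)
            b.Add("fox")
            fmt.Println(b.MayContain("fox"), b.MayContain("xof")) // true false (probably)
        }
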
  9. Level 0: Minimal perfect hashing
     •  Given dictionary D = {w₁, w₂, …, wₙ}
     •  use MATH to generate a hash function f
     •  f : D → {0..n-1} is one-to-one
     •  aka every word hashes to a different small integer
     •  So you can build a no-collisions hash table
     •  CMPH: C Minimal Perfect Hashing Library
     •  Build this ahead of time, link it to the binary

  10. Gitcoin

      commit 000000216ba61aecaafb11135ee43b1674855d6ff7
      Author: Alyssa P Hacker <[email protected]>
      Date:   Wed Jan 22 14:10:15 2014 -0800

          Give myself a Gitcoin!

          nonce: tahf8buC

      diff --git a/LEDGER.txt b/LEDGER.txt
      index 3890681..41980b2 100644
      --- a/LEDGER.txt
      +++ b/LEDGER.txt
      @@ -7,3 +7,4 @@ carl: 30
       gdb: 12
       nelhage: 45
       jorge: 30
      +user-aph123: 1

  11. Why cryptocurrencies?
      –  distributed currency system
      –  git security model
      –  massively parallel problems
      –  using the right tools for the job

  12. Git object model

      $ git cat-file -p 000000effe7d920b391a24633e7298469dcf51b5
      tree 7da86a5b10ff6db916598b653ce63e1dc0cb73c8
      parent 0000000df4815161b72f4c5ed23e9fbf5deed922
      author Alyssa P Hacker <[email protected]> 1391129313 +0000
      committer Alyssa P Hacker <[email protected]> 1391129313 +0000

      Mined a Gitcoin!

      nonce 0302d1e2

      $ git show 000000
      error: short SHA1 000000 is ambiguous.
      error: short SHA1 000000 is ambiguous.

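      For reference, a git object's ID is the SHA1 of "<type> <size>\0<body>",
      which is why mining a Gitcoin means finding a commit body whose hash has
      the right prefix. A sketch in Go (the commit body below is truncated and
      purely illustrative):

        package main

        import (
            "crypto/sha1"
            "fmt"
        )

        // gitObjectID hashes the object exactly as git does:
        // SHA1("<type> <size>\x00" + body).
        func gitObjectID(objType string, body []byte) [20]byte {
            header := fmt.Sprintf("%s %d\x00", objType, len(body))
            h := sha1.New()
            h.Write([]byte(header))
            h.Write(body)
            var sum [20]byte
            copy(sum[:], h.Sum(nil))
            return sum
        }

        func main() {
            body := []byte("tree 7da8...\n\nMined a Gitcoin!\n") // truncated, illustrative
            fmt.Printf("%x\n", gitObjectID("commit", body))
        }
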
  13. Remember wafflecopter? You want to do as few rounds of SHA1 as
      possible: hash the fixed commit prefix once, then update only for each
      nonce.

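      One way to do this in Go is to snapshot the digest midstate via
      encoding.BinaryMarshaler (supported by crypto/sha1 in Go 1.10+). This
      sketch hashes only a bare body and skips git's "commit <len>\0" header;
      a real miner uses a fixed-width nonce so the object length, and hence
      the hashed prefix, never changes:

        package main

        import (
            "crypto/sha1"
            "encoding"
            "fmt"
            "strings"
        )

        // mine hashes the fixed prefix once, snapshots the SHA1 midstate, and
        // for each candidate restores the snapshot and hashes only the nonce line.
        func mine(prefix, difficulty string) string {
            h := sha1.New()
            h.Write([]byte(prefix))
            snap, _ := h.(encoding.BinaryMarshaler).MarshalBinary()

            for nonce := uint64(0); ; nonce++ {
                h2 := sha1.New()
                h2.(encoding.BinaryUnmarshaler).UnmarshalBinary(snap)
                tail := fmt.Sprintf("nonce %016x\n", nonce) // fixed width
                h2.Write([]byte(tail))
                sum := fmt.Sprintf("%x", h2.Sum(nil))
                if strings.HasPrefix(sum, difficulty) {
                    fmt.Println(sum)
                    return prefix + tail
                }
            }
        }

        func main() {
            mine("Mined a Gitcoin!\n\n", "0000")
        }
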
  14. SHA1 is Embarrassingly Parallel
      –  Each miner can be totally independent
      –  Each miner requires little memory
      –  Each miner requires little code

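      Because candidates are independent, mining parallelizes trivially. A
      sketch that stripes the nonce space across one goroutine per core;
      tryNonce is a stand-in for the real commit-hashing check:

        package main

        import (
            "crypto/sha1"
            "encoding/binary"
            "fmt"
            "runtime"
        )

        // tryNonce stands in for "hash the candidate commit with this nonce";
        // here it just asks for two leading zero bytes of SHA1(nonce).
        func tryNonce(n uint64) bool {
            var buf [8]byte
            binary.LittleEndian.PutUint64(buf[:], n)
            sum := sha1.Sum(buf[:])
            return sum[0] == 0 && sum[1] == 0
        }

        func main() {
            workers := runtime.NumCPU()
            found := make(chan uint64, workers)
            for w := 0; w < workers; w++ {
                go func(start uint64) {
                    // Each worker strides through a disjoint slice of the
                    // space: little shared state, little memory, little code.
                    for n := start; ; n += uint64(workers) {
                        if tryNonce(n) {
                            found <- n
                            return
                        }
                    }
                }(uint64(w))
            }
            fmt.Println("winning nonce:", <-found)
            // Remaining workers are abandoned; fine for a one-shot program.
        }
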
  15. Bonus Round: Hash Rates
      bash: 400 H/s
      our Go miners: 1.9 MH/s
      100 cores on EC2: 130 MH/s
      GPU: 1-2 GH/s
      Network: ~10 GH/s

  16. Let the mice through
      1. Reduce the overall load
      2. Use an off-the-shelf solution
      3. Learn to recognize a mouse

  17. The top solutions
      ➔  Balance load in a simple way
      ➔  Learn to recognize a mouse (see the sketch below)
      ➔  Keep an eye on the backends

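      A minimal Go sketch of "recognize a mouse", assuming the approach is
      per-client request counting over a short window: a reverse proxy that
      sheds the chatty clients (the elephants). The backend address, the
      threshold, and the window length are invented for illustration:

        package main

        import (
            "fmt"
            "net/http"
            "net/http/httputil"
            "net/url"
            "sync"
            "time"
        )

        type limiter struct {
            mu     sync.Mutex
            counts map[string]int
        }

        func (l *limiter) allow(key string, max int) bool {
            l.mu.Lock()
            defer l.mu.Unlock()
            l.counts[key]++
            return l.counts[key] <= max
        }

        func main() {
            backend, _ := url.Parse("http://localhost:9000") // hypothetical backend
            proxy := httputil.NewSingleHostReverseProxy(backend)
            l := &limiter{counts: map[string]int{}}

            // Reset the counting window every second.
            go func() {
                for range time.Tick(time.Second) {
                    l.mu.Lock()
                    l.counts = map[string]int{}
                    l.mu.Unlock()
                }
            }()

            http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
                // RemoteAddr includes the port; a real solution keys on the IP.
                if !l.allow(r.RemoteAddr, 4) { // >4 req/s smells like an elephant
                    http.Error(w, "shed", http.StatusServiceUnavailable)
                    return
                }
                proxy.ServeHTTP(w, r)
            })
            fmt.Println(http.ListenAndServe(":8080", nil))
        }
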
  18. The problem
      •  Text search over ~100 MB of text
      •  Arbitrary substring search (not just whole words)
      •  There is a “free” indexing stage
      •  Distribute across up to 4 nodes
      •  Each node is limited to 500 MB of RAM

  19. Search 101: Inverted Index

      /tree/A: “the quick brown fox jumps over …”
      /tree/B: “the fox is red”

      “the”   [A, B]
      “quick” [A]
      “brown” [A]
      “fox”   [A, B]
      “red”   [B]

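      The slide's example as runnable Go, building word → posting list from
      the two documents:

        package main

        import (
            "fmt"
            "strings"
        )

        // index builds word -> posting list of document IDs containing it.
        func index(docs map[string]string) map[string][]string {
            seen := map[string]map[string]bool{}
            for id, text := range docs {
                for _, w := range strings.Fields(text) {
                    if seen[w] == nil {
                        seen[w] = map[string]bool{}
                    }
                    seen[w][id] = true
                }
            }
            idx := map[string][]string{}
            for w, ids := range seen {
                for id := range ids {
                    idx[w] = append(idx[w], id)
                }
            }
            return idx
        }

        func main() {
            idx := index(map[string]string{
                "/tree/A": "the quick brown fox jumps over",
                "/tree/B": "the fox is red",
            })
            fmt.Println(idx["fox"]) // [/tree/A /tree/B] (order may vary)
        }
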
  20. Search 102: Arbitrary Substring
      •  “trigram index”
      •  Store an inverted index of trigrams
      •  “the quick brown fox …” → “the”, “he_”, “e_q”, “_qu”, “qui”, “uic”, …
      •  To query, look up all trigrams in the search term, and intersect
         ◦  Check each result to verify the match
      •  Search(“rown”) → index[“row”] ∩ index[“own”]

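      A sketch of trigram indexing and query-time intersection in Go; posting
      lists here are deduplicated doc IDs, and a real index would also sort
      them for fast merges:

        package main

        import (
            "fmt"
            "strings"
        )

        // trigrams returns every 3-byte window of s.
        func trigrams(s string) []string {
            var out []string
            for i := 0; i+3 <= len(s); i++ {
                out = append(out, s[i:i+3])
            }
            return out
        }

        // search intersects the posting lists of the query's distinct
        // trigrams; survivors are only candidates, so verify each match.
        func search(idx map[string][]int, docs []string, query string) []int {
            grams := map[string]bool{}
            for _, g := range trigrams(query) {
                grams[g] = true
            }
            count := map[int]int{}
            for g := range grams {
                for _, d := range idx[g] {
                    count[d]++
                }
            }
            var hits []int
            for d, c := range count {
                if c == len(grams) && strings.Contains(docs[d], query) {
                    hits = append(hits, d)
                }
            }
            return hits
        }

        func main() {
            docs := []string{"the quick brown fox", "the fox is red"}
            idx := map[string][]int{}
            for i, d := range docs {
                for _, g := range trigrams(d) {
                    if n := len(idx[g]); n == 0 || idx[g][n-1] != i {
                        idx[g] = append(idx[g], i) // dedupe within a doc
                    }
                }
            }
            fmt.Println(search(idx, docs, "rown")) // [0]
        }
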
  21. Sharding
      •  We give you four nodes
      •  …but they all run on the same physical node during grading
      •  And we didn’t resource-limit grading containers (other than memory)
      •  So you don’t actually get more CPU, disk I/O, or memory bandwidth
      •  Sharding ended up not really mattering

  22. Winning the contest
      •  The spec is for arbitrary substring search
      •  But we only generate/query words from a dictionary
      •  Some words are substrings of other words…
      •  …but not too many
      •  Use an inverted index over dictionary words

  23. Handling substrings
      •  Option A (sketched below)
         ◦  substrings : word → [all words containing that word]
         ◦  index : word → [list of lines containing that full word]
         ◦  for word in substrings[query]: results += index[word]
         ◦  Can compute the substrings table by brute search
         ◦  When indexing lines, just split into words
      •  Option B
         ◦  index : word → [all lines containing that word, including as a substring]
         ◦  Need to do the substring search as you index each line

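      A small Go sketch of Option A, with a four-word dictionary and two
      lines standing in for the real corpus:

        package main

        import (
            "fmt"
            "strings"
        )

        func main() {
            dict := []string{"own", "brown", "fox", "red"}

            // substrings[q] = all dictionary words containing q,
            // computed by brute (O(n²)) search ahead of time.
            substrings := map[string][]string{}
            for _, q := range dict {
                for _, w := range dict {
                    if strings.Contains(w, q) {
                        substrings[q] = append(substrings[q], w)
                    }
                }
            }

            // index: word -> lines containing that full word;
            // indexing a line is just splitting it into words.
            lines := []string{"the quick brown fox", "the fox is red"}
            index := map[string][]int{}
            for i, line := range lines {
                for _, w := range strings.Fields(line) {
                    index[w] = append(index[w], i)
                }
            }

            // Query "own": expand via substrings, then union the postings.
            var results []int
            for _, w := range substrings["own"] {
                results = append(results, index[w]...)
            }
            fmt.Println(results) // [0], via "brown"
        }
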
  24. Other ways to beat the level
      •  Slurp the entire tree into RAM and use a decent string search
         ◦  (not java.lang.String.indexOf() -- that’s slow!)
      •  Shell out to grep
         ◦  GNU grep is fast

  25. SQLCluster
      •  5 SQLite nodes
      •  Queries submitted to all nodes
      •  Random network and node failures
      •  Must maintain full linearizability of queries

  26. Octopus
      •  (Grumpy) network simulator
      •  Submits queries and checks for correctness
      •  Several “monkeys” manipulate the network:
         ◦  Netsplit monkey
         ◦  Lagsplit monkey
         ◦  SPOF monkey
         ◦  etc.

  27. Consensus Algorithms
      •  Raft (“In Search of an Understandable Consensus Algorithm”, Diego Ongaro and John Ousterhout, 2013)
      •  Zab (“Zab: High-performance broadcast for primary-backup systems”, Flavio P. Junqueira, Benjamin C. Reed, and Marco Serafini, 2011)
      •  Paxos (“The Part-Time Parliament”, Leslie Lamport, 1998, originally submitted 1990)
      •  Viewstamped Replication (“Viewstamped Replication: A New Primary Copy Method to Support Highly Available Distributed Systems”, Brian M. Oki and Barbara H. Liskov, 1988)
      •  etc., etc.

  28. Gotchas: Idempotency
      Scenario 1:
      •  Octopus sends node1 a commit
      •  Node1 forwards it to the leader, node0
      •  Node0 processes it and sends it back
      •  Octopus kills the node0 ↔ node1 link
      vs. Scenario 2:
      •  Octopus sends node1 a commit
      •  Node1 forwards it to the leader, node0
      •  Octopus kills the node0 ↔ node1 link

  29. Gotchas: Idempotency
      •  How do you tell between these two cases?
      •  Naive: resubmit the query to find out!
      •  If the query was processed, return the old result
      •  Common trick: idempotency tokens (here, the random favoriteWord acts
         as the token: a later SELECT reveals whether the UPDATE already landed)

         UPDATE ctf3 SET friendCount=friendCount+10,
                         requestCount=requestCount+1,
                         favoriteWord="jjfqcjamhpghnqq"
                     WHERE name="carl";
         SELECT * FROM ctf3;

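      The trick, sketched in Go with an in-memory result cache; a real
      cluster would replicate the token table through the log:

        package main

        import "fmt"

        // Server remembers the result of every token it has executed, so a
        // client can safely resubmit after a dropped reply.
        type Server struct {
            state   int
            results map[string]int // token -> result of the first execution
        }

        func (s *Server) Apply(token string, delta int) int {
            if r, ok := s.results[token]; ok {
                return r // already executed: return the old result, don't re-apply
            }
            s.state += delta
            s.results[token] = s.state
            return s.state
        }

        func main() {
            s := &Server{results: map[string]int{}}
            fmt.Println(s.Apply("tok-1", 10)) // 10
            fmt.Println(s.Apply("tok-1", 10)) // still 10: the retry was deduplicated
        }
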
  30. Making it Fast
      •  Every top solution replaced sql.go
      •  Two main strategies:
         ◦  Write your own SQLite (or enough of it to pass)
         ◦  Use SQLite bindings with a :memory: database
      •  These perform roughly equally well
      •  Raft has a few timers you can tune
      •  Golf network traffic

  31. Leaderboard Top 10
      •  Everyone changed the SQLite “bindings”
      •  Seven solutions used go-raft
      •  Two solutions used redirect-to-master
      •  One solution implemented Raft in C++
      •  If at first you don’t succeed…
         ◦  Mean submissions: 1444 (stddev 1031)
         ◦  Max: 3946
         ◦  Min: 58 (the C++ solution)

  32. Speculative: Time-based consensus
      •  All nodes were run on the same host
      •  Local clock: total ordering on events
      •  Leaderless state machine
      •  ???
      •  Profit?
      •  Similar ideas to Spanner (Google)

  33. Greg Brockman, Andy Brody, Christian Anderson, Philipp Antoni, Carl Jackson, Jonas Schneider, Siddarth Chandrasekaran, Ludwig Pettersson, Nelson Elhage, Steve Woodrow, Jorge Ortiz