Names are Hard

Naming things is one of the hard problems. Learn why UUIDs remove the shackles of naming and let you focus on programming.

Josh Comer

June 13, 2015

Transcript

  1. About me

    My name is Josh Comer
    Senior Engineering Manager
    Always looking for Dev, QE, DevOps, Ops
    Check out www.liveops.com/careers
  2. Names aren’t for everyone

    Some names are for humans “Bob, ThePrintServer”
    Some names are for computers “192.168.0.1”
  3. Coordinated Names

    Provided by external source
    Typically monotonic integers
    Used frequently in db as primary keys
    As name implies, requires coordination
    Pretty good for humans
  4. Unique information

    Requires extra steps to verify uniqueness
    Not always leakable
    Example: email address or username
    Easy for humans
  5. Random

    As name implies they are random
    No coordination required
    No meaningful sorting
    Can be leaked safely
    Not so great for humans
  6. Distributed Systems

    Scaling is a must
    Can’t trust the wires
    Numbers game == Failure is inevitable
    Global Scale (sloooow)
  7. Distributed Coordinated Names

    Coordination must happen between db instances
    Leads to diminishing returns on scaling
    Can optimize by pre-allocating blocks
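
The block pre-allocation mentioned on slide 7 could look roughly like the sketch below. `Coordinator` and `BlockAllocator` are hypothetical names, and the in-memory counter only stands in for whatever shared sequence the database instances would really coordinate on.

```python
import threading


class Coordinator:
    """Hypothetical stand-in for the shared source of ids (e.g. a db sequence)."""

    def __init__(self):
        self._next = 1
        self._lock = threading.Lock()
        self.round_trips = 0

    def reserve(self, size):
        # One coordinated call hands out a whole contiguous block of ids.
        with self._lock:
            start = self._next
            self._next += size
            self.round_trips += 1
            return start, start + size  # half-open range [start, end)


class BlockAllocator:
    """Per-instance allocator that only coordinates when its block runs out."""

    def __init__(self, coordinator, block_size=100):
        self._coordinator = coordinator
        self._block_size = block_size
        self._current = 0
        self._end = 0

    def next_id(self):
        if self._current >= self._end:
            self._current, self._end = self._coordinator.reserve(self._block_size)
        value = self._current
        self._current += 1
        return value


coordinator = Coordinator()
node_a = BlockAllocator(coordinator)
node_b = BlockAllocator(coordinator)

ids_a = [node_a.next_id() for _ in range(150)]
ids_b = [node_b.next_id() for _ in range(150)]
print(coordinator.round_trips)
```

In this sketch 300 ids cost four coordinated calls instead of 300. The trade-off is that ids are no longer handed out in global creation order, and a block that is reserved but never used leaves a gap.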
  8. Distributed Unique Information

    Depends on DB
    May have to talk to master to determine appropriate shard
  9. Number of UUIDs

    TL;DR Lots
    three hundred forty undecillion two hundred eighty-two decillion three hundred sixty-six nonillion nine hundred twenty octillion nine hundred thirty-eight septillion four hundred sixty-three sextillion four hundred sixty-three quintillion three hundred seventy-four quadrillion six hundred seven trillion four hundred thirty-one billion seven hundred sixty-eight million two hundred eleven thousand four hundred and fifty-five
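
As a quick check of the arithmetic: the figure spelled out on slide 9 is 2^128 - 1, the largest value a 128-bit field can hold, so the count of distinct 128-bit values is one higher. The sketch below also estimates the smaller space left to a version 4 UUID, on the assumption that 4 version bits and 2 variant bits are fixed, as RFC 4122 describes.

```python
TOTAL_BITS = 128

total_values = 2 ** TOTAL_BITS       # every possible 128-bit pattern
largest_value = total_values - 1     # the number spelled out on the slide

# Assumption: a version 4 UUID fixes 4 version bits and 2 variant bits,
# which leaves 122 bits of randomness.
v4_values = 2 ** (TOTAL_BITS - 6)

print(f"{largest_value:,}")
print(f"{total_values:,}")
print(f"{v4_values:,}")
```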
  10. UUID v1

    Very fast
    Sortable
    Time can be extracted
    Epoch UTC time (100 nanosecond intervals)
    MAC address can be extracted (sometimes)
    Potential for collision (if done wrong)
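
To illustrate the extraction described on slide 10, here is a minimal sketch using Python's standard `uuid` module rather than the library benchmarked in the deck. It assumes the RFC 4122 layout, in which the timestamp counts 100 ns intervals from 1582-10-15 and not from the Unix epoch. The helper names `v1_datetime` and `v1_node` are made up for this example.

```python
import uuid
from datetime import datetime, timezone

# Number of 100 ns intervals between 1582-10-15 (the UUID epoch) and 1970-01-01.
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def v1_datetime(u):
    """Return the UTC creation time embedded in a version 1 UUID."""
    if u.version != 1:
        raise ValueError("only version 1 UUIDs carry a timestamp")
    unix_100ns = u.time - UUID_EPOCH_OFFSET
    return datetime.fromtimestamp(unix_100ns / 1e7, tz=timezone.utc)


def v1_node(u):
    """Return the 48-bit node field formatted like a MAC address."""
    return ":".join(f"{(u.node >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))


u = uuid.uuid1()
created = v1_datetime(u)
node = v1_node(u)
print(u, created.isoformat(), node)
```

The "(sometimes)" on the slide matters here: when no hardware address is available, Python's `uuid1()` falls back to a random node value, so the node field is not always a real MAC address.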
  11. How Fast?

    (c/bench (u/v1))
    Evaluation count : 106664880 in 60 samples of 1777748 calls.
    Execution time mean : 558.988363 ns
    Execution time std-deviation : 10.077181 ns
    Execution time lower quantile : 553.440339 ns ( 2.5%)
    Execution time upper quantile : 576.385411 ns (97.5%)
    Overhead used : 8.958982 ns
  12. UUID v4

    Random
    Fast (not as fast as v1 due to needing crypto randomness)
    No metadata
    Great for leaking
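
The dependence on crypto randomness from slide 12 can be made visible by building a version 4 UUID by hand. This is only a sketch with Python's standard library; `v4_from_os_entropy` is a hypothetical helper name, and it mirrors what `uuid.uuid4()` does as far as I am aware.

```python
import os
import uuid


def v4_from_os_entropy():
    # 16 bytes from the operating system's CSPRNG; version=4 then overwrites
    # the version and variant bits, so nothing else in the value is metadata.
    return uuid.UUID(bytes=os.urandom(16), version=4)


u = v4_from_os_entropy()
print(u, u.version)
```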
  13. How Fast?

    (c/bench (u/v4))
    Evaluation count : 34240680 in 60 samples of 570678 calls.
    Execution time mean : 1.747246 µs
    Execution time std-deviation : 34.573282 ns
    Execution time lower quantile : 1.714618 µs ( 2.5%)
    Execution time upper quantile : 1.814491 µs (97.5%)
    Overhead used : 8.958982 ns
  14. UUID v3+5 Example

    Start with base UUID
    6ba7b811-9dad-11d1-80b4-00c04fd430c
    Add a namespace: www.google.com
    c74a196f-f19d-5ea9-bffd-a2742432fc9
    Can cascade in as many namespaces as you want
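
The cascading idea from slide 14 can be sketched with Python's standard `uuid` module. `uuid.NAMESPACE_URL` is the predefined RFC 4122 URL namespace; the `cascade` helper and the extra names `"search"` and `"results"` are assumptions made purely for illustration.

```python
import uuid


def cascade(base, *names):
    """Fold each name into the namespace produced by the previous step."""
    current = base
    for name in names:
        current = uuid.uuid5(current, name)
    return current


site = cascade(uuid.NAMESPACE_URL, "www.google.com")
page = cascade(uuid.NAMESPACE_URL, "www.google.com", "search", "results")
print(site)
print(page)
```

Because every step is a deterministic hash, any party that knows the base UUID and the same sequence of names derives the same identifier without talking to anyone else.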
  15. How Fast?

    (c/bench (u/v5 u/+namespace-url+ "www.google.com"))
    Evaluation count : 7440180 in 60 samples of 124003 calls.
    Execution time mean : 8.205325 µs
    Execution time std-deviation : 315.094705 ns
    Execution time lower quantile : 7.905576 µs ( 2.5%)
    Execution time upper quantile : 8.892249 µs (97.5%)
    Overhead used : 8.958982 ns
  16. Service Discovery

    Great example of computer-consumed names
    IP or host won’t work if multiple contacts on same machine
    No coordination required
  17. Timeseries

    v1 UUIDs have time built right in
    Cassandra makes heavy use of this
    Easy to query for specific time series
    No coordination required :)
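
A time-range query over version 1 UUIDs, as slide 17 describes, might be sketched like this. The helpers `ticks_for`, `v1_at` and `between`, the fixed node value and the sample timestamps are all assumptions for the example. A real store would compare the embedded timestamp itself rather than filter in application code.

```python
import uuid
from datetime import datetime, timedelta, timezone

UUID_EPOCH_OFFSET = 0x01B21DD213814000  # 100 ns ticks from 1582-10-15 to 1970-01-01
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ticks_for(moment):
    """Convert an aware datetime to the 100 ns tick count used by v1 UUIDs."""
    micros = (moment - UNIX_EPOCH) // timedelta(microseconds=1)
    return micros * 10 + UUID_EPOCH_OFFSET


def v1_at(moment, clock_seq=0, node=0x020000000001):
    """Build a version 1 UUID for a given moment (hypothetical helper)."""
    ticks = ticks_for(moment)
    fields = (
        ticks & 0xFFFFFFFF,        # time_low
        (ticks >> 32) & 0xFFFF,    # time_mid
        (ticks >> 48) & 0x0FFF,    # time_hi (version bits are set by version=1)
        (clock_seq >> 8) & 0x3F,   # clock_seq_hi (variant bits are set by version=1)
        clock_seq & 0xFF,          # clock_seq_low
        node,
    )
    return uuid.UUID(fields=fields, version=1)


def between(ids, start, end):
    """Return the ids whose embedded time falls in [start, end), oldest first."""
    lo, hi = ticks_for(start), ticks_for(end)
    return sorted((u for u in ids if lo <= u.time < hi), key=lambda u: u.time)


day = datetime(2015, 6, 13, tzinfo=timezone.utc)
events = [v1_at(day + timedelta(hours=h)) for h in (12, 9, 11, 10)]
window = between(events, day + timedelta(hours=9, minutes=30),
                 day + timedelta(hours=11, minutes=30))
print([str(u) for u in window])
```

The sort uses the `time` attribute on purpose: the textual form of a version 1 UUID starts with the low bits of the timestamp, so plain string ordering does not follow creation time.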
  18. Isotopes

    Trace every request through your system
    No coordination required
    Great for creating user-facing views of log data
    Can be used to identify bottlenecks
  19. Detective Work

    Creator of the Melissa Virus was caught due to a GUID (aka UUID)
    Leaked their MAC Address
  20. License Keys*

    v3 or v5 can be used to create one-way hashes based on a unique cascade of data.
    No coordination required
    Secret -> email -> first name = Verifiable Key
    *Use at own risk
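
The "Secret -> email -> first name" cascade from slide 20 could be sketched as below, using version 5 UUIDs from Python's standard library. The base namespace, the function names and the sample values are assumptions. In keeping with the slide's own warning, treat this as an illustration of the cascade and not as a vetted licensing scheme.

```python
import hmac
import uuid

BASE = uuid.NAMESPACE_URL  # assumption: any fixed, agreed-upon base UUID would do


def license_key(secret, email, first_name):
    """Derive a key by folding each value into the previous UUID."""
    key = BASE
    for part in (secret, email, first_name):
        key = uuid.uuid5(key, part)
    return key


def verify(candidate, secret, email, first_name):
    """Recompute the key and compare it in constant time."""
    expected = license_key(secret, email, first_name)
    return hmac.compare_digest(str(candidate), str(expected))


issued = license_key("s3cret", "bob@example.com", "Bob")
print(issued, verify(issued, "s3cret", "bob@example.com", "Bob"))
```

Verification needs no lookup table: whoever holds the secret can recompute the key from the customer's email and first name.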
  21. SQUUIDs

    A Rich Hickey invention for www.datomic.com
    Random sortable UUIDs
    Hybrid of v1 and v4
    Time can still be retrieved
    01f0cd75-1b19-4069-921b-96fd64d79bf2
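
One way to picture a SQUUID is the sketch below. It assumes the commonly described layout in which the leading 32 bits of a random version 4 UUID are replaced with seconds since the Unix epoch. Neither that layout nor the function names are confirmed by the slides, so the values it produces may not match the library benchmarked in the deck.

```python
import time
import uuid

LOW_96_MASK = (1 << 96) - 1


def squuid(now=None):
    """Random UUID whose top 32 bits hold a timestamp in seconds (assumed layout)."""
    secs = int(time.time() if now is None else now) & 0xFFFFFFFF
    # Keep the low 96 bits of a v4 UUID, which include its version and variant bits.
    rand = uuid.uuid4().int & LOW_96_MASK
    return uuid.UUID(int=(secs << 96) | rand)


def squuid_seconds(u):
    """Recover the timestamp stored in the leading 32 bits."""
    return u.int >> 96


a = squuid(now=1434153600)
b = squuid(now=1434153601)
print(a, squuid_seconds(a))
```

Since the timestamp occupies the most significant bits, values created in later seconds compare greater, while values created within the same second remain in random order.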
  22. How Fast?

    (c/bench (u/squuid))
    Evaluation count : 31237800 in 60 samples of 520630 calls.
    Execution time mean : 1.990820 µs
    Execution time std-deviation : 105.333199 ns
    Execution time lower quantile : 1.913346 µs ( 2.5%)
    Execution time upper quantile : 2.111869 µs (97.5%)
    Overhead used : 8.958982 ns
  23. Bit Layout

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                           time_low                            |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |           time_mid            |      time_hi_and_version      |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |clk_seq_hi_res |  clk_seq_low  |          node (0-1)           |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                          node (2-5)                           |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
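
The fields in the layout above map directly onto attributes of Python's standard `uuid.UUID` class. The sketch below takes the RFC 4122 URL namespace UUID apart and puts it back together. Using that particular UUID is just a convenient assumption, since its value is fixed by the standard.

```python
import uuid

u = uuid.UUID("6ba7b811-9dad-11d1-80b4-00c04fd430c8")  # RFC 4122 URL namespace

layout = {
    "time_low": u.time_low,                    # 32 bits
    "time_mid": u.time_mid,                    # 16 bits
    "time_hi_and_version": u.time_hi_version,  # 16 bits, top 4 are the version
    "clk_seq_hi_res": u.clock_seq_hi_variant,  # 8 bits, top bits are the variant
    "clk_seq_low": u.clock_seq_low,            # 8 bits
    "node": u.node,                            # 48 bits
}

for name, value in layout.items():
    print(f"{name:<20} {value:#x}")

# Reassemble the 128-bit integer from the six fields, most significant first.
rebuilt = (
    (u.time_low << 96)
    | (u.time_mid << 80)
    | (u.time_hi_version << 64)
    | (u.clock_seq_hi_variant << 56)
    | (u.clock_seq_low << 48)
    | u.node
)
```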