
Names are Hard

Naming things is one of the hard problems. Learn why UUIDs remove the shackles of naming and let you focus on programming.


Josh Comer

June 13, 2015



  1. Names are Hard

  2. Not as hard as fonts and colours, though :)

  3. About me
      - My name is Josh Comer
      - Senior Engineering Manager
      - Always looking for Dev, QE, DevOps, Ops
      - Check out www.liveops.com/careers
  4. https://twitter.com/codinghorror/status/506010907021828096

  5. What needs a name?
      - Services
      - Processes
      - Entities
      - Objects
      - Memory
      - Anything you want to find again
  6. Names aren’t for everyone
      - Some names are for humans: “Bob, ThePrintServer”
      - Some names are for computers: “”
  7. What are your options?
      - Coordinated
      - Unique information
      - Random

  8. Coordinated Names
      - Provided by an external source
      - Typically monotonic integers
      - Used frequently in databases as primary keys
      - As the name implies, requires coordination
      - Pretty good for humans
  9. Unique information
      - Requires extra steps to verify uniqueness
      - Not always leakable
      - Example: email address or username
      - Easy for humans
  10. Random
      - As the name implies, they are random
      - No coordination required
      - No meaningful sorting
      - Can be leaked safely
      - Not so great for humans
  11. Distributed Systems
      - Scaling is a must
      - Can’t trust the wires
      - Numbers game == Failure is inevitable
      - Global Scale (sloooow)
  12. Distributed Coordinated Names
      - Coordination must happen between db instances
      - Leads to diminishing returns on scaling
      - Can optimize by pre-allocating blocks
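The block pre-allocation optimisation can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not how any particular database does it: `Coordinator`, `Node`, `allocate_block`, and the block size of 1000 are all hypothetical names and values chosen for the example.

```python
import threading


class Coordinator:
    """Hypothetical central ID source: the only place that needs coordination."""

    def __init__(self, block_size=1000):
        self.block_size = block_size
        self._next_start = 1
        self._lock = threading.Lock()
        self.round_trips = 0  # counts how often any node had to coordinate

    def allocate_block(self):
        # Hand out the next contiguous, non-overlapping range of integers.
        with self._lock:
            start = self._next_start
            self._next_start += self.block_size
            self.round_trips += 1
            return range(start, start + self.block_size)


class Node:
    """Hypothetical db instance that names rows from its pre-allocated block."""

    def __init__(self, coordinator):
        self._coordinator = coordinator
        self._block = iter(())  # starts empty, so the first call fetches a block

    def next_id(self):
        try:
            return next(self._block)
        except StopIteration:
            # Block exhausted: one coordinated round trip buys block_size more IDs.
            self._block = iter(self._coordinator.allocate_block())
            return next(self._block)


coordinator = Coordinator(block_size=1000)
a, b = Node(coordinator), Node(coordinator)
ids_a = [a.next_id() for _ in range(1500)]
ids_b = [b.next_id() for _ in range(1500)]
```

In this sketch 3000 names cost only four trips to the coordinator, but the names are no longer monotonic across nodes, and a node that dies takes the unused part of its block with it.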
  13. Distributed Unique Information
      - Depends on DB
      - May have to talk to master to determine appropriate shard
  14. Distributed Random
      - If random enough, no coordination required
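How random is "random enough"? A rough way to reason about it is the standard birthday approximation, p ≈ 1 - exp(-n(n-1) / 2N), where n names are drawn from a space of N values. The sketch below assumes 122 random bits, which is what a version 4 UUID has left after its 4 version bits and 2 variant bits; the function name and the sample sizes are illustrative only.

```python
import math


def collision_probability(n, random_bits):
    """Birthday approximation for at least one collision among n random names."""
    space = 2 ** random_bits
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-n * (n - 1) / (2 * space))


# One billion names with 122 random bits (the v4 case).
p_uuid = collision_probability(10 ** 9, 122)

# The same formula with only 32 random bits and 100,000 names, for contrast.
p_small = collision_probability(100_000, 32)

print(f"122 bits, 1e9 names: {p_uuid:.3e}")
print(f" 32 bits, 1e5 names: {p_small:.3f}")
```

The approximation assumes the random source is uniform, so a weak generator invalidates the estimate.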

  15. Downsides to coordination
      - Prevents horizontal scaling
      - Diminishing returns when scaling
      - Split brain prevention requires sacrifices
  16. Universally Unique IDentifier
      - http://www.rfc-base.org/rfc-4122.html
      - “Unique”
      - No coordination needed
      - Good for computers, less so for humans
      - Supports some metadata
  17. Number of UUIDs
      - TL;DR: Lots
      - three hundred forty undecillion, two hundred eighty-two decillion, three hundred sixty-six nonillion, nine hundred twenty octillion, nine hundred thirty-eight septillion, four hundred sixty-three sextillion, four hundred sixty-three quintillion, three hundred seventy-four quadrillion, six hundred seven trillion, four hundred thirty-one billion, seven hundred sixty-eight million, two hundred eleven thousand, four hundred and fifty-five
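The arithmetic behind that number is short enough to check directly. The figure spelled out on the slide is 2^128 - 1, the largest value that fits in 128 bits; the count of distinct 128-bit patterns is one more than that. The variable names below are just for illustration.

```python
# Every UUID is 128 bits wide, so there are 2**128 distinct bit patterns.
total_patterns = 2 ** 128
largest_value = total_patterns - 1  # the number spelled out on the slide

# A version 4 UUID fixes 4 version bits and 2 variant bits,
# leaving 122 bits that are actually random.
random_v4_patterns = 2 ** 122

print(f"{total_patterns:,}")
print(f"{largest_value:,}")
print(f"{random_v4_patterns:,}")
```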
  18. Format
      - xxxxxxxx-xxxx-Mxxx-Nxxx-xxxxxxxxxxxx
      - 128 bits
      - M = Version
      - N = Variant (of spec)
      - x = a hex character
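To make the M and N positions concrete, here is a small sketch that reads them straight out of the string form and then cross-checks against Python's standard `uuid` module. It uses the v4 example that appears later in the deck; the variable names are illustrative.

```python
import uuid

text = "73b6e3a5-c97d-466a-b5c4-867ae52e2954"  # the v4 example from the deck
groups = text.split("-")

m = groups[2][0]  # 'M': first hex character of the third group
n = groups[3][0]  # 'N': first hex character of the fourth group

version = int(m, 16)
# The RFC 4122 variant is marked by the top two bits of N being 1 and 0,
# so N is one of 8, 9, a, or b.
is_rfc4122 = (int(n, 16) & 0b1100) == 0b1000

# Cross-check with the standard library's own parser.
parsed = uuid.UUID(text)
print(version, is_rfc4122, parsed.version, parsed.variant)
```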
  19. UUID v0
      - The null UUID
      - 00000000-0000-0000-0000-000000000000

  20. UUID v1
      - Time and machine based
      - 9c031d16-0ca3-11e5-a6c0-1697f925ec7b (the leading 1 of the third group is the version)

  21. UUID v1
      - Very fast
      - Sortable
      - Time can be extracted
      - Epoch UTC time (100 nanosecond intervals)
      - MAC address can be extracted (sometimes)
      - Potential for collision (if done wrong)
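As a rough sketch of what "can be extracted" means, the snippet below pulls the timestamp and node out of the v1 example from the previous slide using Python's standard `uuid` module. RFC 4122 counts 100-nanosecond intervals from 15 October 1582 UTC; the helper names `v1_timestamp` and `v1_node` are made up for this example.

```python
import datetime
import uuid

# RFC 4122 v1 timestamps count 100 ns intervals from this date.
GREGORIAN_EPOCH = datetime.datetime(1582, 10, 15, tzinfo=datetime.timezone.utc)


def v1_timestamp(u):
    """Return the creation time embedded in a version 1 UUID."""
    if u.version != 1:
        raise ValueError("not a version 1 UUID")
    # u.time is in 100 ns units; datetime only resolves microseconds.
    return GREGORIAN_EPOCH + datetime.timedelta(microseconds=u.time // 10)


def v1_node(u):
    """Return the 48-bit node field formatted like a MAC address."""
    return ":".join(f"{(u.node >> shift) & 0xFF:02x}" for shift in range(40, -1, -8))


example = uuid.UUID("9c031d16-0ca3-11e5-a6c0-1697f925ec7b")
created = v1_timestamp(example)
node = v1_node(example)
print(created.isoformat(), node)
```

The node field is only a real MAC address when the generator chose to use one, which is why the slide says "sometimes".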
  22. How Fast? (c/bench (u/v1))

          Evaluation count : 106664880 in 60 samples of 1777748 calls.
          Execution time mean : 558.988363 ns
          Execution time std-deviation : 10.077181 ns
          Execution time lower quantile : 553.440339 ns ( 2.5%)
          Execution time upper quantile : 576.385411 ns (97.5%)
          Overhead used : 8.958982 ns

  23. UUID v4
      - Random
      - 73b6e3a5-c97d-466a-b5c4-867ae52e2954

  24. UUID v4
      - Random
      - Fast (not as fast as v1 due to needing crypto randomness)
      - No metadata
      - Great for leaking
  25. How Fast? (c/bench (u/v4))

          Evaluation count : 34240680 in 60 samples of 570678 calls.
          Execution time mean : 1.747246 µs
          Execution time std-deviation : 34.573282 ns
          Execution time lower quantile : 1.714618 µs ( 2.5%)
          Execution time upper quantile : 1.814491 µs (97.5%)
          Overhead used : 8.958982 ns

  26. UUID v3+5
      - Namespaced
      - 3bdca4f7-fc85-3a8b-9038-7626457527b0
      - 9989a7d2-b7fc-5b6a-84d6-556b0531a065

  27. UUID v3+5
      - Namespaced
      - v3 uses MD5
      - v5 uses SHA1
      - Allows for reproducible names
  28. UUID v3+5 Example
      - Start with base UUID 6ba7b811-9dad-11d1-80b4-00c04fd430c
      - Add a namespace: www.google.com
      - c74a196f-f19d-5ea9-bffd-a2742432fc9
      - Can cascade in as many namespaces as you want
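The same idea can be sketched with Python's standard `uuid` module, which ships the RFC 4122 URL namespace as `uuid.NAMESPACE_URL`. This is an illustration of namespacing and cascading, not a reproduction of the Clojure calls behind the slide, so the printed values are not guaranteed to match the ones shown above. The name "search" is an arbitrary example.

```python
import uuid

base = uuid.NAMESPACE_URL  # predefined URL namespace from RFC 4122

# v5 hashes the namespace plus the name with SHA-1: same inputs, same UUID.
google = uuid.uuid5(base, "www.google.com")
again = uuid.uuid5(base, "www.google.com")

# v3 is the same scheme using MD5.
legacy = uuid.uuid3(base, "www.google.com")

# Cascading: the result is itself a UUID, so it can serve as the next namespace.
nested = uuid.uuid5(google, "search")

print(base)
print(google, google == again)
print(legacy)
print(nested)
```

Because the output depends only on the inputs, two machines that agree on the namespace and name arrive at the same UUID without talking to each other.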
  29. How Fast? (c/bench (u/v5 u/+namespace-url+ "www.google.com"))

          Evaluation count : 7440180 in 60 samples of 124003 calls.
          Execution time mean : 8.205325 µs
          Execution time std-deviation : 315.094705 ns
          Execution time lower quantile : 7.905576 µs ( 2.5%)
          Execution time upper quantile : 8.892249 µs (97.5%)
          Overhead used : 8.958982 ns

  30. Where are UUIDs helpful? Everywhere :)

  31. Service Discovery
      - Great example of computer-consumed names
      - IP or host won’t work if multiple contacts on same machine
      - No coordination required
  32. Timeseries
      - v1 UUIDs have time built right in
      - Cassandra makes heavy use of this
      - Easy to query for a specific time series
      - No coordination required :)
  33. Isotopes
      - Trace every request through your system
      - No coordination required
      - Great for creating user-facing views of log data
      - Can be used to identify bottlenecks
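A toy version of the isotope idea: tag each incoming request with a random UUID at the edge, pass it along to every service that touches the request, and later group the log by that tag. Everything here is hypothetical scaffolding for the example (the `record` helper, the three "services", and the in-memory `log` list standing in for a real log pipeline).

```python
import uuid
from collections import defaultdict

log = []  # stand-in for a real log pipeline


def record(trace_id, service, message):
    log.append({"trace_id": trace_id, "service": service, "message": message})


def authenticate(trace_id):
    record(trace_id, "auth", "token accepted")


def fetch_data(trace_id, path):
    record(trace_id, "storage", f"loaded {path}")


def handle_request(path):
    # The isotope: minted once at the edge, with no coordination needed.
    trace_id = str(uuid.uuid4())
    record(trace_id, "gateway", f"received {path}")
    authenticate(trace_id)
    fetch_data(trace_id, path)
    return trace_id


first = handle_request("/orders/1")
second = handle_request("/orders/2")

# Rebuild the path each request took by grouping log entries on the trace ID.
by_trace = defaultdict(list)
for entry in log:
    by_trace[entry["trace_id"]].append(entry["service"])

print(by_trace[first])
```

Adding timestamps to each entry would turn the same grouping into a per-request timing breakdown, which is where the bottleneck hunting comes in.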
  34. Detective Work
      - Creator of the Melissa Virus was caught due to a GUID (aka UUID)
      - Leaked their MAC Address
  35. License Keys*
      - v3 or v5 can be used to create one-way hashes based on a unique cascade of data
      - No coordination required
      - Secret -> email -> first name = Verifiable Key
      - *Use at own risk
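The Secret -> email -> first name cascade might look something like this in Python. It is only a sketch of the idea on the slide: the function names, the sample secret, and the choice of `uuid.NAMESPACE_DNS` as the starting namespace are all assumptions made for the example, and the slide's own "use at own risk" caveat applies in full.

```python
import uuid


def license_key(secret, email, first_name):
    """Derive a key by cascading v5 UUIDs: secret -> email -> first name."""
    # Starting namespace is an arbitrary choice for this sketch.
    key = uuid.uuid5(uuid.NAMESPACE_DNS, secret)
    key = uuid.uuid5(key, email)
    key = uuid.uuid5(key, first_name)
    return key


def verify(secret, email, first_name, presented):
    """Recompute the cascade and compare it with the key the user presented."""
    return license_key(secret, email, first_name) == uuid.UUID(presented)


issued = str(license_key("hypothetical-server-secret", "bob@example.com", "Bob"))

print(issued)
print(verify("hypothetical-server-secret", "bob@example.com", "Bob", issued))
print(verify("hypothetical-server-secret", "bob@example.com", "Alice", issued))
```

The issuer never has to store issued keys, since any key can be re-derived from the secret and the customer's details.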
  36. SQUUIDs
      - A Rich Hickey invention for www.datomic.com
      - Random sortable UUIDs
      - Hybrid of v1 and v4
      - Time can still be retrieved
      - 01f0cd75-1b19-4069-921b-96fd64d79bf2
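One way such a hybrid could be built is to take a random v4 UUID and overwrite its leading 32 bits with the current Unix time in seconds. That layout is an assumption made for this sketch; the libraries mentioned in the deck may pack the time differently, so this code should not be expected to decode the example above. The names `squuid` and `squuid_seconds` are illustrative.

```python
import time
import uuid


def squuid(now=None):
    """Sketch of a time-prefixed random UUID (assumed layout, see lead-in)."""
    seconds = int(time.time() if now is None else now) & 0xFFFFFFFF
    random_part = uuid.uuid4().int
    # Keep the low 96 bits (which include the version and variant bits)
    # and replace the top 32 bits with the timestamp.
    low_96 = random_part & ((1 << 96) - 1)
    return uuid.UUID(int=(seconds << 96) | low_96)


def squuid_seconds(u):
    """Recover the Unix timestamp stored in the top 32 bits."""
    return u.int >> 96


earlier = squuid(now=1_000_000_000)
later = squuid(now=1_433_635_200)

print(earlier, squuid_seconds(earlier))
print(later, squuid_seconds(later))
print(sorted([later, earlier]) == [earlier, later])
```

Putting the time in the most significant bits is what makes plain sorting line up with creation order, at one-second granularity in this sketch.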
  37. How Fast? (c/bench (u/squuid))

          Evaluation count : 31237800 in 60 samples of 520630 calls.
          Execution time mean : 1.990820 µs
          Execution time std-deviation : 105.333199 ns
          Execution time lower quantile : 1.913346 µs ( 2.5%)
          Execution time upper quantile : 2.111869 µs (97.5%)
          Overhead used : 8.958982 ns

  38. Libraries
      - Clojure: https://github.com/danlentz/clj-uuid
      - Java: https://github.com/cowtowncoder/java-uuid-generator
      - Node: https://github.com/broofa/node-uuid
      - Ruby: https://rubygems.org/gems/uuid/versions/2.3.7

  39. Bit Layout

           0                   1                   2                   3
           0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
          +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
          |                           time_low                            |
          +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
          |           time_mid            |      time_hi_and_version      |
          +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
          |clk_seq_hi_res |  clk_seq_low  |          node (0-1)           |
          +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
          |                          node (2-5)                           |
          +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
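The fields in the layout map directly onto the `fields` tuple exposed by Python's standard `uuid` module, which makes for a quick way to poke at them. The sketch below splits the v1 example from earlier in the deck into its six fields and then reassembles the 128-bit value; the variable names follow the diagram.

```python
import uuid

u = uuid.UUID("9c031d16-0ca3-11e5-a6c0-1697f925ec7b")  # v1 example from the deck

# Same order as the diagram: 32, 16, 16, 8, 8, and 48 bits.
time_low, time_mid, time_hi_and_version, clk_seq_hi_res, clk_seq_low, node = u.fields

version = time_hi_and_version >> 12  # top 4 bits of time_hi_and_version
variant_bits = clk_seq_hi_res >> 6   # top 2 bits of clk_seq_hi_res

# Shift each field back into place to rebuild the full 128-bit integer.
rebuilt = (
    (time_low << 96)
    | (time_mid << 80)
    | (time_hi_and_version << 64)
    | (clk_seq_hi_res << 56)
    | (clk_seq_low << 48)
    | node
)

print(f"{time_low:08x} {time_mid:04x} {time_hi_and_version:04x} "
      f"{clk_seq_hi_res:02x} {clk_seq_low:02x} {node:012x}")
print(version, bin(variant_bits), rebuilt == u.int)
```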