Going The Distance

Levenshtein Distance and the beauty of algorithms

Richard Schneeman

September 22, 2014

Transcript

  1. Going
    The
    Distance
    @schneems

    View Slide

  2. They
    Call me
    @Schneems

    View Slide

  3. Ruby
    Schneems

    View Slide

  4. Ruby
    Python

    View Slide

  5. View Slide

  6. Code
    Triage
    .com

    View Slide

  7. Challenge:
    Comment on
    an issue

    View Slide

  8. Docs
    Doctor
    .org

    View Slide

  9. View Slide

  10. Top 50
    Rails
    Contrib

    View Slide

  11. Cats

    View Slide

  12. View Slide

  13. Can you keep
    Ruby weird?

    View Slide

  14. CanCan

    View Slide

  15. Everyone CanCan

    View Slide

  16. Thank
    You!

    View Slide

  17. Algorithms:

    View Slide

  18. I went to
    Georgia Tech
    for…

    View Slide

  19. Mechanical
    Engineering

    View Slide

  20. Self taught
    Programmer
    ~8 years

    View Slide

  21. CS is boring
    to me

    View Slide

  22. Programming
    is interesting

    View Slide

  23. Building
    programs that
    accomplish
    tasks

    View Slide

  24. But…

    View Slide

  25. Those CS
    students are
    on to
    something

    View Slide

  26. Algorithms

    View Slide

  27. Are

    View Slide

  28. Beautiful

    View Slide

  29. <3

    View Slide

  30. Algorithms
    solve
    problems

    View Slide

  31. Introducing
    A
    Problem…

    View Slide

  32. Spelling

    View Slide

  33. When you are
    tired or
    distracted,
    spelling
    becomes harder

    View Slide

  34. $ git commmit -m first
    WARNING: You called a Git command named 'commmit',
    which does not exist.
    Continuing under the assumption that you meant 'commit'
    in 0.1 seconds automatically...

    View Slide

  35. How does Git
    know?

    View Slide

  36. Introducing:

    Edit
    Distance

    View Slide

  37. The “cost”
    to change one
    word to
    another

    View Slide

  38. Lower “cost”
    means a
    more likely
    match

    View Slide

  39. Cost of?
    zat => bat

    View Slide

  40. Cost of?
    zat => bat
    1

    View Slide

  41. Cost of?
    zzat => bat

    View Slide

  42. Cost of?
    zzat => bat
    2

    View Slide

  43. How do we
    code it
    though?

    View Slide

  44. My First
    Attempt

    View Slide

  45. zat => bat
    def distance(str1, str2)
      cost = 0
      str1.each_char.with_index do |char, index|
        cost += 1 if str2[index] != char
      end
      cost
    end

    View Slide

  46. zat => bat
    def distance(str1, str2)
      cost = 0
      str1.each_char.with_index do |char, index|
        cost += 1 if str2[index] != char
      end
      cost
    end
    Cost = 1

    View Slide

  47. Perfect?

    View Slide

  48. saturday
    sunday
    Cost = ?

    View Slide

  49. saturday
    sunday
    Cost = 7

    View Slide
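
The first attempt can be run as-is to see where the 7 comes from. This is a minimal sketch that reuses the method from the slides; the name `dirty_distance` is only a label chosen here to keep it apart from the later versions. Every position past the first character differs, and positions beyond the end of "sunday" compare against `nil`, so they count as mismatches too.

```ruby
# Position-by-position comparison, as in the first attempt.
# The method name is an illustrative label, not from the original code.
def dirty_distance(str1, str2)
  cost = 0
  str1.each_char.with_index do |char, index|
    # str2[index] is nil once index runs past the end of str2,
    # which also counts as a mismatch.
    cost += 1 if str2[index] != char
  end
  cost
end

puts dirty_distance("zat", "bat")          # 1
puts dirty_distance("saturday", "sunday")  # 7
```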

  50. Wat?

    View Slide

  51. Turns out I
    almost
    recreated

    View Slide

  52. Hamming
    Distance

    View Slide

  53. Hamming
    AKA: Signal
    Distance

    View Slide

  54. Measures: the
    errors
    introduced in a
    string
    Hamming

    View Slide

  55. Only valid for
    strings of
    same length
    Hamming

    View Slide

  56. Good for: Detecting
    and correcting
    errors in binary and
    telecommunications
    Hamming

    View Slide

  57. Bad for:
    misspelled words
    Hamming

    View Slide

  58. Does not
    include:
    Insertion
    Deletion
    Hamming

    View Slide
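
A strict Hamming distance makes the "same length only" restriction explicit. The sketch below is one possible way to write it in Ruby; the method name and the choice to raise `ArgumentError` are assumptions made for illustration.

```ruby
# Hamming distance: number of positions at which two equal-length
# strings differ. Raising on unequal lengths is an illustrative choice.
def hamming_distance(str1, str2)
  unless str1.length == str2.length
    raise ArgumentError, "strings must be the same length"
  end

  # Pair up characters by position and count the pairs that differ.
  str1.chars.zip(str2.chars).count { |a, b| a != b }
end

puts hamming_distance("zat", "bat")          # 1
puts hamming_distance("karolin", "kathrin")  # 3
```

Calling it with "saturday" and "sunday" raises, which is the point: insertions and deletions change the length, and Hamming distance has no way to express them.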

  59. Introducing:

    An
    Algorithm!

    View Slide

  60. Introducing:

    Levenshtein
    Distance

    View Slide

  61. How do we
    calculate
    deletion?

    View Slide

  62. distance("schneems", "zschneems")

    View Slide

  63. distance("schneems", "zschneems")
    Match!
    deletion

    View Slide

  64. str1 = "schneems"
    str2 = "zschneems"
    str1 == str2[1..-1]
    deletion

    View Slide

  65. How do we
    calculate
    insertion?

    View Slide

  66. distance("schneems", "chneems")

    View Slide

  67. distance("schneems", "chneems")
    Match!
    insertion

    View Slide

  68. str1 = "schneems"
    str2 = "chneems"
    str1[1..-1] == str2
    insertion

    View Slide

  69. substitution?

    View Slide

  70. distance("zchneems", "schneems")
    Substitution

    View Slide

  71. str1 = "zchneems"
    str2 = "schneems"
    str1[1..-1] == str2[1..-1]
    Substitution

    View Slide

  72. How do we
    calculate
    Distance?

    View Slide

  73. Pretend we
    have a
    distance()
    method

    View Slide

  74. Strings of
    different
    lengths

    View Slide

  75. distance("", "foo") # => 3
    distance("foo", "") # => 3

    View Slide

  76. return str2.length if str1.empty?
    return str1.length if str2.empty?
    Different
    Lengths

    View Slide

  77. Calculate
    distance
    between every
    substring

    View Slide

  78. l1 = distance(str1, str2[1..-1]) # deletion
    l2 = distance(str1[1..-1], str2) # insertion
    l3 = distance(str1[1..-1], str2[1..-1]) # substitution
    Calculate
    costs

    View Slide

  79. l1 = distance(str1, str2[1..-1]) # deletion
    l2 = distance(str1[1..-1], str2) # insertion
    l3 = distance(str1[1..-1], str2[1..-1]) # substitution
    cost = 1 + [l1,l2,l3].min
    Take
    minimum

    View Slide

  80. l1 = distance(str1, str2[1..-1]) # deletion
    l2 = distance(str1[1..-1], str2) # insertion
    l3 = distance(str1[1..-1], str2[1..-1]) # substitution
    cost = 1 + [l1,l2,l3].min
    Why did we
    add one?

    View Slide

  81. Pick the
    cheapest
    operation,
    then add one

    View Slide
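
To see "pick the cheapest operation, then add one" with real numbers, the three sub-costs can be printed for a small pair. This sketch inlines the complete recursive method that the deck assembles a few slides later, so that it runs on its own; the variable names follow the slides.

```ruby
# Recursive distance, written out in full so the example is runnable.
def distance(str1, str2)
  return str2.length if str1.empty?
  return str1.length if str2.empty?
  return distance(str1[1..-1], str2[1..-1]) if str1[0] == str2[0]

  l1 = distance(str1, str2[1..-1])          # deletion
  l2 = distance(str1[1..-1], str2)          # insertion
  l3 = distance(str1[1..-1], str2[1..-1])   # substitution
  1 + [l1, l2, l3].min
end

str1 = "zat"
str2 = "bat"
l1 = distance(str1, str2[1..-1])          # "zat" vs "at"
l2 = distance(str1[1..-1], str2)          # "at"  vs "bat"
l3 = distance(str1[1..-1], str2[1..-1])   # "at"  vs "at"
cost = 1 + [l1, l2, l3].min

puts [l1, l2, l3].inspect  # [1, 1, 0]
puts cost                  # 1
```

Substitution is cheapest here because the remainders "at" and "at" already match, so the only cost is the one added for swapping the first character.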

  82. What about
    when
    characters
    match?

    View Slide

  83. distance(str1[1..-1], str2[1..-1]) if str1[0] == str2[0]
    Match
    str1 = "saturday"
    str2 = "sunday"

    View Slide

  84. Our powers
    combined, we
    form!

    View Slide

  85. View Slide

  86. View Slide

  87. Recursive
    Levenshtein
    Distance!

    View Slide

  88. Recursive
    def distance(str1, str2)
      # Different lengths
      return str2.length if str1.empty?
      return str1.length if str2.empty?
      return distance(str1[1..-1], str2[1..-1]) if str1[0] == str2[0] # match
      l1 = distance(str1, str2[1..-1]) # deletion
      l2 = distance(str1[1..-1], str2) # insertion
      l3 = distance(str1[1..-1], str2[1..-1]) # substitution
      return 1 + [l1,l2,l3].min # increment cost
    end

    View Slide

  89. distance("saturday", "sunday")
    # => 3
    Recursive
    Levenshtein

    View Slide

  90. Much better

    View Slide

  91. What does
    that look
    like?

    View Slide

  92. View Slide

  93. github.com/
    schneems/
    going_the_distance

    View Slide

  94. Hmm…

    View Slide

  95. “Dirty
    Distance”
    took 8
    comparisons

    View Slide

  96. “Recursive”
    took 1647
    comparisons

    View Slide

  97. No trophy, no
    flowers, no
    flashbulbs,
    no wine,

    View Slide

  98. Ouch

    View Slide

  99. Can we do
    better?

    View Slide

  100. If you watch the
    recursive
    algorithm
    closely, you
    notice repeats

    View Slide
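
One way to make the repeats visible is to record every pair of substrings the recursion is asked about. The sketch below is an instrumented copy of the recursive method; the `seen` hash and the method name are additions for illustration. It counts method calls, which is not necessarily the same thing the deck counts as "comparisons", so it does not try to reproduce the 1647 figure.

```ruby
# Recursive distance that tallies how often each (str1, str2) pair is visited.
# `seen` is a Hash with a default of 0, keyed by the pair of substrings.
def counted_distance(str1, str2, seen)
  seen[[str1, str2]] += 1
  return str2.length if str1.empty?
  return str1.length if str2.empty?
  if str1[0] == str2[0]
    return counted_distance(str1[1..-1], str2[1..-1], seen)
  end

  l1 = counted_distance(str1, str2[1..-1], seen)          # deletion
  l2 = counted_distance(str1[1..-1], str2, seen)          # insertion
  l3 = counted_distance(str1[1..-1], str2[1..-1], seen)   # substitution
  1 + [l1, l2, l3].min
end

seen = Hash.new(0)
result = counted_distance("saturday", "sunday", seen)
total_calls = seen.values.sum

puts "distance: #{result}"
puts "total calls: #{total_calls}"
puts "distinct substring pairs: #{seen.size}"
```

Only suffixes of the two words are ever compared, so there can be at most 9 x 7 = 63 distinct pairs. Any calls beyond that number are repeated work.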

  101. Maybe we can
    store substring
    distances and
    use them to calculate
    the total distance

    View Slide
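
Before moving to the matrix, note that the same idea can be tried directly on the recursive version by caching results. This is top-down memoization, a different technique from the matrix the deck builds next; the method name and the `cache` parameter are assumptions made for this sketch.

```ruby
# Recursive Levenshtein distance with a cache of already-computed pairs.
# `cache` maps [str1, str2] to the distance between them.
def memo_distance(str1, str2, cache = {})
  key = [str1, str2]
  return cache[key] if cache.key?(key)

  cache[key] =
    if str1.empty?
      str2.length
    elsif str2.empty?
      str1.length
    elsif str1[0] == str2[0]
      memo_distance(str1[1..-1], str2[1..-1], cache)                # match
    else
      1 + [
        memo_distance(str1, str2[1..-1], cache),                    # deletion
        memo_distance(str1[1..-1], str2, cache),                    # insertion
        memo_distance(str1[1..-1], str2[1..-1], cache)              # substitution
      ].min
    end
end

puts memo_distance("saturday", "sunday")  # 3
```

Each distinct pair of substrings is now computed once, which is the same saving the matrix achieves by filling in one cell per pair.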

  102. I want you to
    join the
    club

    View Slide

  103. A
    members only
    club

    View Slide

  104. Matrix:

    Levenshtein
    Distance

    View Slide

  105. Matrix:

    Levenshtein
    Distance

    View Slide

  106. “” => “saturday”
    Cost?

    View Slide

  107. +---+---+
    | | S |
    +---+---+
    | | 1 |
    +---+---+
    Matrix

    View Slide

  108. +---+---+---+
    | | S | A |
    +---+---+---+
    | | 1 | 2 |
    +---+---+---+
    Matrix

    View Slide

  109. +---+---+---+---+
    | | S | A | T |
    +---+---+---+---+
    | | 1 | 2 | 3 |
    +---+---+---+---+
    Matrix

    View Slide

  110. +---+---+---+---+---+
    | | S | A | T | U |
    +---+---+---+---+---+
    | | 1 | 2 | 3 | 4 |
    +---+---+---+---+---+
    Matrix

    View Slide

  111. +---+---+---+---+---+---+
    | | S | A | T | U | R |
    +---+---+---+---+---+---+
    | | 1 | 2 | 3 | 4 | 5 |
    +---+---+---+---+---+---+
    Matrix

    View Slide

  112. +---+---+---+---+---+---+---+
    | | S | A | T | U | R | D |
    +---+---+---+---+---+---+---+
    | | 1 | 2 | 3 | 4 | 5 | 6 |
    +---+---+---+---+---+---+---+
    Matrix

    View Slide

  113. +---+---+---+---+---+---+---+---+
    | | S | A | T | U | R | D | A |
    +---+---+---+---+---+---+---+---+
    | | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
    +---+---+---+---+---+---+---+---+
    Matrix

    View Slide

  114. +---+---+---+---+---+---+---+---+---+
    | | S | A | T | U | R | D | A | Y |
    +---+---+---+---+---+---+---+---+---+
    | | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
    +---+---+---+---+---+---+---+---+---+
    Matrix

    View Slide

  115. “sunday” => “”
    Cost?

    View Slide

  116. +---+---+
    | | |
    +---+---+
    | | 0 |
    +---+---+
    | S | 1 |
    +---+---+
    Matrix

    View Slide

  117. +---+---+
    | | |
    +---+---+
    | | 0 |
    +---+---+
    | S | 1 |
    +---+---+
    | U | 2 |
    +---+---+
    Matrix

    View Slide

  118. +---+---+
    | | |
    +---+---+
    | | 0 |
    +---+---+
    | S | 1 |
    +---+---+
    | U | 2 |
    +---+---+
    | N | 3 |
    +---+---+
    Matrix

    View Slide

  119. +---+---+
    | | |
    +---+---+
    | | 0 |
    +---+---+
    | S | 1 |
    +---+---+
    | U | 2 |
    +---+---+
    | N | 3 |
    +---+---+
    | D | 4 |
    +---+---+
    Matrix

    View Slide

  120. +---+---+
    | | |
    +---+---+
    | | 0 |
    +---+---+
    | S | 1 |
    +---+---+
    | U | 2 |
    +---+---+
    | N | 3 |
    +---+---+
    | D | 4 |
    +---+---+
    | A | 5 |
    +---+---+
    Matrix

    View Slide

  121. +---+---+
    | | |
    +---+---+
    | | 0 |
    +---+---+
    | S | 1 |
    +---+---+
    | U | 2 |
    +---+---+
    | N | 3 |
    +---+---+
    | D | 4 |
    +---+---+
    | A | 5 |
    +---+---+
    | Y | 6 |
    +---+---+
    Matrix

    View Slide

  122. Now, fill in
    the matrix

    View Slide

  123. +---+---+---+---+---+---+---+---+---+---+
    |   |   | S | A | T | U | R | D | A | Y |
    +---+---+---+---+---+---+---+---+---+---+
    |   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
    +---+---+---+---+---+---+---+---+---+---+
    | S | 1 |
    +---+---+
    | U | 2 |
    +---+---+
    | N | 3 |
    +---+---+
    | D | 4 |
    +---+---+
    | A | 5 |
    +---+---+
    | Y | 6 |
    +---+---+
    Matrix

    View Slide
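
The starting matrix above can be built in a few lines. This is a sketch under the assumption that rows correspond to `str2` ("sunday") and columns to `str1` ("saturday"), matching the picture; `build_matrix` is a hypothetical helper name, and unfilled cells are left as `nil`.

```ruby
# Build the (str2.length + 1) x (str1.length + 1) grid and fill in only
# the first row and first column: the cost of converting to or from "".
def build_matrix(str1, str2)
  matrix = Array.new(str2.length + 1) { Array.new(str1.length + 1) }
  (0..str1.length).each { |column| matrix[0][column] = column }
  (0..str2.length).each { |row| matrix[row][0] = row }
  matrix
end

matrix = build_matrix("saturday", "sunday")
# Print each row, showing "." for cells that are not filled in yet.
matrix.each do |row|
  puts row.map { |cell| cell.nil? ? "." : cell }.join(" ")
end
```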

  124. Break it down

    View Slide

  125. How much
    does it cost to
    change “s”
    into “s”?

    View Slide

  126. +---+---+---+
    | | | S |
    +---+---+---+
    | | 0 | 1 |
    +---+---+---+
    | S | 1 | 0 |
    +---+---+---+
    Match!
    Cost = 0

    View Slide

  127. +---+---+---+---+
    | | | S | A |
    +---+---+---+---+
    | | 0 | 1 | 2 |
    +---+---+---+---+
    | S | 1 | 0 | |
    +---+---+---+---+
    Insertion!
    Cost = ?

    View Slide

  128. +---+---+---+---+
    | | | S | A |
    +---+---+---+---+
    | | 0 | 1 | 2 |
    +---+---+---+---+
    | S | 1 | 0 | 1 |
    +---+---+---+---+
    Insertion!
    Cost = 1

    View Slide

  129. How do we
    calculate insertion
    programmatically?

    View Slide

  130. str1 = “schneems”
    str2 = “chneems”
    str1[1..-1] == str2
    Insertion!

    View Slide

  131. +---+---+---+---+
    | | | S | A |
    +---+---+---+---+
    | | 0 | 1 | 2 |
    +---+---+---+---+
    | S | 1 | 0 | |
    +---+---+---+---+
    str1 = “schneems”
    str2 = “chneems”
    str1[1..-1] == str2
    matrix[row_index][column_index - 1]
    Insertion!

    View Slide

  132. +---+---+---+---+
    | | | S | A |
    +---+---+---+---+
    | | 0 | 1 | 2 |
    +---+---+---+---+
    | S | 1 | 0 | |
    +---+---+---+---+
    Insertion!
    str1 = “schneems”
    str2 = “chneems”
    str1[1..-1] == str2
    matrix[row_index][column_index - 1]

    View Slide

  133. +---+---+---+---+
    | | | S | A |
    +---+---+---+---+
    | | 0 | 1 | 2 |
    +---+---+---+---+
    | S | 1 | 0 | |
    +---+---+---+---+
    Insertion!
    str1 = “schneems”
    str2 = “chneems”
    str1[1..-1] == str2
    matrix[row_index][column_index - 1]

    View Slide

  134. +---+---+---+---+
    | | | S | A |
    +---+---+---+---+
    | | 0 | 1 | 2 |
    +---+---+---+---+
    | S | 1 | 0 | |
    +---+---+---+---+
    Insertion!
    str1 = “schneems”
    str2 = “chneems”
    str1[1..-1] == str2
    matrix[row_index][column_index - 1]

    View Slide

  135. + cost of
    change
    (+1)

    View Slide

  136. +---+---+---+---+
    | | | S | A |
    +---+---+---+---+
    | | 0 | 1 | 2 |
    +---+---+---+---+
    | S | 1 | 0 | |
    +---+---+---+---+
    Insertion!
    str1 = “schneems”
    str2 = “chneems”
    str1[1..-1] == str2
    matrix[row_index][column_index - 1] + 1

    View Slide

  137. +---+---+---+---+
    | | | S | A |
    +---+---+---+---+
    | | 0 | 1 | 2 |
    +---+---+---+---+
    | S | 1 | 0 | 1 |
    +---+---+---+---+
    Insertion!
    Cost = 1
    str1 = “schneems”
    str2 = “chneems”
    str1[1..-1] == str2
    matrix[row_index][column_index - 1] + 1

    View Slide

  138. Keep Going

    View Slide

  139. +---+---+---+---+---+---+---+---+---+---+
    |   |   | S | A | T | U | R | D | A | Y |
    +---+---+---+---+---+---+---+---+---+---+
    |   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
    +---+---+---+---+---+---+---+---+---+---+
    | S | 1 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
    +---+---+---+---+---+---+---+---+---+---+
    Insertion(s)

    View Slide

  140. Next Char
    “su” => “s”

    View Slide

  141. +---+---+---+
    | | | S |
    +---+---+---+
    | | 0 | 1 |
    +---+---+---+
    | S | 1 | 0 |
    +---+---+---+
    | U | 2 | |
    +---+---+---+
    Action?

    View Slide

  142. Changing “su” to
    “s” is a deletion.
    How do we
    calculate it?

    View Slide

  143. Deletion

    View Slide

  144. str1 = “schneems”
    str2 = “zschneems”
    str1 == str2[1..-1]
    deletion

    View Slide

  145. +---+---+---+
    | | | S |
    +---+---+---+
    | | 0 | 1 |
    +---+---+---+
    | S | 1 | 0 |
    +---+---+---+
    | U | 2 | |
    +---+---+---+
    Deletion
    str1 = “schneems”
    str2 = “zschneems”
    str1 == str2[1..-1]
    matrix[row_index - 1][column_index]

    View Slide

  146. +---+---+---+
    | | | S |
    +---+---+---+
    | | 0 | 1 |
    +---+---+---+
    | S | 1 | 0 |
    +---+---+---+
    | U | 2 | |
    +---+---+---+
    Deletion
    str1 = “schneems”
    str2 = “zschneems”
    str1 == str2[1..-1]
    matrix[row_index - 1][column_index]

    View Slide

  147. +---+---+---+
    | | | S |
    +---+---+---+
    | | 0 | 1 |
    +---+---+---+
    | S | 1 | 0 |
    +---+---+---+
    | U | 2 | |
    +---+---+---+
    Deletion
    str1 = “schneems”
    str2 = “zschneems”
    str1 == str2[1..-1]
    matrix[row_index - 1][column_index]

    View Slide

  148. +---+---+---+
    | | | S |
    +---+---+---+
    | | 0 | 1 |
    +---+---+---+
    | S | 1 | 0 |
    +---+---+---+
    | U | 2 | |
    +---+---+---+
    Deletion
    str1 = “schneems”
    str2 = “zschneems”
    str1 == str2[1..-1]
    matrix[row_index - 1][column_index] + 1

    View Slide

  149. +---+---+---+
    | | | S |
    +---+---+---+
    | | 0 | 1 |
    +---+---+---+
    | S | 1 | 0 |
    +---+---+---+
    | U | 2 | 1 |
    +---+---+---+
    Deletion
    Cost = 1
    str1 = “schneems”
    str2 = “zschneems”
    str1 == str2[1..-1]
    matrix[row_index - 1][column_index] + 1

    View Slide

  150. - Insertion
    - Deletion
    - Substitution

    View Slide

  151. - Insertion
    - Deletion
    - Substitution

    View Slide

  152. Substitution

    View Slide

  153. str1 = “zchneems”
    str2 = “schneems”
    str1[1..-1] == str2[1..-1]
    Substitution!

    View Slide

  154. +---+---+---+---+
    | | | S | A |
    +---+---+---+---+
    | | 0 | 1 | 2 |
    +---+---+---+---+
    | S | 1 | 0 | 1 |
    +---+---+---+---+
    | U | 2 | 1 | |
    +---+---+---+---+
    Substitution
    str1 = “zchneems”
    str2 = “schneems”
    str1[1..-1] == str2[1..-1]
    matrix[row_index - 1][column_index- 1]

    View Slide

  155. +---+---+---+---+
    | | | S | A |
    +---+---+---+---+
    | | 0 | 1 | 2 |
    +---+---+---+---+
    | S | 1 | 0 | 1 |
    +---+---+---+---+
    | U | 2 | 1 | |
    +---+---+---+---+
    Substitution
    str1 = “zchneems”
    str2 = “schneems”
    str1[1..-1] == str2[1..-1]
    matrix[row_index - 1][column_index- 1]

    View Slide

  156. str1 = “zchneems”
    str2 = “schneems”
    str1[1..-1] == str2[1..-1]
    matrix[row_index - 1][column_index- 1]
    +---+---+---+---+
    | | | S | A |
    +---+---+---+---+
    | | 0 | 1 | 2 |
    +---+---+---+---+
    | S | 1 | 0 | 1 |
    +---+---+---+---+
    | U | 2 | 1 | |
    +---+---+---+---+
    Substitution

    View Slide

  157. str1 = “zchneems”
    str2 = “schneems”
    str1[1..-1] == str2[1..-1]
    matrix[row_index - 1][column_index- 1]
    +---+---+---+---+
    | | | S | A |
    +---+---+---+---+
    | | 0 | 1 | 2 |
    +---+---+---+---+
    | S | 1 | 0 | 1 |
    +---+---+---+---+
    | U | 2 | 1 | |
    +---+---+---+---+
    Substitution

    View Slide

  158. str1 = “zchneems”
    str2 = “schneems”
    str1[1..-1] == str2[1..-1]
    matrix[row_index - 1][column_index- 1]
    +---+---+---+---+
    | | | S | A |
    +---+---+---+---+
    | | 0 | 1 | 2 |
    +---+---+---+---+
    | S | 1 | 0 | 1 |
    +---+---+---+---+
    | U | 2 | 1 | |
    +---+---+---+---+
    Substitution

    View Slide

  159. +---+---+---+---+
    | | | S | A |
    +---+---+---+---+
    | | 0 | 1 | 2 |
    +---+---+---+---+
    | S | 1 | 0 | 1 |
    +---+---+---+---+
    | U | 2 | 1 | 1 |
    +---+---+---+---+
    Substitution
    Cost = 1
    str1 = “zchneems”
    str2 = “schneems”
    str1[1..-1] == str2[1..-1]
    matrix[row_index - 1][column_index - 1] + 1

    View Slide

  160. - Insertion
    - Deletion
    - Substitution

    View Slide

  161. What about
    match?

    View Slide

  162. +---+---+---+
    | | | S |
    +---+---+---+
    | | 0 | 1 |
    +---+---+---+
    | S | 1 | 0 |
    +---+---+---+
    Match!
    str1 = “schneems”
    str2 = “schneems”
    str1[1..-1] == str2[1..-1]
    matrix[row_index - 1][column_index- 1]

    View Slide

  163. Why?

    View Slide

  164. If the current
    characters match,
    the cost equals the
    cost of changing the
    previous substring
    into the previous
    substring

    View Slide

  165. i.e. changing
    “” to “”
    +---+---+---+
    | | | S |
    +---+---+---+
    | | 0 | 1 |
    +---+---+---+
    | S | 1 | 0 |
    +---+---+---+

    View Slide

  166. Now What?

    View Slide

  167. Algorithm
    str2.each_char.each_with_index do |char1, i|
      str1.each_char.each_with_index do |char2, j|
        if char1 == char2
          puts [:skip, matrix[i][j]].inspect
          matrix[i + 1][j + 1] = matrix[i][j]
        else
          actions = {
            deletion: matrix[i][j + 1] + 1,
            insert: matrix[i + 1][j] + 1,
            substitution: matrix[i][j] + 1
          }
          action = actions.sort {|(k,v), (k2, v2)| v <=> v2 }.first
          puts action.inspect
          matrix[i + 1][j + 1] = action.last
        end
        each_step.call(matrix) if each_step
      end
    end

    View Slide
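
The loop above assumes `matrix` already exists and prints as it goes. A self-contained version, with the setup included and the logging removed, might look like the sketch below. The method name `matrix_distance` is an assumption; the cell rules are the ones from the slides.

```ruby
# Matrix (bottom-up) Levenshtein distance.
# Rows follow str2, columns follow str1, as in the slides.
def matrix_distance(str1, str2)
  matrix = Array.new(str2.length + 1) { Array.new(str1.length + 1, 0) }
  (0..str1.length).each { |j| matrix[0][j] = j }
  (0..str2.length).each { |i| matrix[i][0] = i }

  str2.each_char.with_index do |char2, i|
    str1.each_char.with_index do |char1, j|
      matrix[i + 1][j + 1] =
        if char1 == char2
          matrix[i][j]                   # match: no extra cost
        else
          1 + [
            matrix[i][j + 1],            # deletion
            matrix[i + 1][j],            # insertion
            matrix[i][j]                 # substitution
          ].min
        end
    end
  end

  # The bottom-right cell holds the distance between the full strings.
  matrix.last.last
end

puts matrix_distance("saturday", "sunday")  # 3
```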

  168. Iterate!

    View Slide

  169. View Slide

  170. Final Cost
    +---+---+---+---+---+---+---+---+---+---+
    |   |   | S | A | T | U | R | D | A | Y |
    +---+---+---+---+---+---+---+---+---+---+
    |   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
    +---+---+---+---+---+---+---+---+---+---+
    | S | 1 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
    +---+---+---+---+---+---+---+---+---+---+
    | U | 2 | 1 | 1 | 2 | 2 | 3 | 4 | 5 | 6 |
    +---+---+---+---+---+---+---+---+---+---+
    | N | 3 | 2 | 2 | 2 | 3 | 3 | 4 | 5 | 6 |
    +---+---+---+---+---+---+---+---+---+---+
    | D | 4 | 3 | 3 | 3 | 3 | 4 | 3 | 4 | 5 |
    +---+---+---+---+---+---+---+---+---+---+
    | A | 5 | 4 | 3 | 4 | 4 | 4 | 4 | 3 | 4 |
    +---+---+---+---+---+---+---+---+---+---+
    | Y | 6 | 5 | 4 | 4 | 5 | 5 | 5 | 4 | 3 |
    +---+---+---+---+---+---+---+---+---+---+
    3

    View Slide

  171. We can also
    get cost of
    sub strings

    View Slide

  172. “sun” => “sat”

    View Slide

  173. Final Cost
    +---+---+---+---+---+---+---+---+---+---+
    |   |   | S | A | T | U | R | D | A | Y |
    +---+---+---+---+---+---+---+---+---+---+
    |   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
    +---+---+---+---+---+---+---+---+---+---+
    | S | 1 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
    +---+---+---+---+---+---+---+---+---+---+
    | U | 2 | 1 | 1 | 2 | 2 | 3 | 4 | 5 | 6 |
    +---+---+---+---+---+---+---+---+---+---+
    | N | 3 | 2 | 2 | 2 | 3 | 3 | 4 | 5 | 6 |
    +---+---+---+---+---+---+---+---+---+---+
    | D | 4 | 3 | 3 | 3 | 3 | 4 | 3 | 4 | 5 |
    +---+---+---+---+---+---+---+---+---+---+
    | A | 5 | 4 | 3 | 4 | 4 | 4 | 4 | 3 | 4 |
    +---+---+---+---+---+---+---+---+---+---+
    | Y | 6 | 5 | 4 | 4 | 5 | 5 | 5 | 4 | 3 |
    +---+---+---+---+---+---+---+---+---+---+
    2

    View Slide
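
Reading a substring cost out of the matrix is just an index lookup: the cell at row `n`, column `m` is the distance between the first `n` characters of one word and the first `m` of the other. The sketch below returns the whole matrix so this can be checked; `levenshtein_matrix` is a hypothetical name.

```ruby
# Same matrix algorithm, but returning the filled matrix itself.
def levenshtein_matrix(str1, str2)
  matrix = Array.new(str2.length + 1) { Array.new(str1.length + 1, 0) }
  (0..str1.length).each { |j| matrix[0][j] = j }
  (0..str2.length).each { |i| matrix[i][0] = i }

  str2.each_char.with_index do |char2, i|
    str1.each_char.with_index do |char1, j|
      matrix[i + 1][j + 1] =
        if char1 == char2
          matrix[i][j]
        else
          1 + [matrix[i][j + 1], matrix[i + 1][j], matrix[i][j]].min
        end
    end
  end
  matrix
end

matrix = levenshtein_matrix("saturday", "sunday")
puts matrix["sun".length]["sat".length]  # 2, the cost of "sun" => "sat"
puts matrix.last.last                    # 3, the cost of the full words
```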

  174. Better than
    Recursive?

    View Slide

  175. As they speed
    through the
    finish, the
    flags go down.

    View Slide

  176. 48 iterations

    View Slide

  177. Wayyyyyy
    better than
    1647

    View Slide

  178. bit.ly/
    going_the_distance

    View Slide

  179. My Problem

    View Slide

  180. I am human

    View Slide

  181. I get tired

    View Slide

  182. Machines
    don’t
    understand
    tired

    View Slide

  183. One day I
    tried typing

    View Slide

  184. $ rails generate
    migration

    View Slide

  185. But
    accidentally
    typed

    View Slide

  186. $ rails generate
    migratoon

    View Slide

  187. ERROR

    View Slide

  188. View Slide

  189. Stress is
    increased
    when we fail
    at simple tasks

    View Slide

  190. Why?
    It’s not hard

    View Slide

  191. Why
    can’t my
    software be
    more like git?

    View Slide

  192. We know what
    you’re trying to
    accomplish.
    Let’s help you
    out

    View Slide

  193. View Slide

  194. When you
    have ERROR

    View Slide

  195. Compare given
    command to
    possible
    commands

    View Slide

  196. Recommend
    smallest
    distance.

    View Slide
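
Comparing a mistyped command against the known ones and recommending the closest can be sketched in a few lines. The list of generator names below is a made-up sample, not the real set that Rails ships, and `suggest` and `levenshtein` are hypothetical helper names.

```ruby
# Row-by-row Levenshtein distance (keeps only the previous matrix row).
def levenshtein(str1, str2)
  previous = (0..str1.length).to_a
  str2.each_char.with_index(1) do |char2, i|
    current = [i]
    str1.each_char.with_index(1) do |char1, j|
      cost = if char1 == char2
               previous[j - 1]
             else
               1 + [previous[j], current[j - 1], previous[j - 1]].min
             end
      current << cost
    end
    previous = current
  end
  previous.last
end

# Return the candidate with the smallest distance to the input.
def suggest(input, candidates)
  candidates.min_by { |candidate| levenshtein(input, candidate) }
end

# Hypothetical sample of valid commands, for illustration only.
generators = %w[controller migration model scaffold]
puts suggest("migratoon", generators)  # migration
```

A real tool would probably also set a maximum distance, so that a wildly different input produces no suggestion at all.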

  197. Google:

    View Slide

  198. Google:

    View Slide

  199. Read a lot of
    words from
    real books

    View Slide

  200. ~1+ million
    words

    View Slide

  201. Count Each
    word

    View Slide

  202. Higher count,
    higher
    probability

    View Slide

  203. Get edit
    distance
    between input
    and dictionary

    View Slide

  204. Lower edit,
    higher
    probability

    View Slide

  205. Show
    Suggestion

    View Slide
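
The steps on the last few slides (count words, measure edit distance, prefer low distance and high count) can be combined into a toy spelling suggester. This is a heavily simplified sketch of the outline, not how any production system works. The tiny corpus, the method names, and the tie-breaking rule are all assumptions, and `tally` needs Ruby 2.7 or newer.

```ruby
# Row-by-row Levenshtein distance.
def levenshtein(str1, str2)
  previous = (0..str1.length).to_a
  str2.each_char.with_index(1) do |char2, i|
    current = [i]
    str1.each_char.with_index(1) do |char1, j|
      cost = if char1 == char2
               previous[j - 1]
             else
               1 + [previous[j], current[j - 1], previous[j - 1]].min
             end
      current << cost
    end
    previous = current
  end
  previous.last
end

# Count how often each word appears in a body of text.
def word_counts(text)
  text.downcase.scan(/[a-z']+/).tally
end

# Prefer the lowest edit distance; break ties with the higher word count.
def correct(word, counts)
  counts.keys.min_by do |candidate|
    [levenshtein(word, candidate), -counts[candidate]]
  end
end

# Stand-in for "a lot of words from real books".
corpus = "the quick brown fox jumps over the lazy dog and then the end"
counts = word_counts(corpus)

puts correct("teh", counts)   # the
puts correct("quik", counts)  # quick
```

For "teh", both "the" and "then" are two edits away, and the higher count of "the" decides between them.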

  206. View Slide

  207. Cache
    correct
    spelling
    suggestions

    View Slide

  208. did_you_mean
    gem

    View Slide

  209. View Slide

  210. More distance
    measurements

    View Slide

  211. Levenshtein
    Distance

    View Slide

  212. Hamming
    Distance

    View Slide

  213. longest
    common
    subsequence
    Distance

    View Slide

  214. Manhattan
    Distance

    View Slide

  215. Tree
    Distance

    View Slide

  216. Jaro-Winkler
    Distance

    View Slide

  217. Many
    Many
    More

    View Slide

  218. Algorithms
    are
    Awesome

    View Slide

  219. Where to go
    next?

    View Slide

  220. Want to
    learn more
    about
    algorithms?

    View Slide

  221. Wikipedia!

    View Slide

  222. Rosetta code

    View Slide

  223. Give an
    algorithm
    talk

    View Slide

  224. Everyone
    suggests
    a new
    Algorithm!

    View Slide

  225. Algorithms
    are a
    way of sharing
    knowledge

    View Slide

  226. Expand your
    knowledge
    Explore
    Algorithms

    View Slide

  227. Antepenultimate
    Slide

    View Slide

  228. YAY!

    View Slide

  229. Questions
    @schneems

    View Slide

  230. Going
    The
    Distance

    View Slide