My favourite algorithm

A lightning talk about my favourite algorithm, the Burrows–Wheeler transform (http://en.wikipedia.org/wiki/Burrows%E2%80%93Wheeler_transform).

Given at Brighton Ruby Conference 2014 (http://lanyrd.com/2014/brightonruby/) and BaRuCo 2014 (http://lanyrd.com/2014/baruco/).

Tom Stuart

July 21, 2014

Transcript

  1. MY FAVOURITE ALGORITHM @tomstuart / BaRuCo / 2014-09-12

  2. BURROWS–WHEELER TRANSFORM

  3. B A N A N A

  4. B A N A N A B A N A

    N A B A N A N A B A N A N A B A N A N A B A N A N A
  5. B A N A N A B A N A

    N A B A N A N A B A N A N A B A N A N A B A N A N A
  6. B A N A N A B A N A

    N A B A N A N A B A N A N A B A N A N A B A N A N A N N A B A A “NNBAAA”
  7. def burrows_wheeler(string)
       chars = string.chars

       chars.each_index.
         map(&chars.method(:rotate)).
         sort.map(&:last).join
     end

  8. >> string = 'banana'
     => "banana"

  9. >> string = 'banana'
     => "banana"
     >> chars = string.chars
     => ["b", "a", "n", "a", "n", "a"]
  10. >> string = 'banana'
      => "banana"
      >> chars = string.chars
      => ["b", "a", "n", "a", "n", "a"]
      >> chars.each_index
      => #<Enumerator: ["b", "a", "n", "a", "n", "a"]:each_index>
  11. >> string = 'banana'
      => "banana"
      >> chars = string.chars
      => ["b", "a", "n", "a", "n", "a"]
      >> chars.each_index.
           map(&chars.method(:rotate))
      => [
           ["b", "a", "n", "a", "n", "a"],
           ["a", "n", "a", "n", "a", "b"],
           ["n", "a", "n", "a", "b", "a"],
           ["a", "n", "a", "b", "a", "n"],
           ["n", "a", "b", "a", "n", "a"],
           ["a", "b", "a", "n", "a", "n"]
         ]
  12. >> string = 'banana'
      => "banana"
      >> chars = string.chars
      => ["b", "a", "n", "a", "n", "a"]
      >> chars.each_index.
           map(&chars.method(:rotate)).
           sort
      => [
           ["a", "b", "a", "n", "a", "n"],
           ["a", "n", "a", "b", "a", "n"],
           ["a", "n", "a", "n", "a", "b"],
           ["b", "a", "n", "a", "n", "a"],
           ["n", "a", "b", "a", "n", "a"],
           ["n", "a", "n", "a", "b", "a"]
         ]
  13. >> string = 'banana'
      => "banana"
      >> chars = string.chars
      => ["b", "a", "n", "a", "n", "a"]
      >> chars.each_index.
           map(&chars.method(:rotate)).
           sort.map(&:last)
      => ["n", "n", "b", "a", "a", "a"]
  14. >> string = 'banana'
      => "banana"
      >> chars = string.chars
      => ["b", "a", "n", "a", "n", "a"]
      >> chars.each_index.
           map(&chars.method(:rotate)).
           sort.map(&:last).join
      => "nnbaaa"
  15. >> burrows_wheeler('banana')
      => "nnbaaa"
      >> burrows_wheeler('The rain in Spain stays mainly in the plain')
      => "nnyseenn nrplmthhtT aa aapn iiiiiiS  y s la"
  16. !

  17. ? ? ? ? ? ? ? ? ? ?

    ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? “NNBAAA”
  18. A B A N A N ? ? ? ?

    ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? “NNBAAA”
  19. ? A ? ? ? ? ? B ? ?

    ? ? ? ? A ? ? ? ? ? ? N ? ? ? ? ? ? A ? ? ? ? ? ? N “NNBAAA”
  20. ? A ? ? ? ? ? B ? ?

    ? ? ? ? A ? ? ? ? ? ? N ? ? ? ? ? ? A ? ? ? ? ? ? N “NNBAAA”
  21. A B A N A N B ? ? ?

    ? A ? ? ? ? ? N ? ? ? ? ? A ? ? ? ? ? N ? ? ? ? ? A “NNBAAA”
  22. ? A B ? ? ? ? B A ?

    ? ? ? ? A N ? ? ? ? ? N A ? ? ? ? ? A N ? A ? ? ? N “NNBAAA”
  23. ? A B ? ? ? ? B A ?

    ? ? ? ? A N ? ? ? ? ? N A ? ? ? ? ? A N ? A ? ? ? N “NNBAAA”
  24. A B A N A N B A ? ?

    ? A N ? ? ? ? N A ? ? ? ? A N ? ? ? ? N A B ? ? ? A “NNBAAA”
  25. ? A B A ? ? ? B A N

    ? ? ? ? A N A ? ? ? ? N A N ? A ? ? A N ? A B ? ? N “NNBAAA”
  26. ? A B A ? ? ? B A N

    ? ? ? ? A N A ? ? ? ? N A N ? A ? ? A N ? A B ? ? N “NNBAAA”
  27. A B A N A N B A N ?

    ? A N A ? ? ? N A N ? ? ? A N A B ? ? N A B A ? ? A “NNBAAA”
  28. ? A B A N ? ? B A N

    A ? ? ? A N A N ? A ? N A N ? A B ? A N ? A B A ? N “NNBAAA”
  29. ? A B A N ? ? B A N

    A ? ? ? A N A N ? A ? N A N ? A B ? A N ? A B A ? N “NNBAAA”
  30. A B A N A N B A N A

    ? A N A N ? ? N A N A B ? A N A B A ? N A B A N ? A “NNBAAA”
  31. ? A B A N A ? B A N

    A N ? A A N A N ? A B N A N ? A B A A N ? A B A N N “NNBAAA”
  32. ? A B A N A ? B A N

    A N ? A A N A N ? A B N A N ? A B A A N ? A B A N N “NNBAAA”
  33. A A N N B A N A N B

    N A N A B A A N A B A N A A “NNBAAA” B A A N A N A B A N N A
  34. A A N N B A N A N B

    N A N A B A A N A B A N A A “NNBAAA” B A N A N A B A A N A N A B A N N A
  35. def inverse_burrows_wheeler(string)
        chars = string.chars

        chars.
          inject([]) { |table| chars.zip(table).sort }.
          map(&:join)
      end
  36. >> inverse_burrows_wheeler('nnbaaa')
      => ["abanan", "anaban", "ananab", "banana", "nabana", "nanaba"]

  37. >> inverse_burrows_wheeler('nnyseenn nrplmthhtT aa aapn iiiiiiS  y s la')
      => [
           " Spain stays mainly in the plainThe rain in",
           " in Spain stays mainly in the plainThe rain",
           " in the plainThe rain in Spain stays mainly",
           " mainly in the plainThe rain in Spain stays",
           " plainThe rain in Spain stays mainly in the",
           " rain in Spain stays mainly in the plainThe",
           " stays mainly in the plainThe rain in Spain",
           " the plainThe rain in Spain stays mainly in",
           "Spain stays mainly in the plainThe rain in ",
           "The rain in Spain stays mainly in the plain",
           "ain in Spain stays mainly in the plainThe r",
           "ain stays mainly in the plainThe rain in Sp",
           "ainThe rain in Spain stays mainly in the pl",
           "ainly in the plainThe rain in Spain stays m",
           "ays mainly in the plainThe rain in Spain st",
           "e plainThe rain in Spain stays mainly in th",
           "e rain in Spain stays mainly in the plainTh",
           "he plainThe rain in Spain stays mainly in t",
           "he rain in Spain stays mainly in the plainT",
           "in Spain stays mainly in the plainThe rain ",
           "in in Spain stays mainly in the plainThe ra",
           "in stays mainly in the plainThe rain in Spa",
           "in the plainThe rain in Spain stays mainly ",
           "inThe rain in Spain stays mainly in the pla",
           "inly in the plainThe rain in Spain stays ma",
           "lainThe rain in Spain stays mainly in the p",
           "ly in the plainThe rain in Spain stays main",
           "mainly in the plainThe rain in Spain stays ",
           "n Spain stays mainly in the plainThe rain i",
           "n in Spain stays mainly in the plainThe rai",
           "n stays mainly in the plainThe rain in Spai",
           "n the plainThe rain in Spain stays mainly i",
           "nThe rain in Spain stays mainly in the plai",
           "nly in the plainThe rain in Spain stays mai",
           "pain stays mainly in the plainThe rain in S",
           "plainThe rain in Spain stays mainly in the ",
           "rain in Spain stays mainly in the plainThe ",
           "s mainly in the plainThe rain in Spain stay",
           "stays mainly in the plainThe rain in Spain ",
           "tays mainly in the plainThe rain in Spain s",
           "the plainThe rain in Spain stays mainly in ",
           "y in the plainThe rain in Spain stays mainl",
           "ys mainly in the plainThe rain in Spain sta"
         ]
  38. def burrows_wheeler(string)
        chars = string.chars + ['$']

        chars.each_index.
          map(&chars.method(:rotate)).
          sort.map(&:last).join
      end

      def inverse_burrows_wheeler(string)
        chars = string.chars

        chars.
          inject([]) { |table| chars.zip(table).sort }.
          map(&:join).detect { |s| s.end_with?('$') }.chop
      end
  39. >> burrows_wheeler('banana')
      => "annb$aa"
      >> inverse_burrows_wheeler('annb$aa')
      => "banana"
      >> burrows_wheeler('The rain in Spain stays mainly in the plain')
      => "nnyseennn $rplmthhtT aa aapn iiiiiiS  y s la"
      >> inverse_burrows_wheeler('nnyseennn $rplmthhtT aa aapn iiiiiiS  y s la')
      => "The rain in Spain stays mainly in the plain"
  40. thanks! @tomstuart / tom@codon.com
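
The sentinel-based pair from slide 38 can be collected into one runnable file, as a rough sketch rather than a definitive implementation. The two method bodies follow the slide; the comments, the `SENTINEL` constant and the `round_trips?` helper are additions for illustration and are not from the talk. The sketch assumes the input never contains `$`, since the inverse finds the original by looking for the row that ends with it.

```ruby
# Marker appended before transforming. Assumption: it never occurs in the input.
SENTINEL = '$'

def burrows_wheeler(string)
  chars = string.chars + [SENTINEL]

  # Build every rotation of the characters, sort the rotations,
  # and keep only the last character of each sorted row.
  chars.each_index.
    map(&chars.method(:rotate)).
    sort.map(&:last).join
end

def inverse_burrows_wheeler(string)
  chars = string.chars

  # Each pass prepends the transformed column to the table and re-sorts,
  # rebuilding the sorted rotations one column at a time. The row that
  # ends with the sentinel is the original string plus the sentinel.
  chars.
    inject([]) { |table| chars.zip(table).sort }.
    map(&:join).detect { |s| s.end_with?(SENTINEL) }.chop
end

# Hypothetical helper (not from the talk): does a string survive the round trip?
def round_trips?(string)
  raise ArgumentError, "input must not contain #{SENTINEL}" if string.include?(SENTINEL)

  inverse_burrows_wheeler(burrows_wheeler(string)) == string
end

puts burrows_wheeler('banana')           # prints annb$aa, as on slide 39
puts inverse_burrows_wheeler('annb$aa')  # prints banana
puts round_trips?('The rain in Spain stays mainly in the plain')  # prints true
```

The inverse rebuilds and re-sorts the whole table once per character, which is fine for slide-sized strings but grows quickly with input length.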