Ruby 2.0 regexps and other goodies

taw
June 15, 2013

From BarCamp Berkshire 2

Transcript

  1. Public Service Announcement

    Never use ^ and $ unless you absolutely know what you're doing. Only ever use \A and \z.
  2. ^ and $ do not do what you think

    puts "SELECT * FROM users WHERE user_id='#{id}'" if id =~ /^\d+$/
    id = "'; DROP TABLE users;--\n1"

    ^ and $ are start and end of line. \A and \z are start and end of string. You want \A and \z 99% of the time.
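The difference between the two kinds of anchors can be checked directly. This is a minimal sketch; the two method names are hypothetical and not part of the talk.

```ruby
# Hypothetical validators, named only for this illustration.
def digits_line_anchored?(id)
  !!(id =~ /^\d+$/)    # ^ and $ match at the boundaries of any line
end

def digits_string_anchored?(id)
  !!(id =~ /\A\d+\z/)  # \A and \z match only at the ends of the whole string
end

malicious = "'; DROP TABLE users;--\n1"

puts digits_line_anchored?(malicious)    # true, because the last line is "1"
puts digits_string_anchored?(malicious)  # false
puts digits_string_anchored?("42")       # true
```

The line-anchored check accepts the injected string because one of its lines consists only of digits.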
  3. Character classes

    ASCII:
    • /[a-zA-Z0-9_]+/
    • /\w+/
    Unicode:
    • /[[:alnum:]]+/ (letters/numbers)
    • /[[:alnum:]_]+/ (mix them freely)
    • /\p{L}+/ (just letters)
    • /\p{Letter}+/ (just letters)
    • /[\p{L}\p{N}_]+/
    • /[\p{Letter}\p{Number}_]+/
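A small sketch of how these classes behave on a non-ASCII word. The helper name and the sample string are assumptions made for the illustration, and the code is assumed to run from a UTF-8 source file (the Ruby 2.0 default).

```ruby
# Hypothetical helper: returns the first match of each class from the slide.
def class_matches(text)
  {
    ascii_word:       text[/\w+/],              # ASCII letters, digits, underscore
    posix_alnum:      text[/[[:alnum:]]+/],     # Unicode letters and numbers
    posix_underscore: text[/[[:alnum:]_]+/],    # same, plus underscore
    letters_only:     text[/\p{L}+/],           # Unicode letters
    letters_numbers:  text[/[\p{L}\p{N}_]+/]    # Unicode letters, numbers, underscore
  }
end

word = "caf\u00E9_42"   # "café_42", written with an escape to keep the source ASCII
class_matches(word).each { |name, match| puts "#{name}: #{match}" }
```

On this input the ASCII class stops before the accented letter, while the POSIX bracket and \p{...} classes include it.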
  4. Named groups

    rx = /\A(?<name>.*)\.(?<ext>.*)\z/
    "kittens.jpg".match(rx)["name"] # => "kittens"
    "kittens.jpg" =~ rx
    $~["name"] # => "kittens"

    $~ is the last match. $1 is only an alias for $~[1], etc.
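Both lookup styles can be wrapped in small helpers. This is only a sketch; the method names are hypothetical.

```ruby
# Hypothetical helper: splits a filename using MatchData and group names.
def split_filename(filename)
  m = filename.match(/\A(?<name>.*)\.(?<ext>.*)\z/)
  return nil unless m
  [m["name"], m["ext"]]
end

# Same regexp through =~ and the last-match global $~.
def extension_via_last_match(filename)
  if filename =~ /\A(?<name>.*)\.(?<ext>.*)\z/
    $~["ext"]   # $2 and $~[2] refer to the same capture here
  end
end

p split_filename("kittens.jpg")               # ["kittens", "jpg"]
p split_filename("README")                    # nil
p extension_via_last_match("kittens.tar.gz")  # "gz"
```

Because the first `.*` is greedy, the name group takes everything up to the last dot.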
  5. String#[]

    Works, but verbose:
    "kitty.jpg" =~ /\.(.*)\z/ ? $1 : nil
    Nicer syntax:
    "kitty.jpg"[/\.(.*)\z/] # => ".jpg"
    "kitty.jpg"[/\.(.*)\z/, 1] # => "jpg"
    "kitty.jpg"[/\.\K.*\z/] # => "jpg"
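The three String#[] forms side by side, plus a named-group variant. The helper names are assumptions made for this sketch.

```ruby
# Hypothetical helpers, one per String#[] form.
def ext_with_dot(filename)
  filename[/\.(.*)\z/]              # whole match, including the dot
end

def ext_by_group(filename)
  filename[/\.(.*)\z/, 1]           # first capture group only
end

def ext_by_keep(filename)
  filename[/\.\K.*\z/]              # \K drops everything matched before it
end

def ext_by_name(filename)
  filename[/\.(?<ext>.*)\z/, "ext"] # capture selected by group name
end

p ext_with_dot("kitty.jpg")  # ".jpg"
p ext_by_group("kitty.jpg")  # "jpg"
p ext_by_keep("kitty.jpg")   # "jpg"
p ext_by_name("kitty.jpg")   # "jpg"
p ext_by_group("kitty")      # nil, no match
```

All forms return nil when the regexp does not match, so no explicit ternary is needed.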
  6. String#scan

    "Kittens are cute!".scan(/\w+/)
    # => ["Kittens", "are", "cute"]
    "x + 42".scan(/(\d+)|(\p{L}+)|([+\-*\/])/)
    # => [ [nil, "x", nil], [nil, nil, "+"], ["42", nil, nil] ]
  7. Just one more step

    rx = /(?<num>\d+) | (?<var>\p{L}+) | (?<op> [+\-*\/])/x
    "x+42".scan(rx){ if $~["num"] ... end }
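One way the elided block body might be filled in is a tiny tokenizer. This is a sketch under assumptions: the method name and the `[type, value]` token shape are invented for the illustration and do not come from the talk.

```ruby
# Hypothetical tokenizer built on String#scan with named groups.
def tokenize(source)
  rx = /(?<num>\d+) | (?<var>\p{L}+) | (?<op>[+\-*\/])/x
  tokens = []
  source.scan(rx) do
    m = $~   # scan sets the last match for each iteration of the block
    if m["num"]
      tokens << [:num, m["num"].to_i]
    elsif m["var"]
      tokens << [:var, m["var"]]
    else
      tokens << [:op, m["op"]]
    end
  end
  tokens
end

p tokenize("x + 42")  # [[:var, "x"], [:op, "+"], [:num, 42]]
```

Characters that match none of the alternatives, such as spaces, are simply skipped by scan.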
  8. Regexp class escapes for you

    Regexp.quote(".jpg") # => "\\.jpg"
    Regexp.new(Regexp.quote("kittens.jpg")) # => /kittens\.jpg/
    Regexp.union("cute", "kittens", ".jpg") # => /cute|kittens|\.jpg/
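A short sketch of the escaping helpers in use. Note that Regexp.new on its own does not escape its argument, so the literal matcher below goes through Regexp.escape (Regexp.quote is an alias). Both method names are hypothetical.

```ruby
# Hypothetical helper: a regexp that matches the given text literally.
def literal_regexp(text)
  Regexp.new(Regexp.escape(text))
end

# Hypothetical helper: matches a string that is exactly one of the given words.
def one_of(*words)
  /\A(?:#{Regexp.union(*words)})\z/
end

p Regexp.new("kittens.jpg") =~ "kittensXjpg"    # 0, the dot matches any character
p literal_regexp("kittens.jpg") =~ "kittensXjpg" # nil, the dot is escaped
p Regexp.union("cute", "kittens", ".jpg").source # "cute|kittens|\\.jpg"
p one_of("cute", ".jpg") =~ ".jpg"               # 0
p one_of("cute", ".jpg") =~ "xjpg"               # nil
```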
  9. Perl regexps still a lot better

    • Ruby is catching up slowly
    • Less regexp culture in Ruby world
    • People often scared of regexps
  10. %W

    %w[ruby is awesome]
    # => ["ruby", "is", "awesome"]
    %W[mongod --shardsvr --port #{port} --fork --dbpath #{data_dir} --logappend --logpath #{logpath} --directoryperdb]

    %W is not just %Q[].split
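Why %W differs from %Q[].split shows up as soon as an interpolated value contains whitespace. A minimal sketch; the command and method names are made up for the illustration.

```ruby
# Hypothetical helper: %W splits the literal first, then interpolates each word.
def ls_command(dir)
  %W[ls -l #{dir}]
end

# Hypothetical helper: interpolates first, then splits on whitespace.
def ls_command_naive(dir)
  %Q[ls -l #{dir}].split
end

p ls_command("My Documents")        # ["ls", "-l", "My Documents"]
p ls_command_naive("My Documents")  # ["ls", "-l", "My", "Documents"]
```

The %W form keeps each interpolated value as a single array element, which is what an argument list for system or exec needs.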
  11. #to_h

    OpenStruct.new(foo: 123, bar: 456).to_h
    # => {:foo=>123, :bar=>456}

    Not supported by most hash-like things yet (like MatchData with named groups)
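Until MatchData gains its own conversion, a hash can be assembled by hand from names and captures. This is a sketch with a hypothetical helper name; later Ruby versions (2.4 and up) provide MatchData#named_captures for the same purpose.

```ruby
require "ostruct"

# Hypothetical helper: builds a Hash from the named groups of a MatchData.
def captures_hash(match)
  Hash[match.names.zip(match.captures)]
end

p OpenStruct.new(foo: 123, bar: 456).to_h

m = "kittens.jpg".match(/\A(?<name>.*)\.(?<ext>.*)\z/)
p captures_hash(m)
```

The resulting hash uses string keys, because MatchData#names returns strings.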
  12. Keyword arguments, before 2.0

    def hello(opts={})
      place = opts[:place] || "world"
      puts "Hello, #{place}"
    end
    hello # => Hello, world
    hello :place => "Barcamp" # => Hello, Barcamp
    hello place: "Barcamp" # since 1.9 # => Hello, Barcamp
  13. Keyword arguments in 2.0

    def hello(place: "world")
      puts "Hello, #{place}"
    end
    hello place: "Barcamp"

    Mostly syntactic sugar. New style - arity 0. Old style - arity -1.
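One practical difference beyond syntax is how a misspelled option is handled. The sketch below renames the two styles to `hello_old` and `hello_new` (hypothetical names) and returns the greeting instead of printing it, so the results are easy to compare.

```ruby
# Old style: an options hash; unknown keys are silently ignored.
def hello_old(opts = {})
  place = opts[:place] || "world"
  "Hello, #{place}"
end

# New style: real keyword arguments; unknown keys raise ArgumentError.
def hello_new(place: "world")
  "Hello, #{place}"
end

puts hello_old(place: "Barcamp")
puts hello_new(place: "Barcamp")
puts hello_old(plce: "Barcamp")    # typo goes unnoticed, default is used

begin
  hello_new(plce: "Barcamp")       # typo is reported
rescue ArgumentError => e
  puts "ArgumentError: #{e.message}"
end
```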
  14. Module#prepend

    class HelloWorld
      prepend Memoizer.new(:hello)
      def hello(who)
        puts "Calculating #{who} greeting"
        "Hello, #{who}"
      end
    end

    Simpler than include + alias_method_chain
  15. Just like Python's @annotations

    class Memoizer < Module
      def initialize(*methods)
        methods.each{|m|
          self.send(:define_method, m){|*args|
            @cache ||= {}
            @cache[[m, *args]] ||= super(*args)
          }
        }
      end
    end
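Putting the two slides together gives a runnable sketch. The `calls` counter is an assumption added here in place of the `puts`, purely so the caching can be observed; it is not part of the talk's code.

```ruby
# Module subclass whose instances memoize the named methods when prepended.
class Memoizer < Module
  def initialize(*methods)
    super()
    methods.each do |m|
      define_method(m) do |*args|
        @cache ||= {}
        # super reaches the method defined in the class that prepends us
        @cache[[m, *args]] ||= super(*args)
      end
    end
  end
end

class HelloWorld
  prepend Memoizer.new(:hello)

  attr_reader :calls   # assumption: counts real computations for the demo

  def initialize
    @calls = 0
  end

  def hello(who)
    @calls += 1
    "Hello, #{who}"
  end
end

greeter = HelloWorld.new
puts greeter.hello("kitty")
puts greeter.hello("kitty")   # served from the cache
puts greeter.calls            # 1
```

Because the cache is filled with `||=`, a method that returns nil or false would be recomputed on every call.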
  16. More goodies

    • Enumerator#lazy
    • Refinements (???)
    • Performance vs 1.9? Reports differ
    • Mostly backwards compatible
    • Everything UTF-8 by default now
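Of the items above, Enumerator#lazy is the easiest to show in a few lines. A minimal sketch with a hypothetical method name: it works on an infinite range, which an eager map would never finish.

```ruby
# Hypothetical helper: first `count` perfect squares divisible by n.
def first_squares_divisible_by(n, count)
  (1..Float::INFINITY).lazy
    .map { |i| i * i }              # evaluated one element at a time
    .select { |sq| sq % n == 0 }
    .first(count)                   # stops the chain once enough are found
end

p first_squares_divisible_by(3, 3)  # [9, 36, 81]
```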
  17. What still isn't great

    • Perl still has better regexps
    • Pathname#to_str removed :-(
    • More classes need #to_h
    • Performance still not awesome
  18. Bonus slide - parsing XML

    XmlParsingRegexp = %r[
      \A(?<doc>\s*
        (?<node>
          ( <\s*\w+ (?<attrs> (?:\s+ \w+\s*=\s*("[^"]*"| '[^']*'))* ) \s*>
            ( \g<node> | [^&<] | &\#\d+; | &\#x\h+; )*
            </\s*\w+\s*> )
          |
          ( <\s*\w+ \g<attrs> \s*/> )
        )
      \s*)\z
    ]x