Ruby 2.0 regexps and other goodies
taw
June 15, 2013
Programming
From BarCamp Berkshire 2
Transcript
Ruby 2.0 regexps (and other goodies)
Public Service Announcement
Never use ^ and $ unless you absolutely know what you're doing. Only ever use \A and \z.
^ and $ do not do what you think
    puts "SELECT * FROM users WHERE user_id='#{id}'" if id =~ /^\d+$/
    id = "'; DROP TABLE users;--\n1"
^ and $ are start and end of line. \A and \z are start and end of string. You want \A and \z 99% of the time.
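A small sketch of the difference, reusing the malicious id from the slide. The variable names and the safe_id? helper are illustrative assumptions, not part of the talk.

```ruby
# Hypothetical input: it passes a line-anchored check because its last line is "1".
malicious_id = "'; DROP TABLE users;--\n1"

line_anchored   = /^\d+$/    # ^ and $ match at any line boundary
string_anchored = /\A\d+\z/  # \A and \z match only at the string boundaries

# Hypothetical helper: true only when the whole string consists of digits.
def safe_id?(id)
  !!(id =~ /\A\d+\z/)
end

puts "line anchors accept it:   #{!!(malicious_id =~ line_anchored)}"
puts "string anchors accept it: #{!!(malicious_id =~ string_anchored)}"
puts "safe_id?(\"42\"):           #{safe_id?("42")}"
```

Note that \z is stricter than \Z: \Z still tolerates a single trailing newline, \z does not.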
Character classes
ASCII:
• /[a-zA-Z0-9_]+/
• /\w+/
Unicode:
• /[[:alnum:]]+/ (letters/numbers)
• /[[:alnum:]_]+/ (mix them freely)
• /\p{L}+/ (just letters)
• /\p{Letter}+/ (just letters)
• /[\p{L}\p{N}_]+/
• /[\p{Letter}\p{Number}_]+/
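To see the ASCII/Unicode split in practice, here is a minimal sketch on a made-up UTF-8 string (the sample word is an assumption chosen only because it mixes non-ASCII letters with ASCII ones).

```ruby
# Hypothetical sample: four letters (three of them non-ASCII) followed by two digits.
word = "żółw42"

ascii_only   = word[/\w+/]           # \w only knows ASCII word characters
unicode_word = word[/[[:alnum:]]+/]  # POSIX brackets are Unicode-aware
letters_only = word[/\p{L}+/]        # Unicode letters, no digits

puts ascii_only    # the ASCII tail only
puts unicode_word  # the whole word
puts letters_only  # the letters without the digits
```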
Non-capturing groups
    /([+-]?\d+(\.\d+)?)/ - hygiene fail
    /([+-]?\d+(?:\.\d+)?)/
In retrospect, non-capturing () should have been the default, but it's too late to change.
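Why the first form is a hygiene fail: the inner group leaks an extra capture. A quick sketch, with an input string picked for illustration:

```ruby
sloppy = /([+-]?\d+(\.\d+)?)/    # inner () captures the fractional part too
clean  = /([+-]?\d+(?:\.\d+)?)/  # inner (?:) groups without capturing

sloppy_captures = "-3.14".match(sloppy).captures
clean_captures  = "-3.14".match(clean).captures

p sloppy_captures  # two captures: the number and its fraction
p clean_captures   # one capture: just the number
```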
Named groups
    rx = /\A(?<name>.*)\.(?<ext>.*)\z/
    "kittens.jpg".match(rx)["name"]
    #=> "kittens"
    "kittens.jpg" =~ rx
    $~["name"]
    => "kittens"
$~ is the last match. $1 is only an alias for $~[1], etc.
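A runnable sketch of both access styles. The second filename is an assumption added to show that the greedy first group keeps everything up to the last dot.

```ruby
rx = /\A(?<name>.*)\.(?<ext>.*)\z/

# MatchData accepts group names as strings or symbols.
match = "kittens.jpg".match(rx)
name  = match["name"]
ext   = match[:ext]
names = match.names

# After =~ the same data is available through $~ (the last match).
"archive.tar.gz" =~ rx
last_name = $~[:name]
last_ext  = $~[:ext]

puts "#{name} / #{ext} / #{names.inspect}"
puts "#{last_name} / #{last_ext}"
```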
String#[]
Works, but verbose:
    "kitty.jpg" =~ /\.(.*)\z/ ? $1 : nil
Nicer syntax:
    "kitty.jpg"[/\.(.*)\z/] # => ".jpg"
    "kitty.jpg"[/\.(.*)\z/, 1] # => "jpg"
    "kitty.jpg"[/\.\K.*\z/] # => "jpg"
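The same variants as a runnable sketch, plus two cases not on the slide: a lookup by group name and a string with no extension (both added here as illustrations).

```ruby
filename = "kitty.jpg"

whole_match = filename[/\.(.*)\z/]              # whole match, dot included
first_group = filename[/\.(.*)\z/, 1]           # first capture group only
keep_out    = filename[/\.\K.*\z/]              # \K drops what was matched before it
by_name     = filename[/\.(?<ext>.*)\z/, "ext"] # capture selected by name
no_match    = "README"[/\.(.*)\z/, 1]           # nil when nothing matches

p whole_match, first_group, keep_out, by_name, no_match
```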
Irregular Expressions
    "((1(2)3)(45))" =~ /\A(?<expr>(?:\d+|\(\g<expr>\))+)\z/
    /\A (?<expr> (?:\d+ | \( \g<expr> \) )+ )\z /x
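\g<expr> calls the named group recursively, which is what lets this pattern match nested parentheses. A minimal sketch; the constant and the balanced? helper are hypothetical names:

```ruby
# An expression is one or more items; an item is a number
# or a parenthesised expression (recursion via \g<expr>).
BALANCED = /\A (?<expr> (?: \d+ | \( \g<expr> \) )+ ) \z/x

# Hypothetical helper wrapping the match in a boolean.
def balanced?(text)
  !!(text =~ BALANCED)
end

["((1(2)3)(45))", "12", "((1)", "()"].each do |sample|
  puts "#{sample.inspect}: #{balanced?(sample)}"
end
```

The empty pair "()" is rejected because the pattern requires at least one item inside every expression.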
String#scan
    "Kittens are cute!".scan(/\w+/)
    => ["Kittens", "are", "cute"]
    "x + 42".scan /(\d+)|(\p{L}+)|([+\-*\/])/
    => [ [nil, "x", nil], [nil, nil, "+"], ["42", nil, nil] ]
Just one more step
    "x + 42".scan /(?<num>\d+) | (?<var>\p{L}+) | (?<op> [+\-*\/])/x
    "x+42".scan(rx){ if $~["num"] ... end }
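One way the elided block could be filled in, as a sketch only: a tiny tokenizer that tags each match by the named group that fired. The TOKEN_RX constant, the tokenize method, and the token format are all assumptions, not the speaker's code.

```ruby
TOKEN_RX = /(?<num>\d+) | (?<var>\p{L}+) | (?<op>[+\-*\/])/x

# Hypothetical tokenizer: returns [type, value] pairs.
def tokenize(source)
  tokens = []
  source.scan(TOKEN_RX) do
    match = Regexp.last_match  # same object as $~ inside the scan block
    if match[:num]
      tokens << [:num, match[:num].to_i]
    elsif match[:var]
      tokens << [:var, match[:var]]
    else
      tokens << [:op, match[:op]]
    end
  end
  tokens
end

p tokenize("x + 42")
```

Whitespace is skipped for free because scan simply moves on to the next position where the pattern matches.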
Regexp class escapes for you
    Regexp.quote(".jpg")
    => "\\.jpg"
    Regexp.new(Regexp.quote("kittens.jpg"))
    => /kittens\.jpg/
    Regexp.union("cute","kittens",".jpg")
    => /cute|kittens|\.jpg/
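A short sketch of what gets escaped and what does not. Regexp.new on its own does no escaping, so a dot in its argument still matches any character; the sample strings are made up for illustration.

```ruby
quoted   = Regexp.quote(".jpg")                     # string with the dot escaped
unquoted = Regexp.new("kittens.jpg")                # dot still means "any character"
literal  = Regexp.new(Regexp.quote("kittens.jpg"))  # dot matches only a dot
union    = Regexp.union("cute", "kittens", ".jpg")  # strings are escaped for you

puts quoted
puts unquoted.match?("kittensXjpg")  # true: the dot is a wildcard here
puts literal.match?("kittensXjpg")   # false: the dot is literal here
puts union.source
```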
Perl regexps still a lot better
• Ruby is catching up slowly
• Less regexp culture in Ruby world
• People often scared of regexps
%i
    %i[symbols everywhere]
    => [:symbols, :everywhere]
    %I[symbols in ruby#{RUBY_VERSION}]
    => [:symbols, :in, :"ruby2.0.0"]
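The two literals side by side as a runnable sketch. The last symbol depends on the interpreter that runs it, so the output is not hard-coded here.

```ruby
plain        = %i[symbols everywhere]             # no interpolation, like %w
interpolated = %I[symbols in ruby#{RUBY_VERSION}] # interpolation, like %W

p plain
p interpolated
```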
%W
    %w[ruby is awesome]
    => ["ruby", "is", "awesome"]
    %W[mongod --shardsvr --port #{port} --fork --dbpath #{data_dir}
       --logappend --logpath #{logpath} --directoryperdb]
%W is not just %Q[].split
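Why %W is not just %Q[].split: %W splits on the whitespace written in the literal, before interpolation, so an interpolated value containing spaces stays one element. A sketch with a made-up path:

```ruby
# Hypothetical directory name containing a space.
data_dir = "/var/lib/mongo data"

safe_args  = %W[mongod --dbpath #{data_dir} --fork]
naive_args = %Q[mongod --dbpath #{data_dir} --fork].split

p safe_args   # the path survives as a single argument
p naive_args  # the path is torn in two
```

This makes %W a convenient way to build argument lists for system or exec without going through a shell.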
%r
    /(?<num>\d+)|(?<var>\p{L}+)|(?<op>[+\-*\/])/
    %r! (?<num>\d+) | (?<var>\p{L}+) | (?<op>[+\-*/]) !x
You can even use %r[ [] [] [] ] if balanced
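The main payoff of %r is not having to escape slashes. A small sketch with a hypothetical URL pattern (the pattern and the sample URL are illustrations only):

```ruby
with_slashes  = /\Ahttps?:\/\/[^\/]+\//  # every / has to be escaped
with_percent  = %r{\Ahttps?://[^/]+/}    # braces as delimiters, no escaping

url = "https://example.com/decks/ruby"

puts url[with_slashes]
puts url[with_percent]
```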
__dir__
No more File.dirname(__FILE__)
#to_h
    OpenStruct.new( foo: 123, bar: 456 ).to_h
    => {:foo=>123, :bar=>456}
Not supported by most hash-like things yet (like MatchData with named groups)
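A sketch of #to_h next to a manual workaround for MatchData, which has no #to_h in 2.0. The workaround shown (zipping names with captures) is one possible approach, not something from the talk.

```ruby
require "ostruct"

settings = OpenStruct.new(foo: 123, bar: 456).to_h

# Workaround for MatchData: pair group names with their captured values.
match  = "kittens.jpg".match(/\A(?<name>.*)\.(?<ext>.*)\z/)
groups = Hash[match.names.zip(match.captures)]

p settings
p groups
```

Later Ruby versions added MatchData#named_captures, which returns the same kind of hash directly.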
Keyword arguments, before 2.0
    def hello(opts={})
      place = opts[:place] || "world"
      puts "Hello, #{place}"
    end
    hello # => Hello, world
    hello :place => "Barcamp" # => Hello, Barcamp
    hello place: "Barcamp" # since 1.9 # => Hello, Barcamp
Keyword arguments in 2.0
    def hello(place: "world")
      puts "Hello, #{place}"
    end
    hello place: "Barcamp"
Mostly syntactic sugar.
New style - arity 0
Old style - arity -1
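One practical difference beyond the sugar: a misspelled key is silently ignored by the options-hash style but rejected by real keyword arguments. A sketch, with hello_old as a hypothetical name for the pre-2.0 version and methods returning strings instead of printing:

```ruby
# Old style: options hash, defaults handled by hand.
def hello_old(opts = {})
  place = opts[:place] || "world"
  "Hello, #{place}"
end

# New style: keyword argument with a default.
def hello(place: "world")
  "Hello, #{place}"
end

typo_old = hello_old(plaec: "Barcamp")  # typo goes unnoticed

typo_new =
  begin
    hello(plaec: "Barcamp")             # unknown keyword raises
  rescue ArgumentError => error
    "ArgumentError: #{error.message}"
  end

puts hello(place: "Barcamp")
puts typo_old
puts typo_new
```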
Module#prepend
    class HelloWorld
      prepend Memoizer.new(:hello)
      def hello(who)
        puts "Calculating #{who} greeting"
        "Hello, #{who}"
      end
    end
Simpler than include + alias_method_chain
Just like Python's @decorators
    class Memoizer < Module
      def initialize(*methods)
        methods.each{|m|
          self.send(:define_method, m){|*args|
            @cache ||= {}
            @cache[[m, *args]] ||= super(*args)
          }
        }
      end
    end
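The two slides put together as one runnable sketch. This is a variation, not the speaker's exact code: the Greeter class, the call counter, and the key? check (so that nil or false results are cached too) are assumptions added for illustration.

```ruby
# A Module subclass whose instances wrap the named methods with a cache.
class Memoizer < Module
  def initialize(*method_names)
    super()
    method_names.each do |name|
      define_method(name) do |*args|
        @memo_cache ||= {}
        key = [name, *args]
        # super reaches the original method because this module is prepended.
        @memo_cache[key] = super(*args) unless @memo_cache.key?(key)
        @memo_cache[key]
      end
    end
  end
end

# Hypothetical class counting how often the real method runs.
class Greeter
  prepend Memoizer.new(:hello)
  attr_reader :calls

  def hello(who)
    @calls = (@calls || 0) + 1
    "Hello, #{who}"
  end
end

greeter = Greeter.new
puts greeter.hello("Barcamp")
puts greeter.hello("Barcamp")  # served from the cache
puts "real calls: #{greeter.calls}"
```

Because the module is prepended, it sits in front of Greeter in the ancestor chain, so no aliasing of the original method is needed.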
More goodies
• Enumerator#lazy
• Refinements (???)
• Performance vs 1.9? Reports differ
• Mostly backwards compatible
• Everything UTF-8 by default now
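Of the goodies above, Enumerator#lazy is the easiest to show in a few lines. A minimal sketch on an infinite range (the pipeline itself is a made-up example):

```ruby
# Without .lazy, select on an infinite range would never return.
squares = (1..Float::INFINITY).lazy
                              .select(&:even?)
                              .map { |n| n * n }
                              .first(3)

p squares
```

Each element flows through the whole chain one at a time, and evaluation stops as soon as first(3) has what it needs.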
What still isn't great
• Perl still has better regexps
• Pathname#to_str removed :-(
• More classes need #to_h
• Performance still not awesome
Bonus slide - parsing XML
    XmlParsingRegexp = %r[
      \A(?<doc>\s*
        (?<node>
          (
            <\s*\w+ (?<attrs> (?:\s+ \w+\s*=\s*("[^"]*"| '[^']*'))* ) \s*>
            ( \g<node> | [^&<] | &\#\d+; | &\#x\h+; )*
            </\s*\w+\s*>
          ) | (
            <\s*\w+ \g<attrs> \s*/>
          )
        )
      \s*)\z]x
Questions?