Slide 1

Slide 1 text

RegEx Fu Juliette Reinders Folmer @jrf_nl regexcheatsheets.com

Slide 2

Slide 2 text

No content

Slide 3

Slide 3 text

Wildcards on Steroids

Slide 4

Slide 4 text

Pattern Recognition

Slide 5

Slide 5 text

Regex Engines: POSIX, PCRE, ECMAScript, Oniguruma, Boost, DEELX, RE2, TRE, Pattwo, GRETA, GLib/GRegex, FREJ, RGX, QT, CL-PPCRE, Jakarta, Henry Spencer’s regex

Slide 6

Slide 6 text

Regex Engines: Boost, DEELX, RE2, TRE, Pattwo, GRETA, GLib/GRegex, FREJ, RGX, QT, CL-PPCRE, Jakarta, Henry Spencer’s regex, Oniguruma, POSIX, ECMAScript, PCRE

Slide 7

Slide 7 text

Syntax Overlap

Slide 8

Slide 8 text

PCRE

Slide 9

Slide 9 text

Terminology
Regular Expression: /[a-z0-9]+/im
Delimiters: the two / characters enclosing the pattern
Modifiers: the trailing im
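
To make the terminology concrete, here is a minimal sketch in Python. Python's `re` module is not PCRE and is used only as a stand-in: it has no delimiters, so the modifiers are passed as flags. The sample text is made up for illustration.

```python
import re

# PCRE-style /[a-z0-9]+/im: the pattern sits between the delimiters,
# the modifiers follow the closing delimiter.
# Python has no delimiters; the i and m modifiers become flags.
pattern = re.compile(r"[a-z0-9]+", re.IGNORECASE | re.MULTILINE)

# Hypothetical sample input, chosen only for illustration.
matches = pattern.findall("Abc 123, xyz!")
print(matches)  # ['Abc', '123', 'xyz']
```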

Slide 10

Slide 10 text

Basic Syntax
Literals: A a 1
Wildcard: .
Quantifiers: ? * + {#}
Character ranges: [...]
Grouping and alternation: ( ... | ... )
Anchors: ^ ... $
Shorthand character codes: \w \d \s
Modifiers: g m s i
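
The building blocks above can be tried out one at a time. This is a small sketch in Python (a stand-in for PCRE); all patterns and subjects are made-up examples, and `re.fullmatch` plays the role of the `^ ... $` anchors.

```python
import re

# Each entry: (pattern, subject, should the whole subject match?)
examples = [
    (r"A1",         "A1",    True),   # literals
    (r"c.t",        "cut",   True),   # wildcard: . matches any one character
    (r"colou?r",    "color", True),   # quantifier: ? makes the u optional
    (r"[a-c]{3}",   "abc",   True),   # character range with {#} repetition
    (r"(cat|dog)s", "dogs",  True),   # grouping and alternation
    (r"\d+\s\w+",   "42 is", True),   # shorthand character codes
    (r"\d+",        "4a2",   False),  # the whole subject must fit
]

# fullmatch behaves like anchoring the pattern with ^ ... $
results = [bool(re.fullmatch(p, s)) for p, s, _ in examples]
expected = [e for _, _, e in examples]
print(results == expected)  # True
```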

Slide 11

Slide 11 text

Tips & Tricks

Slide 12

Slide 12 text

1. If you need a screwdriver, why use a hammer? Photo by Scott Liddell

Slide 13

Slide 13 text

Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems. Jamie Zawinski, August 1997, alt.religion.emacs

Slide 14

Slide 14 text

2. Nothing in life is to be feared. It is only to be understood. Marie Curie

Slide 15

Slide 15 text

Allow listing Deny listing Input string Input string ? ?

Slide 16

Slide 16 text

3. Not all matches are made in heaven... Photo by Petr Kratochvil

Slide 17

Slide 17 text

4. Only Elephants Remember Everything © Photo by Juliette Reinders Folmer

Slide 18

Slide 18 text

(?:)
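
A minimal sketch of what the non-capturing group changes, written in Python as a stand-in for PCRE. The subject string is a made-up example.

```python
import re

subject = "ababc"  # made-up sample input

# Capturing group: the engine remembers the last repetition of (ab).
capturing = re.search(r"(ab)+(c)", subject).groups()

# Non-capturing group: (?:ab) still groups for the +, but stores nothing.
non_capturing = re.search(r"(?:ab)+(c)", subject).groups()

print(capturing)      # ('ab', 'c')
print(non_capturing)  # ('c',)
```

Only the groups you actually need end up in the match result, which keeps the numbering of the remaining groups simple.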

Slide 19

Slide 19 text

5. Less is the new more

Slide 20

Slide 20 text

Subject: We take one step forward, two steps back
Pattern, built up one token at a time:
/o/
/on/
/one/
/one./
/one.*/
/one.*s/
/one.*s./
/one.*s.?/
/one.*s.?t/
/one.*s.?t[a-z]/
/one.*s.?t[a-z]+/
/one.*s.?t[a-z]+p/
/one.*s.?t[a-z]+p / (the pattern now ends in a space)
/one.*s.?t[a-z]+p ./
/one.*s.?t[a-z]+p .{2,}/
/one.*s.?t[a-z]+p .{2,},/
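
The final pattern from this walk-through can be run against the subject to see where the engine ends up. A minimal sketch in Python, used here as a stand-in for PCRE; the number of backtracking steps is not visible from Python, only the resulting match.

```python
import re

subject = "We take one step forward, two steps back"
pattern = r"one.*s.?t[a-z]+p .{2,},"

match = re.search(pattern, subject)

# The greedy .* first runs to the end of the subject and then has to give
# characters back until the rest of the pattern fits.
print(match.group(0))  # one step forward,
```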

Slide 21

Slide 21 text

/ / We take one step back, two steps forward

Slide 22

Slide 22 text

Reluctant Quantifiers: *? +? ?? {n,}? {n,m}? {,m}?
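
A short sketch of the difference a reluctant quantifier makes, in Python as a stand-in for PCRE. The subject string is made up for illustration.

```python
import re

subject = 'say "one" and "two"'  # made-up sample input

greedy = re.findall(r'".*"', subject)      # * takes as much as possible
reluctant = re.findall(r'".*?"', subject)  # *? takes as little as possible

print(greedy)     # ['"one" and "two"']
print(reluctant)  # ['"one"', '"two"']
```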

Slide 23

Slide 23 text

6. Being negative isn't always a bad thing © Photo by Juliette Reinders Folmer

Slide 24

Slide 24 text

[^]
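
A negated character class is often an alternative to a reluctant quantifier: instead of "anything, as little as possible", it says "anything except the closing character". A minimal sketch in Python (a stand-in for PCRE) with a made-up subject:

```python
import re

subject = 'say "one" and "two"'  # made-up sample input

# [^"]* can never run past a closing quote, so no backtracking is needed
# to find the end of each quoted part.
quoted = re.findall(r'"[^"]*"', subject)

print(quoted)  # ['"one"', '"two"']
```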

Slide 25

Slide 25 text

7. Explore Your Boundaries Photo by Miguel A.C. Domingo

Slide 26

Slide 26 text

Beginning of string Beginning of line Word boundaries End of string End of line
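
The boundaries listed above can be compared side by side. This is a sketch in Python, which spells them `\A`, `^` (with the multiline flag), `\b`, `$` (with the multiline flag) and `\Z`; other engines may differ slightly. The two-line subject is made up.

```python
import re

text = "cat\nconcat cat"  # two lines, made-up sample input

anywhere = re.findall(r"cat", text)                    # no boundaries at all
string_start = re.findall(r"\Acat", text)              # beginning of string
line_start = re.findall(r"^cat", text, re.MULTILINE)   # beginning of line
whole_word = re.findall(r"\bcat\b", text)              # word boundaries
line_end = re.findall(r"cat$", text, re.MULTILINE)     # end of line
string_end = re.findall(r"cat\Z", text)                # end of string

counts = [len(anywhere), len(string_start), len(line_start),
          len(whole_word), len(line_end), len(string_end)]
print(counts)  # [3, 1, 1, 2, 2, 1]
```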

Slide 27

Slide 27 text

8. The first love is the deepest...

Slide 28

Slide 28 text

/#?([A-F0-9]{6}|[A-F0-9]{3})/i
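
The order of the alternatives matters because the engine takes the first one that matches. A minimal sketch in Python (a stand-in for PCRE); the reversed pattern `three_first` is a hypothetical variant added only for comparison.

```python
import re

# Pattern from the slide: try six hex digits first, then three.
six_first = re.compile(r"#?([A-F0-9]{6}|[A-F0-9]{3})", re.IGNORECASE)

# Hypothetical reordering: three digits first, for comparison only.
three_first = re.compile(r"#?([A-F0-9]{3}|[A-F0-9]{6})", re.IGNORECASE)

colour = "#ffcc00"  # made-up sample input
long_form = six_first.search(colour).group(1)
short_form = three_first.search(colour).group(1)

print(long_form)   # ffcc00
print(short_form)  # ffc
```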

Slide 29

Slide 29 text

9. What's this global village people keep talking about ???

Slide 30

Slide 30 text

Character classes
                                  PCRE     POSIX
[0-9]  [^0-9]                     \d  \D   [[:digit:]]  [^[:digit:]]
[A-Za-z0-9_]  [^A-Za-z0-9_]       \w  \W   [[:word:]]   [^[:word:]]
[\t\f\r\n \v]  [^\t\f\r\n \v]     \s  \S   [[:space:]]  [^[:space:]]
[\t\f ]  [^\t\f ]                 \h  \H   [[:blank:]]  [^[:blank:]]
[\r\n]  [^\r\n]                   \v  \V   -            -

Slide 31

Slide 31 text

déjà vu   [\w ]+   French (fr)
déjà vu   [\w ]+   English (en)
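
What `\w` matches depends on the engine's notion of a "word character". The sketch below uses Python as a stand-in, where, as an assumption that differs from the locale-driven behaviour on the slide, the switch is Unicode matching (the default for text strings) versus the `re.ASCII` flag.

```python
import re

subject = "d\u00e9j\u00e0 vu"  # "déjà vu", written with escapes to be unambiguous

# Default for str patterns in Python 3: \w is Unicode-aware.
unicode_match = re.fullmatch(r"[\w ]+", subject)

# With re.ASCII, \w is limited to [A-Za-z0-9_], so the accents break the match.
ascii_match = re.fullmatch(r"[\w ]+", subject, re.ASCII)
ascii_parts = re.findall(r"[\w ]+", subject, re.ASCII)

print(bool(unicode_match))  # True
print(ascii_match)          # None
print(ascii_parts)          # ['d', 'j', ' vu']
```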

Slide 32

Slide 32 text

10. Escape and escape again

Slide 33

Slide 33 text

What to Escape?
String delimiter: for the programming language
Regex delimiter: for the regex and for the programming language
Meta-characters: for the regex and for the programming language

Slide 34

Slide 34 text

Escaping Meta Characters
Special meaning: [ ] ( ) | . ? * + { } ^ $ \ / (delimiter)
Literals: \[ \] \( \) \| \. \? \* \+ \{ \} \^ \$ \\ \/

Slide 35

Slide 35 text

Escaping Meta Characters
Special meaning: [ ] ( ) | . ? * + { } ^ $ \ / (delimiter)
Literals (via a character class): [(] [)] [|] [.] [?] [*] [+] [{] [}] [$] [/]
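
Both escaping styles turn a meta-character into a literal. A minimal sketch in Python (a stand-in for PCRE) using the dot as the example; the subjects are made up.

```python
import re

# Unescaped: . is a wildcard, so the x is accepted.
unescaped = re.fullmatch(r"3.14", "3x14")

# Escaped with a backslash or wrapped in a character class: literal dot only.
backslash = re.fullmatch(r"3\.14", "3x14")
in_class = re.fullmatch(r"3[.]14", "3x14")

# Both literal forms accept the real thing.
backslash_ok = re.fullmatch(r"3\.14", "3.14")
in_class_ok = re.fullmatch(r"3[.]14", "3.14")

print(bool(unescaped), bool(backslash), bool(in_class))  # True False False
print(bool(backslash_ok), bool(in_class_ok))             # True True
```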

Slide 36

Slide 36 text

Escaping Arbitrary Strings
Java: Pattern.quote(), Matcher.quoteReplacement()
PHP: preg_quote()
Matlab: regexptranslate()
Python: re.escape()
Objective-C: escapedTemplateForString(), escapedPatternForString()
Ruby: Regexp.escape(), Regexp.quote()
// Javascript:
function escapeInputString( str ) {
    return str.replace(/[[\]\/\\{}()|?+^$*.-]/g, "\\$&");
}
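
As a sketch of why these helpers matter, here is the Python entry from the list, `re.escape()`, applied to a made-up input that contains a meta-character.

```python
import re

user_input = "1+1=2"                  # made-up input containing a meta-character
text = "Everyone knows that 1+1=2."   # made-up text to search

# Used as-is, the + is a quantifier: "one or more 1s, then 1=2".
raw = re.search(user_input, text)

# After escaping, every character is matched literally.
escaped = re.search(re.escape(user_input), text)

print(raw)               # None
print(escaped.group(0))  # 1+1=2
```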

Slide 37

Slide 37 text

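Matching a Literal Backslash: \\\\
The actual backslash
Escaping for use in regex
String escape
The string escape halves the four backslashes to \\, which the regex engine then reads as one escaped, literal backslash.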
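
The two layers of escaping can be checked directly. A minimal sketch in Python; the raw-string form `r"\\"` is a Python convenience and an assumption that does not carry over to every language.

```python
import re

path = "C:\\temp"  # contains exactly one real backslash

plain = "\\\\"  # string literal with four backslashes: two after string escaping
raw = r"\\"     # raw string: the same two characters, without the string layer

found = re.search(plain, path)

print(len(plain), plain == raw)  # 2 True
print(len(found.group(0)))       # 1, a single literal backslash was matched
```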

Slide 38

Slide 38 text

11. Modify your behaviour

Slide 39

Slide 39 text

No content

Slide 40

Slide 40 text

Inline Modifiers
Setting: (?i)
Unsetting: (?-i)
Combined: (?im-sx)
Apply to subpattern (non-capturing): (?i:subp)
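
A sketch of inline modifiers in Python, used as a stand-in for PCRE. Two assumptions to note: Python accepts a bare `(?i)` only at the very start of a pattern, and unsetting is only available in the scoped form `(?-i:...)` (Python 3.6 or later), not as a bare `(?-i)`.

```python
import re

# (?i) at the very start: the whole pattern is case-insensitive.
whole = re.search(r"(?i)hello world", "HELLO WORLD")

# (?i:...) applies only to the subpattern; " World" stays case-sensitive.
scoped_hit = re.search(r"(?i:hello) World", "HELLO World")
scoped_miss = re.search(r"(?i:hello) World", "HELLO WORLD")

# (?-i:...) switches the modifier off again for one subpattern.
unset_hit = re.search(r"(?i)hello (?-i:World)", "HELLO World")
unset_miss = re.search(r"(?i)hello (?-i:World)", "HELLO WORLD")

print(bool(whole), bool(scoped_hit), bool(scoped_miss))  # True True False
print(bool(unset_hit), bool(unset_miss))                 # True False
```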

Slide 41

Slide 41 text

Explore

Slide 42

Slide 42 text

No content

Slide 43

Slide 43 text

/^((
    25[0-5]|            # Match 250-255 range
    2[0-4][0-9]|        # Match 200-249 range
    [01]?[0-9]{1,2}     # Match 0-199 range
)\.){3}                 # Repeat 3 times with period
(25[0-5]|2[0-4][0-9]|[01]?[0-9]{1,2})  # and once without
$/x
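
The same commented pattern can be tried in Python, where the x modifier is spelled `re.VERBOSE`. This is a sketch under that assumption; the name `IPV4` and the test addresses are made up for illustration.

```python
import re

# Same pattern as on the slide; re.VERBOSE plays the role of the x modifier,
# so whitespace and # comments inside the pattern are ignored.
IPV4 = re.compile(r"""
    ^((
        25[0-5]          |  # Match 250-255 range
        2[0-4][0-9]      |  # Match 200-249 range
        [01]?[0-9]{1,2}     # Match 0-199 range
    )\.){3}                 # Repeat 3 times with period
    (25[0-5]|2[0-4][0-9]|[01]?[0-9]{1,2})  # and once without
    $
""", re.VERBOSE)

valid = IPV4.match("192.168.0.1")
too_large = IPV4.match("256.1.1.1")
too_short = IPV4.match("1.2.3")

print(bool(valid), bool(too_large), bool(too_short))  # True False False
```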

Slide 44

Slide 44 text

No content

Slide 45

Slide 45 text

Match Array
[0] – Complete match
[1] – Match against sub-pattern 1
[2] – Match against sub-pattern 2
[3] – Match against sub-pattern 3
...
Photo by Petr Kratochvil
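
A minimal sketch of the match array in Python, where the entries are read with `group(n)` rather than by array index. The date pattern and the subject are made-up examples.

```python
import re

# Three capturing sub-patterns: year, month, day (made-up example).
m = re.search(r"(\d{4})-(\d{2})-(\d{2})", "Released on 2024-05-17.")

# group(0) is the complete match, group(1..3) are the sub-patterns.
match_array = [m.group(0), m.group(1), m.group(2), m.group(3)]

print(match_array)  # ['2024-05-17', '2024', '05', '17']
```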

Slide 46

Slide 46 text

(?<name>) (?P>name)

Slide 47

Slide 47 text

Match Array
[0] – Complete match
[firstname] – Match against named sub-pattern firstname
[lastname] – Match against named sub-pattern lastname
...
Photo by Petr Kratochvil
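
Named sub-patterns make the match array self-describing. A sketch in Python, which, as an assumption to keep in mind, only accepts the `(?P<name>...)` spelling for named groups. The subject is a made-up example.

```python
import re

# Python spelling of named sub-patterns: (?P<name>...)
pattern = re.compile(r"(?P<firstname>\w+) (?P<lastname>\w+)")
m = pattern.search("Ada Lovelace")  # made-up sample input

complete = m.group(0)
named = m.groupdict()

print(complete)  # Ada Lovelace
print(named)     # {'firstname': 'Ada', 'lastname': 'Lovelace'}
```

Named groups keep their numeric index as well, so `m.group(1)` and `m.group("firstname")` return the same text.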

Slide 48

Slide 48 text

Image by Gerd Altmann

Slide 49

Slide 49 text

— Richard Feynman Know how to solve every problem that has been solved. What I cannot create, I do not understand. Photo by Gleick, J. Genius. p. 310f

Slide 50

Slide 50 text

Advanced Features
Look around
Conditional sub-patterns
Recursion
Inline comments
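
Three of these features can be sketched in Python as a stand-in for PCRE. Recursion is left out because Python's built-in `re` module does not support it. All patterns and subjects below are made-up illustrations.

```python
import re

prices = "10 EUR, 20 USD, 30 EUR"  # made-up sample input

# Look-ahead: digits only when followed by " EUR"; the suffix is not consumed.
euros = re.findall(r"\d+(?= EUR)", prices)

# Look-behind: a currency code only when preceded by "20 ".
after_20 = re.findall(r"(?<=20 )[A-Z]{3}", prices)

# Conditional sub-pattern: require ">" only if group 1 matched "<".
tag = re.compile(r"^(<)?\w+(?(1)>)$")
tag_results = [bool(tag.match(s)) for s in ("<tag>", "tag", "<tag", "tag>")]

# Inline comment: (?#...) is ignored by the engine.
year = re.search(r"\d{4}(?#four-digit year)", "since 1997")

print(euros)          # ['10', '30']
print(after_20)       # ['USD']
print(tag_results)    # [True, True, False, False]
print(year.group(0))  # 1997
```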

Slide 51

Slide 51 text

Thanks! Any questions?
Feedback: https://joind.in/talk/462b2
Slides: https://speakerdeck.com/jrf
Course: https://www.pluralsight.com/courses/regular-expressions-fundamentals

Slide 52

Slide 52 text

No content