Regex-fu

Presented on September 10 2020 at the PHPBenelux virtual meetup.
https://www.meetup.com/phpbenelux/events/273015264/
---------------------------------------------------------------
Regular expressions: you either hate them or you love them, but do you really know how to harness their power? Based on the PCRE implementation, this talk will show you how to get the most out of your /^regex(es)?$/, how switches affect your results, how to be less greedy, how to assert your power and, let's not forget, when *not* to use regex.
---------------------------------------------------------------

Juliette Reinders Folmer

September 10, 2020

Transcript

  1. RegEx Fu Juliette Reinders Folmer @jrf_nl regexcheatsheets.com

  2. None
  3. Wildcards on Steroids

  4. Pattern Recognition

  5. Regex Engines: POSIX, PCRE, ECMAScript, Oniguruma, Boost, DEELX, RE2, TRE, Pattwo, GRETA, GLib/GRegex, FREJ, RGX, QT, CL-PPCRE, Jakarta, Henry Spencer's regex
  6. Regex Engines: Boost, DEELX, RE2, TRE, Pattwo, GRETA, GLib/GRegex, FREJ, RGX, QT, CL-PPCRE, Jakarta, Henry Spencer's regex, Oniguruma, POSIX, ECMAScript, PCRE
  7. Syntax Overlap

  8. PCRE

  9. Terminology: in /[a-z0-9]+/im, the regular expression is [a-z0-9]+, the delimiters are the two / characters, and the modifiers are im.

  10. Basic Syntax
    Literals: A a 1
    Wildcard: .
    Quantifiers: ? * + {#}
    Character ranges: [...]
    Grouping and alternation: ( ... | ... )
    Anchors: ^ ... $
    Shorthand character codes: \w \d \s
    Modifiers: g m s i
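The building blocks on this slide can be combined as in the minimal sketch below. It uses Python's `re` module as a stand-in for PCRE (an assumption; the talk itself is PCRE-based, and the two engines differ in places), and the sample text is hypothetical.

```python
import re

# Hypothetical sample text, chosen only for this illustration.
text = "Order A1 shipped, order b22 pending"

# Literal "order", the \s shorthand, one letter from a character range,
# then one or more digits (+ quantifier). re.IGNORECASE plays the role
# of the i modifier.
pattern = re.compile(r"order\s([a-z])(\d+)", re.IGNORECASE)
codes = [m.group(1) + m.group(2) for m in pattern.finditer(text)]
print(codes)

# Anchors plus grouping and alternation: the whole string must be "cat" or "dog".
pet = re.compile(r"^(cat|dog)$")
print(pet.search("dog") is not None, pet.search("hotdog") is not None)
```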
  11. Tips & Tricks

  12. 1. If you need a screwdriver, why use a hammer? (Photo by Scott Liddell)
  13. "Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems." Jamie Zawinski, August 1997, alt.religion.emacs
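As an illustration of the screwdriver-versus-hammer tip, the sketch below compares a regex with a plain string method for a simple suffix check. It is written in Python rather than PHP (an assumption made for this example), and the file name is hypothetical.

```python
import re

filename = "report-2020.csv"  # hypothetical input

# Regex version: works, but the dot must be escaped and the pattern anchored.
is_csv_regex = re.search(r"\.csv$", filename) is not None

# Plain string method: same answer here, with nothing to escape.
is_csv_plain = filename.endswith(".csv")

print(is_csv_regex, is_csv_plain)
```

When the check is a fixed prefix, suffix, or substring, the string function is usually the clearer tool.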
  14. 2. Nothing in life is to be feared. It is only to be understood. Marie Curie
  15. Allow listing vs. deny listing (diagram: an input string checked against each approach)
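A minimal sketch of the two approaches named on this slide, assuming Python's `re` module and two hypothetical helper functions (`sanitize_deny` and `is_valid_allow` are not from the talk):

```python
import re

def sanitize_deny(value):
    # Deny listing: strip characters known to be bad.
    # Anything that is not on the list slips through unchanged.
    return re.sub(r"[<>\"']", "", value)

def is_valid_allow(value):
    # Allow listing: accept only input made up of characters known to be good.
    return re.fullmatch(r"[A-Za-z0-9_-]{1,32}", value) is not None

print(sanitize_deny("<b>x</b>"))      # the angle brackets are removed
print(sanitize_deny("x;rm"))          # the semicolon was never on the deny list
print(is_valid_allow("user_name-1"))
print(is_valid_allow("x;rm"))
```

The deny list has to anticipate every bad input, while the allow list only has to describe the good ones.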

  16. 3. Not all matches are made in heaven... (Photo by Petr Kratochvil)
  17. 4. Only Elephants Remember Everything (© Photo by Juliette Reinders Folmer)
  18. (?:<expr>)
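The effect of the non-capturing group syntax can be sketched as follows, assuming Python's `re` module (which shares the `(?:...)` syntax with PCRE) and a hypothetical URL:

```python
import re

url = "https://www.example.com"  # hypothetical input

# Capturing groups: every parenthesised part is remembered.
with_captures = re.search(r"(https?)://(www\.)?([a-z.]+)", url)
print(with_captures.groups())

# Non-capturing groups: still grouped, but only the host is remembered.
host_only = re.search(r"(?:https?)://(?:www\.)?([a-z.]+)", url)
print(host_only.groups())
```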

  19. 5. Less is the new more

  20. Pattern /one.*s.?t[a-z]+p .{2,},/ built up one token at a time (o, on, one, one., one.*, one.*s, one.*s., one.*s.?, one.*s.?t, ...; the blank after p is a literal space)
    Subject: We take one step forward, two steps back
  21. / / We take one step back, two steps forward

  22. Reluctant Quantifiers: <unit>*? <unit>+? <unit>?? <unit>{n,}? <unit>{,m}? <unit>{n,m}?
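The difference between a greedy and a reluctant quantifier can be sketched on the subject string from the earlier slide. This assumes Python's `re` module, whose `*?` behaves like the PCRE form for this pattern.

```python
import re

subject = "We take one step forward, two steps back"

# Greedy: .* first grabs everything, then backtracks to the LAST "s".
greedy = re.search(r"one.*s", subject).group()

# Reluctant: .*? expands one character at a time and stops at the FIRST "s".
reluctant = re.search(r"one.*?s", subject).group()

print(repr(greedy))
print(repr(reluctant))
```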

  23. 6. Being negative isn't always a bad thing (© Photo by Juliette Reinders Folmer)
  24. [^<chars>]
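A negated character class is often a more precise alternative to a wildcard. The sketch below assumes Python's `re` module and a hypothetical HTML fragment:

```python
import re

tag = '<a href="x" title="y">'  # hypothetical input

# Wildcard: greedy .* runs on to the last double quote in the string.
with_wildcard = re.search(r'href="(.*)"', tag).group(1)

# Negated class: [^"]* cannot cross a double quote, so it stops at the right place.
with_negation = re.search(r'href="([^"]*)"', tag).group(1)

print(repr(with_wildcard))
print(repr(with_negation))
```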

  25. 7. Explore Your Boundaries (Photo by Miguel A.C. Domingo)

  26. Beginning of string, beginning of line, word boundaries, end of string, end of line
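The anchors and boundaries listed above can be sketched as follows. This assumes Python's `re` module, where `re.MULTILINE` plays the role of the m modifier; the sample text is hypothetical.

```python
import re

text = "cat scatter\ncat"  # hypothetical two-line input

anywhere = re.findall(r"cat", text)                        # no boundaries at all
whole_words = re.findall(r"\bcat\b", text)                 # word boundaries
string_start = re.findall(r"^cat", text)                   # ^ = beginning of string
line_starts = re.findall(r"^cat", text, re.MULTILINE)      # ^ = beginning of each line
string_end = re.findall(r"cat$", text)                     # $ = end of string

print(len(anywhere), len(whole_words), len(string_start),
      len(line_starts), len(string_end))
```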
  27. 8. The first love is the deepest...

  28. /#?([A-F0-9]{6}|[A-F0-9]{3})/i
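The order of the alternatives in the pattern above matters, because the engine takes the first alternative that lets the match succeed. A minimal sketch, assuming Python's `re` module (which treats alternation the same way here) and a hypothetical colour value:

```python
import re

colour = "#ffcc00"  # hypothetical input

# Six characters first, as on the slide: the full value is captured.
six_first = re.compile(r"#?([A-F0-9]{6}|[A-F0-9]{3})", re.IGNORECASE)

# Three characters first: the engine is satisfied after three and stops.
three_first = re.compile(r"#?([A-F0-9]{3}|[A-F0-9]{6})", re.IGNORECASE)

print(six_first.search(colour).group(1))
print(three_first.search(colour).group(1))
```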

  29. 9. What's this global village people keep talking about ???

  30. Character classes and their PCRE and POSIX equivalents (each with its negated form)
    Character classes                  PCRE      POSIX
    [0-9]          [^0-9]              \d  \D    [[:digit:]]  [^[:digit:]]
    [A-Za-z0-9_]   [^A-Za-z0-9_]       \w  \W    [[:word:]]   [^[:word:]]
    [\t\f\r\n \v]  [^\t\f\r\n \v]      \s  \S    [[:space:]]  [^[:space:]]
    [\t\f ]        [^\t\f ]            \h  \H    [[:blank:]]  [^[:blank:]]
    [\r\n]         [^\r\n]             \v  \V    -            -
  31. Subject "déjà vu" matched with [\w ]+ under French (fr) and under English (en)
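What `\w` matches depends on the engine's notion of a "word character". The sketch below shows the same idea with Python's `re` module, where the switch is the `re.ASCII` flag rather than a locale (an assumption that differs from the PCRE locale behaviour on the slide).

```python
import re

subject = "d\u00e9j\u00e0 vu"  # "déjà vu" with precomposed accented letters

# Default for str patterns in Python 3: \w is Unicode-aware.
unicode_match = re.match(r"[\w ]+", subject).group()

# ASCII-only: \w is just [A-Za-z0-9_], so the match stops before "é".
ascii_match = re.match(r"[\w ]+", subject, re.ASCII).group()

print(unicode_match)
print(ascii_match)
```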
  32. 10. Escape and escape again

  33. What to Escape?
    String delimiter: for the programming language
    Regex delimiter: for regex, for the programming language
    Meta-characters: for regex, for the programming language
  34. Escaping Meta Characters
    Special meaning: [ ] ( ) | . ? * + { } ^ $ \ / (delimiter)
    Literals: \[ \] \( \) \| \. \? \* \+ \{ \} \^ \$ \\ \/
  35. Escaping Meta Characters
    Special meaning: [ ] ( ) | . ? * + { } ^ $ \ / (delimiter)
    Literals: [(] [)] [|] [.] [?] [*] [+] [{] [}] [$] [/]
  36. Escaping Arbitrary Strings
    Java: String.quote(), quoteReplacement()
    PHP: preg_quote()
    Matlab: regexptranslate()
    Python: re.escape()
    Objective-C: escapedTemplateForString(), escapedPatternForString()
    Ruby: Regexp.escape(), Regexp.quote()
    // Javascript:
    function escapeInputString( str ) {
        return str.replace(/[[\]\/\\{}()|?+^$*.-]/g, "\\$&");
    }
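As a sketch of why these helpers exist, the example below uses Python's `re.escape()` from the list above. The user input and the haystack are hypothetical.

```python
import re

user_input = "1+1=2 (really?)"          # hypothetical input with meta-characters
haystack = "Is 1+1=2 (really?) true"    # hypothetical text to search

# Unescaped: +, ( ) and ? keep their special meaning, so the literal text is not found.
unescaped_hit = re.search(user_input, haystack)

# Escaped: every meta-character is turned into a literal.
escaped_hit = re.search(re.escape(user_input), haystack)

print(unescaped_hit is None)
print(escaped_hit.group())
```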
  37. Matching a Literal Backslash: \\\\
    \ is the actual backslash
    \\ is that backslash escaped for use in regex
    \\\\ is the regex form escaped again for the string literal
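The double escaping can be checked with a minimal sketch. It assumes Python, where a normal string literal needs four backslashes and a raw string needs two; the path is hypothetical.

```python
import re

path = "C:\\temp"  # the actual string contains ONE backslash

# Normal string literal: four backslashes in source become two in the
# pattern, which the regex engine reads as one literal backslash.
pattern_plain = "\\\\"

# Raw string literal: the string-escape layer is skipped, so two suffice.
pattern_raw = r"\\"

print(pattern_plain == pattern_raw, len(pattern_plain))
print(re.findall(pattern_plain, path))
```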
  38. 11. Modify your behaviour

  39. None
  40. Inline Modifiers
    Setting: (?i)
    Unsetting: (?-i)
    Combined: (?im-sx)
    Apply to subpattern (non-capturing): (?i:subp)
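Inline modifiers can be sketched as below, assuming Python's `re` module. Note the assumption: Python accepts a global `(?i)` only at the start of the pattern and supports unsetting only in the scoped form `(?-i:...)`, whereas PCRE also allows a bare `(?-i)` mid-pattern. The sample strings are hypothetical.

```python
import re

# Setting: (?i) at the start makes the whole pattern case-insensitive.
whole = re.search(r"(?i)hello world", "HELLO World")

# Apply to subpattern: only "hello" is case-insensitive.
scoped_ok = re.search(r"(?i:hello) World", "HELLO World")
scoped_fail = re.search(r"(?i:hello) World", "HELLO world")

# Unsetting for a subpattern: "World" must match case exactly.
unset_fail = re.search(r"(?i)hello (?-i:World)", "HELLO world")

print(whole is not None, scoped_ok is not None,
      scoped_fail is None, unset_fail is None)
```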
  41. Explore

  42. None
  43. /^((
          25[0-5]|            # Match 250-255 range
          2[0-4][0-9]|        # Match 200-249 range
          [01]?[0-9]{1,2}     # Match 0-199 range
        )\.){3}               # Repeat 3 times with period
        (25[0-5]|2[0-4][0-9]|[01]?[0-9]{1,2})   # and once without
      $/x
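The commented pattern above can be tried out with the sketch below, assuming Python's `re` module, where `re.VERBOSE` corresponds to the x modifier. The test addresses are hypothetical.

```python
import re

# Same pattern as on the slide; whitespace and # comments are ignored
# by the engine because of re.VERBOSE.
IPV4 = re.compile(r"""
    ^((
        25[0-5]|            # Match 250-255 range
        2[0-4][0-9]|        # Match 200-249 range
        [01]?[0-9]{1,2}     # Match 0-199 range
    )\.){3}                 # Repeat 3 times with period
    (25[0-5]|2[0-4][0-9]|[01]?[0-9]{1,2})   # and once without
    $
""", re.VERBOSE)

for candidate in ["192.168.0.1", "255.255.255.255", "256.1.1.1", "1.2.3"]:
    print(candidate, IPV4.search(candidate) is not None)
```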
  44. None
  45. Match Array (Photo by Petr Kratochvil)
    [0] – Complete match
    [1] – Match against sub-pattern 1
    [2] – Match against sub-pattern 2
    [3] – Match against sub-pattern 3
    ...
  46. (?<name><expr>) (?P>name)

  47. Match Array (Photo by Petr Kratochvil)
    [0] – Complete match
    [firstname] – Match against named sub-pattern firstname
    [lastname] – Match against named sub-pattern lastname
    ...
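Named sub-patterns and the resulting match array can be sketched as follows. This assumes Python's `re` module, which uses the `(?P<name>...)` spelling (also accepted by PCRE) and has no equivalent of the `(?P>name)` subroutine call. The name used is hypothetical.

```python
import re

match = re.match(r"(?P<firstname>\w+) (?P<lastname>\w+)", "Ada Lovelace")

print(match.group(0))          # complete match
print(match["firstname"])      # named sub-pattern firstname
print(match["lastname"])       # named sub-pattern lastname
print(match.groupdict())       # all named sub-patterns as a dict
```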
  48. Image by Gerd Altmann

  49. "Know how to solve every problem that has been solved." "What I cannot create, I do not understand." Richard Feynman (Photo by Gleick, J. Genius. p. 310f)
  50. Advanced Features: look around, conditional sub-patterns, recursion, inline comments
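Three of the four advanced features can be sketched with Python's `re` module (an assumption; the standard `re` module has no recursion, so that feature is left out). All sample strings are hypothetical.

```python
import re

# Look around: lookahead demands a digit somewhere without consuming it;
# lookbehind demands a preceding dollar sign without including it.
has_digit = re.search(r"^(?=.*\d)\w{8,}$", "passw0rdX")
no_digit = re.search(r"^(?=.*\d)\w{8,}$", "password")
amounts = re.findall(r"(?<=\$)\d+", "cost $15 or 20")

# Conditional sub-pattern: a closing ">" is required only if group 1
# (the opening "<") took part in the match.
cond = re.compile(r"^(<)?\w+(?(1)>)$")

# Inline comments: (?#...) is ignored by the engine.
dated = re.search(r"\d{4}(?#year)-\d{2}(?#month)", "2020-09")

print(has_digit is not None, no_digit is None, amounts)
print([cond.match(s) is not None for s in ["<tag>", "tag", "<tag", "tag>"]])
print(dated.group())
```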

  51. Thanks! Any questions? Feedback: https://joind.in/talk/462b2 Slides: https://speakerdeck.com/jrf Course: https://www.pluralsight.com/courses/regular-expressions-fundamentals
  52. None