Regex Fundamentals

Presented on September 16 2016 at Codemotion Warsaw, Warsaw, Poland.
http://warsaw2016.codemotionworld.com/
---------------------------------------------------------------
Whether you develop in PHP, JavaScript, Ruby, Python or even HTML5, knowing how to write regular expressions is an essential skill for any developer. While a lot of people consider regexes to be black magic, used appropriately they're an awesome power tool in your tool belt. This talk will take you on a little walk on the wild side, show you how to build effective regular expressions, introduce you to some of the more advanced features and teach you some useful tips and tricks of the trade.
---------------------------------------------------------------

Links:
http://regexcheatsheets.com/
https://www.pluralsight.com/courses/regular-expressions-fundamentals

Juliette Reinders Folmer

September 16, 2016

Transcript

  1. Regular Expression Fundamentals
    Juliette Reinders Folmer
    @jrf_nl
    regexcheatsheets.com

  2. pluralsight.com

  3. Wildcards on Steroids

  4. Pattern Recognition

  5. Patterns in text strings
    Serial numbers
    Barcodes
    Flight numbers
    CSV files
    Log files
    Email headers
    Twitter handles
    Facebook usernames
    Skype usernames
    MD5 hashes
    Sentences
    Good passwords
    ISBN numbers
    HTML code
    HTML tags
    HTML attributes
    CSS code
    URLs
    Email addresses
    File names
    File extensions
    Directory paths
    Postal codes
    Telephone numbers
    Number plates
    Credit card numbers
    Bank account numbers
    Mathematical formulas
    Elements from the periodic table

  6. Whitelisting    Blacklisting
    Input string    Input string
    ?               ?

  7. Typical Uses for Regular Expressions
    Search (and replace)
    String parsing
    Data mapping
    Syntax highlighting
    Data scraping
    Input validation

  8. Users of Regular Expressions
    (Sys-)Admins: file system, server directives
    Data Professionals: query data
    Developers: working with strings

  9. How It Works

  10. Subject String
    Subject

  11. How It Works
    Pattern Regex
    Subject
    Function
    Engine
    Result

  12. Result Types
    What are the matches?
    How many matches have been found?
    Does it match?
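
The three result types usually correspond to different function calls. A minimal sketch with Python's `re` module follows; the subject string and pattern are made up for illustration and do not come from the slides.

```python
import re

# Hypothetical subject and pattern, chosen only for illustration.
subject = "AB 12 34, CD 56 78"
pattern = re.compile(r"[A-Z]{2} \d{2} \d{2}")

# Does it match?
does_match = pattern.search(subject) is not None

# What are the matches?
matches = pattern.findall(subject)

# How many matches have been found?
match_count = len(matches)

print(does_match, match_count, matches)
```

Asking only "does it match?" lets the engine stop at the first hit, which is why most libraries expose it as a separate call.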

  13. Regex Engines
    POSIX
    PCRE
    ECMAScript
    Oniguruma
    Boost
    DEELX
    RE2
    TRE
    Pattwo
    GRETA
    GLib/GRegex
    FREJ
    RGX
    Qt
    CL-PPCRE
    Jakarta
    Henry Spencer’s regex

  14. Syntax Overlap

  15. Still with me?

  16. Terminology
    /[a-z0-9]+/im   Regular Expression
    /[a-z0-9]+/im   Delimiters (the two slashes)
    /[a-z0-9]+/im   Modifiers (the trailing "im")
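
Not every language uses delimiters. As a rough sketch, this is how `/[a-z0-9]+/im` could be written in Python, where the pattern is a plain string and the modifiers become flags. The sample text is an assumption made for the example.

```python
import re

# /[a-z0-9]+/im in delimiter-based syntax: Python has no delimiters,
# and the modifiers i and m are passed as flags instead.
regex = re.compile(r"[a-z0-9]+", re.IGNORECASE | re.MULTILINE)

# Illustrative sample text (not from the slides).
tokens = regex.findall("Regex 101: Wildcards on Steroids")
print(tokens)
```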

  17. Basic Syntax
    Literals                   A a 1
    Wildcard                   .
    Quantifiers                ? * + {#}
    Character ranges           [...]
    Grouping and alternation   ( ... | ... )
    Anchors                    ^ ... $
    Shorthand character codes  \w \d \s
    Modifiers                  g m s i

  18. The Pattern
    # AB 12 34
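
The transcript does not include the pattern the talk built for this string. The sketch below is one possible pattern that combines the basic syntax elements, under an assumed format: an optional "# " prefix, two letters, then two groups of two digits.

```python
import re

# Assumed format (not stated in the slides): optional "# " prefix,
# two letters, then two space-separated groups of two digits.
pattern = re.compile(
    r"^(?:# ?)?"      # anchor + optional literal "#" and optional space
    r"[a-z]{2}"       # character range with a quantifier
    r"(?: \d{2}){2}"  # group repeated twice: space + shorthand \d
    r"$",             # anchor: end of string
    re.IGNORECASE,    # modifier: match letters case-insensitively
)

accepted = [s for s in ("# AB 12 34", "ab 12 34", "# ABC 12 34") if pattern.match(s)]
print(accepted)
```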

  19. Visualization of the Pattern

  20. Tips & Tricks

  21. 1. If you need a screwdriver, why use a hammer?
    Photo by Scott Liddell

  22. Some people, when confronted with a problem, think
    "I know, I'll use regular expressions."
    Now they have two problems.
    Jamie Zawinski, August 1997, alt.religion.emacs

  23. 2. Not all matches are made in heaven...
    Photo by Petr Kratochvil

  24. 3. Only Elephants Remember Everything
    © Photo by Juliette Reinders Folmer

  25. 4. Being negative isn't always a bad thing
    © Photo by Juliette Reinders Folmer

  26. 5. Less is the new more

  27. / /
    Subject: We take one step forward, two steps back
    Pattern, built up one token at a time:
    o
    on
    one
    one.
    one.*
    one.*s
    one.*s.
    one.*s.?
    one.*s.?t
    one.*s.?t[a-z]
    one.*s.?t[a-z]+
    one.*s.?t[a-z]+p
    one.*s.?t[a-z]+p   (followed by a space)
    one.*s.?t[a-z]+p .
    one.*s.?t[a-z]+p .{2,}
    one.*s.?t[a-z]+p .{2,},

  28. / /
    We take one step back, two steps forward
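
The greedy pattern `one.*s.?t[a-z]+p .{2,},` can be run against its subject to see what it finally matches. The reluctant variant below is an assumption (the transcript does not show it) and is included only to illustrate that a different search strategy can reach the same match.

```python
import re

subject = "We take one step forward, two steps back"

# Greedy pattern: .* first swallows the rest of the string, then the
# engine backtracks until the remaining tokens fit.
greedy = re.compile(r"one.*s.?t[a-z]+p .{2,},")

# Assumed reluctant variant (not in the transcript): each quantifier
# takes as little as possible and only expands when needed.
reluctant = re.compile(r"one.*?s.??t[a-z]+?p .{2,}?,")

greedy_match = greedy.search(subject).group(0)
reluctant_match = reluctant.search(subject).group(0)
print(greedy_match)
print(reluctant_match)
```

Both searches end on the same text for this subject. What differs is the route the engine takes: the greedy version overshoots and steps back, while the reluctant version advances one character at a time.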

  29. Reluctant Quantifiers
    {,m}?
    {n,}?
    {n,m}?
    *?
    +?
    ??
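
Reluctant quantifiers change the result whenever more than one end point is possible. A small sketch in Python, using made-up sample strings:

```python
import re

# Illustrative HTML fragment (not from the slides).
subject = '<a href="one.html">one</a> <a href="two.html">two</a>'

# Greedy: .* runs to the LAST double quote in the string.
greedy = re.findall(r'href="(.*)"', subject)

# Reluctant: .*? stops at the FIRST double quote that lets the match succeed.
reluctant = re.findall(r'href="(.*?)"', subject)

# The same applies to counted quantifiers.
greedy_digits = re.search(r"\d{2,4}", "12345").group(0)
reluctant_digits = re.search(r"\d{2,4}?", "12345").group(0)

print(greedy)
print(reluctant)
print(greedy_digits, reluctant_digits)
```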

  30. 6. Explore Your Boundaries
    Photo by Miguel A.C. Domingo

  31. Beginning of string
    Beginning of line
    Word boundaries
    End of string
    End of line
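
The transcript lists the boundary types without their symbols. In Python's `re` module they are spelled `\A` and `\Z` (string), `^` and `$` with `re.MULTILINE` (line), and `\b` (word); other engines may differ. A short sketch with an illustrative subject:

```python
import re

# Illustrative three-line subject (not from the slides).
subject = "cat\nconcatenate\ncat"

anywhere = re.findall(r"cat", subject)                     # no boundaries at all
start_of_string = re.findall(r"\Acat", subject)            # beginning of string
end_of_string = re.findall(r"cat\Z", subject)              # end of string
whole_lines = re.findall(r"^cat$", subject, re.MULTILINE)  # beginning/end of line
whole_words = re.findall(r"\bcat\b", subject)              # word boundaries

print(len(anywhere), len(start_of_string), len(end_of_string),
      len(whole_lines), len(whole_words))
```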

  32. 7. The first love is the deepest...

  33. /#?([A-F0-9]{6}|[A-F0-9]{3})/i
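
In a backtracking engine the first alternative that succeeds wins, so the order inside the group matters. The sketch below compares the pattern above with a variant whose alternatives are swapped; the swapped variant is an assumption added for contrast.

```python
import re

# Pattern from the slide: try six hex digits first, then three.
six_first = re.compile(r"#?([A-F0-9]{6}|[A-F0-9]{3})", re.IGNORECASE)

# Hypothetical variant with the alternatives swapped.
three_first = re.compile(r"#?([A-F0-9]{3}|[A-F0-9]{6})", re.IGNORECASE)

long_with_six_first = six_first.search("#ffcc00").group(1)
long_with_three_first = three_first.search("#ffcc00").group(1)
short_with_six_first = six_first.search("#fc0").group(1)

print(long_with_six_first, long_with_three_first, short_with_six_first)
```

With the three-digit alternative first, a six-digit colour is cut off after three digits because the engine never needs to try the second alternative.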

  34. 8. What's this global village people keep talking about???

  35. Character classes
    Character class  Negated           PCRE  Negated  POSIX        Negated
    [0-9]            [^0-9]            \d    \D       [[:digit:]]  [^[:digit:]]
    [A-Za-z0-9_]     [^A-Za-z0-9_]     \w    \W       [[:word:]]   [^[:word:]]
    [\t\f\r\n \v]    [^\t\f\r\n \v]    \s    \S       [[:space:]]  [^[:space:]]
    [\t\f ]          [^\t\f ]          \h    \H       [[:blank:]]  [^[:blank:]]
    [\r\n]           [^\r\n]           \v    \V       -            -
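
The first three rows of the table can be checked directly. The sketch uses Python, whose built-in `re` module supports `\d`, `\w` and `\s` but neither `\h` nor POSIX bracket expressions, so only those rows are covered. `re.ASCII` is set because Python's shorthands are Unicode-aware by default. The sample string is an assumption.

```python
import re

# Illustrative sample containing letters, digits, underscore and whitespace.
sample = "Tab\there, digits 42, under_score\r\n"

# (shorthand, explicit class) pairs taken from the table.
pairs = [
    (r"\d", r"[0-9]"),
    (r"\D", r"[^0-9]"),
    (r"\w", r"[A-Za-z0-9_]"),
    (r"\W", r"[^A-Za-z0-9_]"),
    (r"\s", r"[\t\f\r\n \v]"),
    (r"\S", r"[^\t\f\r\n \v]"),
]

# With re.ASCII, each shorthand should find exactly what its explicit class finds.
results = {
    shorthand: re.findall(shorthand, sample, re.ASCII) == re.findall(explicit, sample)
    for shorthand, explicit in pairs
}
print(results)
```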

  36. déjà vu   [\w ]+   French (fr)
    déjà vu   [\w ]+   English (en)
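
What `\w` matches depends on the engine's idea of a "word character". Python does not consult the locale for text patterns, so the closest analogue is shown here as a stand-in: Unicode-aware matching (the default) versus `re.ASCII`. This is a different mechanism from the locale switch shown above.

```python
import re

# "déjà vu" written with escapes so the accented letters are
# guaranteed to be single precomposed code points.
subject = "d\u00e9j\u00e0 vu"

# Default for str patterns: \w is Unicode-aware and accepts é and à.
unicode_match = re.match(r"[\w ]+", subject).group(0)

# re.ASCII restricts \w to [A-Za-z0-9_], so matching stops before é.
ascii_match = re.match(r"[\w ]+", subject, re.ASCII).group(0)

print(unicode_match)
print(ascii_match)
```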

  37. 9. Escape and escape again

  38. Escaping Meta Characters
    Special Meaning: [ ] ( ) | . ? * + { } ^ $ \ / (delimiter)
    Literals:        \[ \] \( \) \| \. \? \* \+ \{ \} \^ \$ \\ \/

  39. Escaping Meta Characters
    Special Meaning: [ ] ( ) | . ? * + { } ^ $ \ / (delimiter)
    Literals:        [(] [)] [|] [.] [?] [*] [+] [{] [}] [$] [/]
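
Both escaping styles, a backslash or a single-character class, turn a meta character into a literal. A minimal Python sketch using the dot, with an illustrative subject:

```python
import re

# Illustrative subject (not from the slides).
subject = "1.5 and 135"

unescaped = re.findall(r"1.5", subject)     # . is a wildcard: any character
backslash = re.findall(r"1\.5", subject)    # \. is a literal dot
char_class = re.findall(r"1[.]5", subject)  # [.] is also a literal dot

print(unescaped, backslash, char_class)
```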

  40. Escaping Arbitrary Strings
    Java          Pattern.quote()
                  Matcher.quoteReplacement()
    PHP           preg_quote()
    Matlab        regexptranslate()
    Python        re.escape()
    Objective-C   escapedTemplateForString()
                  escapedPatternForString()
    Ruby          Regexp.escape()
                  Regexp.quote()

    // Javascript:
    function escapeInputString( str ) {
        // Prefix every regex meta character with a backslash ("$&" is the matched text).
        return str.replace(/[[\]\/\\{}()|?+^$*.-]/g, "\\$&");
    }
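
As an illustration of why arbitrary input should be escaped before it is used inside a pattern, here is a small sketch with Python's `re.escape()`. The input string and the text being searched are made up for the example.

```python
import re

# Hypothetical user input containing several meta characters.
user_input = "1+1=2 (really?)"
text = "I heard 1+1=2 (really?) today"

# Unescaped, "+", "(", ")" and "?" keep their special meaning,
# so the pattern no longer describes the literal input.
unescaped_hit = re.search(user_input, text)

# Escaped, every meta character is matched literally.
escaped_hit = re.search(re.escape(user_input), text)

print(unescaped_hit)
print(escaped_hit.group(0))
```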

  41. Matching a Literal Backslash
    \      The actual backslash
    \\     Escaping for use in regex
    \\\\   String escape
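
The two escaping layers can be made visible in Python, where a raw string literal removes the string-escape layer. A minimal sketch; the path value is an assumption chosen for the example:

```python
import re

# The source text "C:\\temp" is the 7-character string C:\temp.
path = "C:\\temp"

# Regular string literal: four backslashes in source -> two in the
# pattern -> one literal backslash matched.
from_plain_literal = re.compile("\\\\")

# Raw string literal: no string-escape layer, so two backslashes suffice.
from_raw_literal = re.compile(r"\\")

same_pattern = from_plain_literal.pattern == from_raw_literal.pattern
found = from_plain_literal.findall(path)
print(same_pattern, len(path), found)
```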

  42. 10. Modify your behaviour

  43. Inline Modifiers
    Setting:                              (?i)
    Unsetting:                            (?-i)
    Combined:                             (?im-sx)
    Apply to subpattern (non-capturing):  (?i:subp)
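
Support for inline modifiers varies per engine. The sketch below shows the subset that Python's `re` module accepts: setting modifiers at the very start of the pattern, and setting or unsetting them for a sub-pattern. Recent Python versions reject a pattern-wide modifier that is not at the start, and there is no standalone `(?-i)`, so those forms are left out. The sample strings are assumptions.

```python
import re

# Setting for the whole pattern (must be at the very start in Python).
whole = re.fullmatch(r"(?i)regex", "ReGeX")

# Apply to a sub-pattern only: "reg" is case-insensitive, "ex" is not.
scoped_hit = re.fullmatch(r"(?i:reg)ex", "REGex")
scoped_miss = re.fullmatch(r"(?i:reg)ex", "REGEX")

# Unsetting for a sub-pattern: case-insensitive except for "ex".
unset_hit = re.fullmatch(r"(?i)reg(?-i:ex)", "REGex")
unset_miss = re.fullmatch(r"(?i)reg(?-i:ex)", "REGEX")

# Combined: i and m together.
combined = re.search(r"(?im)^warsaw$", "Codemotion\nWarsaw\nPoland")

print(bool(whole), bool(scoped_hit), bool(scoped_miss),
      bool(unset_hit), bool(unset_miss), bool(combined))
```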

  44. Advanced Features
    Look around
    Named sub-matches
    Conditional sub-patterns
    Recursion
    Inline comments
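
A few of these features, sketched in Python with made-up sample data. Python's built-in `re` module does not support recursion, so that feature is not shown. The syntax is Python's and may differ in other engines, for example `(?P<name>...)` for named sub-matches.

```python
import re

# Named sub-matches plus inline comments (verbose mode ignores
# whitespace in the pattern and treats # as a comment marker).
date = re.compile(
    r"""
    (?P<year>\d{4}) -    # four-digit year
    (?P<month>\d{2}) -   # two-digit month
    (?P<day>\d{2})       # two-digit day
    """,
    re.VERBOSE,
)
parts = date.search("Presented on 2016-09-16 in Warsaw").groupdict()

# Look around: digits only when followed by (an optional space and) EUR.
euro_amounts = re.findall(r"\d+(?= ?EUR)", "10 EUR, 20 USD, 30EUR")

# Conditional sub-pattern: require ">" only if "<" was matched as group 1.
tag = re.compile(r"(<)?\w+(?(1)>)")
tag_results = [bool(tag.fullmatch(s)) for s in ("<tag>", "tag", "<tag", "tag>")]

print(parts)
print(euro_amounts)
print(tag_results)
```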

  45. Thanks!
    Any questions?
    Slides: https://speakerdeck.com/jrf
    Course: https://www.pluralsight.com/courses/regular-expressions-fundamentals
