UBC STAT545 2015 cm103: Introduction to Regular Expressions

Kieran Samuk
October 29, 2015

One part of a STAT545 lecture from 2015.

Transcript

  1. Regex Anatomy

     ^[Hh]e walked [0-9]* meters$

     “Literals”: Normal letters and digits (+ spaces)
     “Metacharacters”: Special characters with regex-specific functions
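
As a rough illustration of the anatomy above, the following Python sketch runs the slide's pattern through the standard `re` module. The test sentences are made up for illustration and do not come from the lecture.

```python
import re

# The pattern from the slide: literal text plus the metacharacters ^ [ ] * $
pattern = re.compile(r"^[Hh]e walked [0-9]* meters$")

# Hypothetical test strings (assumptions, not from the lecture)
sentences = [
    "He walked 12 meters",        # capital H, two digits
    "he walked 7 meters",         # lowercase h, one digit
    "She walked 12 meters",       # fails: ^ requires the string to start with H or h
    "He walked 12 meters today",  # fails: $ requires the string to end after "meters"
    "He walked  meters",          # matches: [0-9]* allows zero digits
]

matches = [s for s in sentences if pattern.search(s)]
print(matches)
```

The last sentence matches because `*` accepts zero repetitions, so the digits are optional while both surrounding spaces are still required literals.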
  3. Metacharacters: Groups and Ranges

     .         Any character
     [AaBb]    A or a or B or b
     [A-Z]     A or B or C, … Z
     [0-9]     0 or 1 or 2, … 9
     [^A-Z]    Everything but capitals
     (it|the)  “it” OR “the”
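
The groups and ranges on this slide can be tried out with a short Python sketch like the one below. The sample text is an assumption chosen so that each pattern finds something. Note that in Python `.` does not match a newline unless the `re.DOTALL` flag is set.

```python
import re

# Hypothetical sample text (not from the lecture)
text = "Abba saw it at the Zoo in 2015"

any_char = re.findall(r".", "a1!")               # . matches any single character
set_hits = re.findall(r"[AaBb]", text)           # any one of A, a, B, b
capitals = re.findall(r"[A-Z]", text)            # any capital letter
digits = re.findall(r"[0-9]", text)              # any digit
only_capitals = re.sub(r"[^A-Z]", "", text)      # delete everything but capitals
alternatives = re.findall(r"(it|the)", text)     # "it" OR "the"

print(any_char, set_hits, capitals, digits, only_capitals, alternatives)
```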
  4. Metacharacters: Quantifiers

     *      Zero or more times
     +      One or more times
     ?      Zero or one times
     {3}    Exactly 3 times
     {1,3}  1 to 3 times
     {3,}   3 or more times
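
One way to see what each quantifier allows is to test it against runs of the same letter. The sketch below is a minimal example; the helper name `matching_lengths` and the choice of the letter `a` are assumptions made for illustration.

```python
import re

def matching_lengths(pattern, max_len=5):
    """Return the run lengths n (0..max_len) for which 'a' * n fully matches pattern."""
    return [n for n in range(max_len + 1) if re.fullmatch(pattern, "a" * n)]

# Each quantifier from the slide, applied to the literal "a"
results = {
    "a*": matching_lengths(r"a*"),          # zero or more times
    "a+": matching_lengths(r"a+"),          # one or more times
    "a?": matching_lengths(r"a?"),          # zero or one times
    "a{3}": matching_lengths(r"a{3}"),      # exactly 3 times
    "a{1,3}": matching_lengths(r"a{1,3}"),  # 1 to 3 times
    "a{3,}": matching_lengths(r"a{3,}"),    # 3 or more times
}

for pattern, lengths in results.items():
    print(pattern, lengths)
```

`re.fullmatch` is used here so that the whole string has to be consumed; with `re.search`, a pattern like `a?` would report a match on any string, because it can match the empty string.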
  5. Metacharacters: Other

     ^       Start of a string
     $       End of a string
     \       Escape (meta to literal)
     \w, \W  [A-Za-z0-9_], [^A-Za-z0-9_]
     \d, \D  [0-9], [^0-9]
     \s, \S  Whitespace (space, tab, newline, carriage return, etc.), non-whitespace
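
The anchors, the escape character, and the shorthand classes can be explored with a sketch along these lines. The sample text is hypothetical. In Python 3, `\w` also matches the underscore, and `\w` and `\d` cover Unicode letters and digits by default, so the `re.ASCII` flag is used here to keep them equivalent to the bracket forms.

```python
import re

# Hypothetical sample text (not from the lecture); \t is a tab character
text = "Cost: $5.00 for item_42\tnow"

starts_with_cost = bool(re.search(r"^Cost", text))  # ^ anchors to the start
ends_with_now = bool(re.search(r"now$", text))      # $ anchors to the end
starts_with_now = bool(re.search(r"^now", text))    # "now" is not at the start

# \$ and \. are escaped, so they match a literal dollar sign and period
price = re.findall(r"\$\d+\.\d+", text, flags=re.ASCII)

words = re.findall(r"\w+", text, flags=re.ASCII)    # runs of word characters
whitespace = re.findall(r"\s", text)                # each whitespace character
chunks = re.findall(r"\S+", text)                   # runs of non-whitespace

print(starts_with_cost, ends_with_now, starts_with_now)
print(price, words, whitespace, chunks)
```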
  7. Regex Challenges!

     RULES
     1. Match ONLY the target elements
     2. Each discrete item must be a separate match
     3. THERE ARE CANDY PRIZES

     1. DNA sequences
     2. Email addresses
     3. Smilies
     4. HTML Tags (each tag separately)
     5. Phone numbers
     6. URLs
     7. Macho Man Randy Savage Quotations
     9. Citations
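
To make the first two rules concrete, here is one possible attempt at the DNA sequence challenge in Python. It is a sketch and not the lecture's answer: the sample sentence, the uppercase-only alphabet, and the minimum length of four bases are all assumptions, and `\b` (a word boundary) is a metacharacter that does not appear on the slides above.

```python
import re

# Hypothetical challenge text (not from the lecture)
text = "Primers ATGCGT and GATTACA were used, but cat is not a sequence."

# Assumptions: sequences are uppercase A/C/G/T only and at least 4 bases long.
# \b keeps the match from starting or ending in the middle of a longer word.
dna_pattern = re.compile(r"\b[ACGT]{4,}\b")

# findall returns each sequence as its own match (rule 2)
sequences = dna_pattern.findall(text)
print(sequences)
```

The length requirement is what stops ordinary words such as "a" from counting as targets (rule 1); a real data set might need a different threshold or a lowercase alphabet as well.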