The Language of Regular Expressions. So You Think You Can Speak It? by Rustam Mehmandarov

Riga Dev Day
March 13, 2016

Transcript

  1. The Language of Regular Expressions. So You Think You Can Speak It? Rustam Mehmandarov, Riga Dev Day
  2. The Vision http://xkcd.com/208/

  3. …but regex ain’t one! Oh, wait! http://xkcd.com/1171/

  4. The Reality

  5. None
  6. (([0-9]{4})-([0-9]{2})-([0-9]{2})).*\sINFO\s(.*)

  7. Log4j Log File

  8. .*(INFO|WARN).*

  9. .*(INFO|WARN).*

  10. (([0-9]{4})-([0-9]{2})-([0-9]{2})).*ERROR(.*)

  11. (([0-9]{4})-([0-9]{2})-([0-9]{2})).*ERROR(.*)
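The log-file patterns from the slides above can be tried out with a minimal Python sketch. The sample log lines and the helper names `levels` and `errors` are assumptions made for illustration; they are not taken from the talk, and a real Log4j layout may differ.

```python
import re

# Hypothetical log lines in a Log4j-like layout (illustrative only).
LOG_LINES = [
    "2014-09-09 10:15:32,123 INFO [com.example.logging.MyLog] Service started",
    "2014-09-09 10:15:33,456 WARN [com.example.logging.MyLog] Disk space low",
    "2014-09-10 11:00:01,789 ERROR [com.example.logging.MyLog] Connection lost",
]

# Patterns as they appear on the slides.
LEVEL_RE = re.compile(r".*(INFO|WARN).*")
ERROR_RE = re.compile(r"(([0-9]{4})-([0-9]{2})-([0-9]{2})).*ERROR(.*)")


def levels(lines):
    """Return the captured level for every line containing INFO or WARN."""
    found = []
    for line in lines:
        m = LEVEL_RE.match(line)
        if m:
            found.append(m.group(1))
    return found


def errors(lines):
    """Return (date, year, month, day, message) for every ERROR line."""
    found = []
    for line in lines:
        m = ERROR_RE.match(line)
        if m:
            # Group 1 is the whole date; groups 2-4 are its parts.
            found.append((m.group(1), m.group(2), m.group(3), m.group(4),
                          m.group(5).strip()))
    return found
```

Note that the outer parentheses around the date create group 1, so the year, month and day end up in groups 2 to 4.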

  12. ^.*192\.168\.0\.6[^9](.*)$

  13. ^.*192.168.0.6[^9](.*)

  14. ^.*192\.168\.0\.6[^9](.*)$

  15. ^.*192\.168\.0\.6[0-9]+[^9](.*)$
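The difference between the escaped and unescaped variants above can be checked with a short Python sketch. The sample strings and the helper name `matches` are assumptions chosen for illustration, not examples from the talk.

```python
import re

# Slide patterns: dots escaped vs. dots left as "any character".
ESCAPED = re.compile(r"^.*192\.168\.0\.6[^9](.*)$")
UNESCAPED = re.compile(r"^.*192.168.0.6[^9](.*)")


def matches(pattern, text):
    """Return True if the pattern matches from the start of the text."""
    return pattern.match(text) is not None
```

Two details are worth noticing. An unescaped `.` accepts any character, so `192x168y0z60` slips through the second pattern. And `[^9]` still has to consume one character, so an address ending in `.6` at the very end of a line does not match.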

  16. flavou?r

  17. Recap: Quantifiers * + ? {num} {num, num}
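A small Python sketch of the quantifiers listed in the recap, using `flavou?r` from the previous slide. The helper name `full` and the extra `a`/`b`/`c` patterns are assumptions added for illustration.

```python
import re


def full(pattern, text):
    """Return True if the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None


# ?      -> zero or one:   flavou?r matches "flavor" and "flavour"
# *      -> zero or more:  ab*c matches "ac", "abc", "abbc", ...
# +      -> one or more:   ab+c matches "abc" but not "ac"
# {n}    -> exactly n:     a{2} matches "aa"
# {n,m}  -> n to m times:  a{2,3} matches "aa" and "aaa"
EXAMPLES = {
    "flavou?r": ["flavor", "flavour"],
    "ab*c": ["ac", "abbc"],
    "ab+c": ["abc"],
    "a{2}": ["aa"],
    "a{2,3}": ["aa", "aaa"],
}
```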

  18. Recap: Grouping .*(INFO|WARN)(.*) .*(INFO|WARN)(?:.*)
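The two grouping patterns in the recap differ only in whether the trailing `.*` is captured. A minimal Python sketch, assuming the sample line used on the following slides and a hypothetical helper `groups_for`:

```python
import re

CAPTURING = re.compile(r".*(INFO|WARN)(.*)")
NON_CAPTURING = re.compile(r".*(INFO|WARN)(?:.*)")

# Sample line taken from the slides that follow.
LINE = "2014-09-09 WAR FILE WARN [com.example.logging.MyLog]"


def groups_for(pattern, text):
    """Return the tuple of captured groups, or None if there is no match."""
    m = pattern.match(text)
    return m.groups() if m else None
```

Both patterns match the same text; `(?:...)` only groups, so the second pattern exposes one captured group instead of two.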

  19. (WA) 2014-09-09 WAR FILE WARN [com.example.logging.MyLog]

  20. (?=WARN)WA 2014-09-09 WAR FILE WARN [com.example.logging.MyLog]

  21. (?=WARN)WA

  22. Recap: Lookaround (?=foo) -> Lookahead; (?<=foo) -> Lookbehind; (?!foo) -> Negative Lookahead; (?<!foo) -> Negative Lookbehind
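The four lookaround forms can be compared on the sample line from the slides. This is a Python sketch; the helper name `first_start` and the specific lookbehind and negative patterns are assumptions added for illustration.

```python
import re

# Sample line from the slides: "WA" occurs in both "WAR" and "WARN".
LINE = "2014-09-09 WAR FILE WARN [com.example.logging.MyLog]"


def first_start(pattern, text=LINE):
    """Return the index where the first match starts, or None."""
    m = re.search(pattern, text)
    return m.start() if m else None


WAR_AT = LINE.index("WAR FILE")   # position of the "WA" in "WAR"
WARN_AT = LINE.index("WARN")      # position of the "WA" in "WARN"
```

Lookarounds assert what surrounds the match without consuming it: `(?=WARN)WA` still matches only the two characters `WA`, but only at the position where `WARN` begins.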
  23. Lookahead: Example

  24. The list

  25. Backreferences

  26. Backreferences (contd.)
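The transcript does not preserve the content of the backreference slides, so the following Python sketch is only a generic illustration of the technique: `\1` refers back to whatever the first group captured. The doubled-word pattern and the helper name `doubled_words` are assumptions, not the speaker's example.

```python
import re

# \1 must repeat the exact text captured by group 1 (case-insensitively here).
DOUBLED = re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE)


def doubled_words(text):
    """Return each word that is immediately repeated in the text."""
    return [m.group(1) for m in DOUBLED.finditer(text)]
```

The trailing `\b` keeps a phrase such as "the there" from being reported as a repeated "the".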

  27. Text: OSDC is awesome! HTML: OSDC is <em>awesome</em>! Regex: <.+> Result:
  28. Text: OSDC is awesome! HTML: OSDC is <em>awesome</em>! Regex: <.+?> Result:
  29. None
  30. http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags

  31. Quantifiers Revisited. Greedy: *, +, ?, {num, num} Non-greedy: *?, +?, ??, {num, num}?
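Greedy and non-greedy matching can be compared on the HTML string from the earlier slides. A minimal Python sketch; the helper name `first_match` is an assumption for illustration.

```python
import re

# HTML string from the slides.
HTML = "OSDC is <em>awesome</em>!"


def first_match(pattern, text=HTML):
    """Return the text of the first match, or None."""
    m = re.search(pattern, text)
    return m.group() if m else None


GREEDY = first_match(r"<.+>")     # .+ grabs as much as possible
LAZY = first_match(r"<.+?>")      # .+? stops at the first possible ">"
ALL_TAGS = re.findall(r"<.+?>", HTML)
```

The greedy form runs to the last `>` on the line and swallows both tags with the text between them, while the lazy form stops at the first `>` and yields each tag separately.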
  32. Final Recap • Know your data! – Think what you should match – Think what you should not match • Know your flavor • Know your engine (DFA, NFA) – Backtracking • Greediness • Non-capturing parentheses • Anchors
  33. MOAR! EXAMPLES!

  34. Matching an IP - 1 Idea 1: ^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$ Result:

  35. Matching an IP - 2 Idea 2: ^\d\d\d\.\d\d\d\.\d\d\d\.\d\d\d$ Result:

  36. Matching an IP - 3 Idea 3: ^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$ Result:

  37. Matching an IP - 4 Idea 4: ^([01]?\d\d?|2[0-4]\d|25[0-5])\.([01]?\d\d?|2[0-4]\d|25[0-5])\.([01]?\d\d?|2[0-4]\d|25[0-5])\.([01]?\d\d?|2[0-4]\d|25[0-5])$ Result:
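The "Result" of each idea is not preserved in the transcript, so the Python sketch below simply runs the four patterns against a few sample addresses. The sample addresses and the helper name `accepts` are assumptions chosen for illustration; Idea 4 is written with `re.VERBOSE` so that it can be laid out over several lines.

```python
import re

IDEAS = {
    1: re.compile(r"^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$"),
    2: re.compile(r"^\d\d\d\.\d\d\d\.\d\d\d\.\d\d\d$"),
    3: re.compile(r"^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$"),
    # Whitespace inside the pattern is ignored in verbose mode.
    4: re.compile(r"""
        ^
        ([01]?\d\d?|2[0-4]\d|25[0-5])\.
        ([01]?\d\d?|2[0-4]\d|25[0-5])\.
        ([01]?\d\d?|2[0-4]\d|25[0-5])\.
        ([01]?\d\d?|2[0-4]\d|25[0-5])
        $
        """, re.VERBOSE),
}


def accepts(idea, address):
    """Return True if the numbered idea matches the address."""
    return IDEAS[idea].match(address) is not None
```

Idea 1 accepts octets of any length, Idea 2 rejects ordinary addresses that are not zero-padded, Idea 3 still lets values above 255 through, and Idea 4 limits each octet to the range 0 to 255.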
  38. Your New Reality! http://xkcd.com/208/

  39. Thank you! rmehmandarov rm@computas.com

  40. None