Regex in Depth

Explains the history of regex in depth and covers all of its features in a practice-oriented way!

Abdur-Rahmaan Janhangeer

November 20, 2022

Transcript

  1. Regex In Depth

  3. Python Mauritius UserGroup (pymug)
     More info: mscc.mu/python-mauritius-usergroup-pymug/

     Why: Where
     codes: github.com/pymug
     share events: twitter.com/pymugdotcom
     ping professionals: linkedin.com/company/pymug
     all info: pymug.com
     tell friends by like: facebook.com/pymug
  4. Abdur-Rahmaan Janhangeer
     Python
     Help people get into OpenSource

  5. Regex In Depth

  6. History of regex (from notes)

  7. import re
     pattern = re.compile(r'eee')
     result = pattern.match('eeee')
     print(result.group())  # eee
  8. import re
     pattern = re.compile(r'ike')
     result = pattern.search('like sike')
     print(result.group())  # ike
  9. import re
     pattern = re.compile(r'ike')
     result = pattern.findall('like sike')
     print(result)  # ['ike', 'ike']
  10. import re
      pattern = r'ike'
      string = 'like sike'
      result = re.findall(pattern, string)
      print(result)
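The three calls shown so far differ in where they look and what they return. The following is a small sketch that runs them side by side on the same input; the variable names are illustrative only.

```python
import re

pattern = re.compile(r'ike')
text = 'like sike'

# match() only tries at position 0, so it fails here: the text starts with 'l'.
at_start = pattern.match(text)

# search() scans forward and returns the first match anywhere in the string.
first = pattern.search(text)

# findall() returns every non-overlapping match as a list of strings.
every = pattern.findall(text)

print(at_start)       # None
print(first.group())  # ike
print(first.span())   # (1, 4)
print(every)          # ['ike', 'ike']
```

Because match() and search() return None when nothing matches, it is usually worth checking the result before calling .group() on it.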
  11. pattern = re.compile(r"...", re.MULTILINE)
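The slide leaves the pattern elided, so here is a hedged sketch of what the re.MULTILINE flag changes, using a made-up two-line string and a made-up pattern (both are assumptions, not the slide's own values).

```python
import re

# Hypothetical sample text, chosen only to show the effect of the flag.
text = "first line\nsecond line"

# Without the flag, ^ matches only at the very start of the string.
plain = re.findall(r"^\w+", text)

# With re.MULTILINE, ^ also matches right after each newline.
multi = re.findall(r"^\w+", text, re.MULTILINE)

print(plain)  # ['first']
print(multi)  # ['first', 'second']
```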

  12. Patterns

  13. pattern = r"letter"
      string = "i wrote some letters from letters"
      # ['letter', 'letter']
  14. pattern = r"l.tt.r"
      string = '''i wrote some letters from letters from the latter'''
      # ['letter', 'letter', 'latter']
  15. \d digit
      pattern = r"\d\d\d.\d\d\d\d\d\d\d"
      string = '''Personal: +230 5764321,
      Office: +230 6712345'''
      # ['230 5764321', '230 6712345']
  16. \s - whitespace
      \D - non digit
      Try:
        r"\+\d\d\d\s\d\d\d\d\d\d\d"
        r".\d\d\d\s\d\d\d\d\d\d\d"
      pattern = r"\d\d\d\D\d\d\d\D\d\d\d\d"
      string = '''Personal: +230 576-4321, Office: +230 671 2345'''
      r'\d'
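The two "try" patterns can be checked quickly. This sketch assumes they are meant for the unhyphenated phone numbers from the \d slide, which is a guess; the variable names are illustrative.

```python
import re

# Assumed input: the unhyphenated numbers from the \d slide.
string = '''Personal: +230 5764321,
Office: +230 6712345'''

# \+ matches a literal plus sign; an unescaped + would be a quantifier.
escaped_plus = re.findall(r"\+\d\d\d\s\d\d\d\d\d\d\d", string)

# . matches any character except a newline, so it also accepts the '+'.
any_char = re.findall(r".\d\d\d\s\d\d\d\d\d\d\d", string)

# \D is the complement of \d: any character that is not a digit.
separators = re.findall(r"\D", "576-4321")

print(escaped_plus)  # ['+230 5764321', '+230 6712345']
print(any_char)      # ['+230 5764321', '+230 6712345']
print(separators)    # ['-']
```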
  17. pattern = r"\d{3}\D\d{3}\D\d{4}"
      string = '''Personal: +230 576-4321, Office: +230 671 2345'''
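This slide shows no output, so the following sketch runs the counted form and compares it with the long-hand pattern from the previous slide. The findall call and the variable names are additions for illustration.

```python
import re

string = '''Personal: +230 576-4321, Office: +230 671 2345'''

# \d{3} means exactly three digits, so this is a shorter spelling of \d\d\d.
counted = re.findall(r"\d{3}\D\d{3}\D\d{4}", string)
long_hand = re.findall(r"\d\d\d\D\d\d\d\D\d\d\d\d", string)

print(counted)               # ['230 576-4321', '230 671 2345']
print(counted == long_hand)  # True
```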
  18. \w, \W
      pattern = r"\w"
      string = '''abc123_!^%£&£%$'''
      # ['a', 'b', 'c', '1', '2', '3', '_']
  19. ^ start, $ end
      pattern = r"^\d$"
      string = '''234234234234234234'''
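The slide gives no output for this pattern. As a sketch of what the anchors do here: ^\d$ asks for exactly one digit between the start and the end of the string, so the long number does not match. The + variant and the single-digit input are additions for comparison.

```python
import re

string = '234234234234234234'

# Exactly one digit between start and end: the long string is rejected.
single = re.findall(r"^\d$", string)

# Adding + lets the digit repeat, so the whole string matches.
repeated = re.findall(r"^\d+$", string)

# A one-character digit string does satisfy the original pattern.
one_digit = re.findall(r"^\d$", "7")

print(single)     # []
print(repeated)   # ['234234234234234234']
print(one_digit)  # ['7']
```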
  20. [abcd] set of chars
      pattern = r"l[aeiou]ce"
      string = '''lice lace leece lyce lvce'''
      # ['lice', 'lace']
      Try: [a-z] [A-Z] [0-9] [a-zA-Z]
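The ranges suggested for experimentation can be compared on one short input. This is a sketch with a made-up sample string; it also shows why a range written as A-z (lowercase z) is not the same as A-Z, since it spans the ASCII punctuation that sits between the two letter blocks.

```python
import re

# Hypothetical sample: a lowercase letter, an uppercase letter,
# an underscore, a digit and an opening bracket.
sample = "aZ_9["

lower = re.findall(r"[a-z]", sample)
upper = re.findall(r"[A-Z]", sample)
digits = re.findall(r"[0-9]", sample)
letters = re.findall(r"[a-zA-Z]", sample)

# A-z covers code points 65 to 122, which includes [ \ ] ^ _ and `.
too_wide = re.findall(r"[A-z]", sample)

print(lower)     # ['a']
print(upper)     # ['Z']
print(digits)    # ['9']
print(letters)   # ['a', 'Z']
print(too_wide)  # ['a', 'Z', '_', '[']
```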
  21. [^exclude]
      pattern = r"l[^aeiou]ce"
      string = '''lice lace leece lyce lvce'''
      # ['lyce', 'lvce']
  22. Repeat x{1,3}
      pattern = r"lo{2,5} and behold"
      string = '''loo and behold lo and behold looooo and behold looo and behold loooooo and behold'''
      # ['loo and behold', 'looooo and behold', 'looo and behold']
  23. 0 or more
      pattern = r"lo* and behold"
      string = ''' loo and behold lo and behold looooo and behold looo and behold loooooo and behold l and behold'''
      # ['loo and behold', 'lo and behold', 'looooo and behold', 'looo and behold', 'loooooo and behold', 'l and behold']
  24. at least once
      pattern = r"lo+ and behold"
      string = ''' loo and behold lo and behold looooo and behold looo and behold loooooo and behold l and behold'''
      # ['loo and behold', 'lo and behold', 'looooo and behold', 'looo and behold', 'loooooo and behold']
  25. 0 or 1
      pattern = r"lo? and behold"
      string = ''' loo and behold lo and behold looooo and behold looo and behold loooooo and behold l and behold'''
      # ['lo and behold', 'l and behold']
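The four quantifiers from the last slides can be summarised by asking how many o's each one accepts after the l. This is a minimal sketch; the helper accepted_counts is hypothetical and exists only for this comparison.

```python
import re

def accepted_counts(quantified_o, max_o=6):
    """Return the numbers of o's (0..max_o) that 'l' + quantified_o accepts."""
    return [
        n for n in range(max_o + 1)
        if re.fullmatch("l" + quantified_o, "l" + "o" * n)
    ]

zero_or_more = accepted_counts("o*")
one_or_more = accepted_counts("o+")
zero_or_one = accepted_counts("o?")
two_to_five = accepted_counts("o{2,5}")

print(zero_or_more)  # [0, 1, 2, 3, 4, 5, 6]
print(one_or_more)   # [1, 2, 3, 4, 5, 6]
print(zero_or_one)   # [0, 1]
print(two_to_five)   # [2, 3, 4, 5]
```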
  26. Capturing
      pattern = re.compile(r"i bought (cat)?\s?(dog)?\s?(mouse)?")
      string = '''
      i bought cat
      i bought dog
      i bought mouse
      i bought cat dog
      i bought cat mouse
      '''
      # [('cat', '', ''), ('', 'dog', ''), ('', '', 'mouse'), ('cat', 'dog', ''), ('cat', '', 'mouse')]
  27. (?:)
      pattern = re.compile(r"i bought (?:cat)?\s?(?:dog)?\s?(?:mouse)?")
      string = '''
      i bought cat
      i bought dog
      i bought mouse
      i bought cat dog
      i bought cat mouse
      '''
      # ['i bought cat \n', 'i bought dog\n', 'i bought mouse', 'i bought cat dog\n', 'i bought cat mouse']
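The different result shapes come from how findall treats groups: with capturing groups it returns the group contents, and without them it returns the whole match. The sketch below shows this on a single shortened sentence, which is an assumption made to keep the output easy to read.

```python
import re

string = "i bought cat dog"

# Capturing groups: findall returns one tuple of group contents per match.
capturing = re.findall(r"i bought (cat)?\s?(dog)?", string)

# Non-capturing groups: findall returns the full matched text instead.
non_capturing = re.findall(r"i bought (?:cat)?\s?(?:dog)?", string)

# A match object exposes both views regardless of the group type.
m = re.search(r"i bought (cat)?\s?(dog)?", string)

print(capturing)      # [('cat', 'dog')]
print(non_capturing)  # ['i bought cat dog']
print(m.group(0))     # i bought cat dog
print(m.groups())     # ('cat', 'dog')
```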
  28. Or
      pattern = re.compile(r"(?:Mr|Mrs){1}\.\s\w*")
      string = ''' Mrs. sam Dr. sam Mr. Sam Miss. Sam '''
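No output is shown for the alternation slide. The sketch below runs it on the same four names, assuming they are separated by newlines, and checks that the {1} quantifier can be dropped without changing the result.

```python
import re

# Assumed layout: one name per line.
string = "Mrs. sam\nDr. sam\nMr. Sam\nMiss. Sam"

with_count = re.findall(r"(?:Mr|Mrs){1}\.\s\w*", string)

# {1} means "exactly once", which is already the default for a group.
without_count = re.findall(r"(?:Mr|Mrs)\.\s\w*", string)

print(with_count)     # ['Mrs. sam', 'Mr. Sam']
print(without_count)  # ['Mrs. sam', 'Mr. Sam']
```

For "Mrs." the engine first tries the Mr branch, fails on the literal dot, and backtracks to the Mrs branch, which is why both titles are found even though Mr is listed first.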
  29. Capturing and backref
      import re
      pattern = re.compile(r"(a)l\1")
      string = ''' ala alo ali '''
      result = re.search(pattern, string)
      print(result.group())
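A backreference becomes more useful when the group can capture different text. As a sketch, the variant below swaps the literal a for \w and adds "olo" to the input; both changes are assumptions made for illustration.

```python
import re

# \1 must repeat whatever group 1 captured, so only x-l-x shapes match.
pattern = re.compile(r"(\w)l\1")
string = "ala alo ali olo"

# findall returns the content of group 1 because the pattern has one group.
captured = pattern.findall(string)

# finditer gives match objects, from which the whole match can be read.
whole = [m.group() for m in pattern.finditer(string)]

print(captured)  # ['a', 'o']
print(whole)     # ['ala', 'olo']
```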
  30. Lookahead (?=...): match only if the given pattern follows; the lookahead text is not returned in the match
      pattern = re.compile(r"c(?=[aeiou])")
      string = ''' coo ca ci cw '''
      # ['c', 'c', 'c']
      ?! not followed by
      ?<= look behind
      ?<! not behind
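The three other lookaround forms listed on the slide can be tried on the same input. This is a sketch; the specific patterns are illustrative choices rather than the speaker's own examples.

```python
import re

string = "coo ca ci cw"

# (?=...) lookahead: 'c' only when a vowel follows; the vowel is not consumed.
followed = re.findall(r"c(?=[aeiou])", string)

# (?!...) negative lookahead: 'c' plus one word character, only when no vowel follows.
not_followed = re.findall(r"c(?![aeiou])\w", string)

# (?<=...) lookbehind: a vowel only when 'c' is directly before it.
behind = re.findall(r"(?<=c)[aeiou]", string)

# (?<!...) negative lookbehind: a vowel only when 'c' is not directly before it.
not_behind = re.findall(r"(?<!c)[aeiou]", string)

print(followed)      # ['c', 'c', 'c']
print(not_followed)  # ['cw']
print(behind)        # ['o', 'a', 'i']
print(not_behind)    # ['o']
```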
  31. Misc \b
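\b matches the empty position between a word character and a non-word character (or the start or end of the string). A short sketch with a made-up sentence shows how it restricts matches to whole words.

```python
import re

# Hypothetical sample containing 'cat' alone and inside longer words.
string = "cat concat cats category"

# Without boundaries, every occurrence of the letters is found.
anywhere = re.findall(r"cat", string)

# \b on both sides keeps only the standalone word.
whole_word = re.findall(r"\bcat\b", string)

# \b on the left only keeps words that start with 'cat'.
prefix = re.findall(r"\bcat\w*", string)

print(len(anywhere))  # 4
print(whole_word)     # ['cat']
print(prefix)         # ['cat', 'cats', 'category']
```

The raw-string prefix matters here: in an ordinary Python string literal, "\b" is a backspace character rather than a word boundary.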