Slide 1

Slide 1 text

Regex In Depth

Slide 2

Slide 2 text


Slide 3

Slide 3 text

Python Mauritius UserGroup (pymug)
More info: mscc.mu/python-mauritius-usergroup-pymug/

Why                  Where
codes                github.com/pymug
share events         twitter.com/pymugdotcom
ping professionals   linkedin.com/company/pymug
all info             pymug.com
tell friends by like facebook.com/pymug

Slide 4

Slide 4 text

Abdur-Rahmaan Janhangeer
Python
Help people get into OpenSource

Slide 5

Slide 5 text

Regex In Depth

Slide 6

Slide 6 text

History of regex (from notes)

Slide 7

Slide 7 text

import re

pattern = re.compile(r'eee')
result = pattern.match('eeee')
print(result.group())  # eee

Slide 8

Slide 8 text

import re

pattern = re.compile(r'ike')
result = pattern.search('like sike')
print(result.group())  # ike
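The two slides above use match and search with different inputs, so the difference is easy to miss. A minimal sketch on the same 'like sike' string, assuming the compiled pattern from the slide: match only tries at the start of the string, while search scans forward for the first hit.

```python
import re

pattern = re.compile(r'ike')

# match() anchors at position 0; 'like sike' starts with 'l', so no match.
at_start = pattern.match('like sike')
print(at_start)  # None

# search() scans the string and stops at the first occurrence.
first_hit = pattern.search('like sike')
print(first_hit.group(), first_hit.span())  # ike (1, 4)
```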

Slide 9

Slide 9 text

import re

pattern = re.compile(r'ike')
result = pattern.findall('like sike')
print(result)  # ['ike', 'ike']

Slide 10

Slide 10 text

import re

pattern = r'ike'
string = 'like sike'
result = re.findall(pattern, string)
print(result)

Slide 11

Slide 11 text

pattern = re.compile(r"...", re.MULTILINE)
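The slide leaves the pattern elided, so here is a small sketch of what the re.MULTILINE flag changes. The pattern and the two-line text are made up for illustration: without the flag ^ matches only at the start of the string, with it ^ also matches after each newline.

```python
import re

# Hypothetical two-line input, not from the slides.
text = "first line\nsecond line"

# Default: ^ matches only at the very start of the string.
default = re.findall(r"^\w+", text)
print(default)  # ['first']

# MULTILINE: ^ also matches right after every newline.
multi = re.compile(r"^\w+", re.MULTILINE).findall(text)
print(multi)  # ['first', 'second']
```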

Slide 12

Slide 12 text

Patterns

Slide 13

Slide 13 text

pattern = r"letter"
string = "i wrote some letters from letters"
# findall result: ['letter', 'letter']

Slide 14

Slide 14 text

. any character (except newline)

pattern = r"l.tt.r"
string = '''i wrote some letters from letters from the latter'''
# findall result: ['letter', 'letter', 'latter']

Slide 15

Slide 15 text

\d digit

pattern = r"\d\d\d.\d\d\d\d\d\d\d"
string = '''Personal: +230 5764321, Office: +230 6712345'''
# findall result: ['230 5764321', '230 6712345']

Slide 16

Slide 16 text

\s - whitespace
\D - non digit

try:
r"\+\d\d\d\s\d\d\d\d\d\d\d"
r".\d\d\d\s\d\d\d\d\d\d\d"
r'\d'

pattern = r"\d\d\d\D\d\d\d\D\d\d\d\d"
string = '''Personal: +230 576-4321, Office: +230 671 2345'''
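The two longer "try" patterns expect seven digits in a row after the space, so they fit the unhyphenated numbers from the \d slide rather than the string on this one. A quick sketch running them on that earlier string (which input the speaker intended is an assumption):

```python
import re

# String from the \d slide: seven digits with no separator.
string = "Personal: +230 5764321, Office: +230 6712345"

# \+ matches a literal plus sign; \s matches the space.
with_plus = re.findall(r"\+\d\d\d\s\d\d\d\d\d\d\d", string)
print(with_plus)  # ['+230 5764321', '+230 6712345']

# . accepts any character in front of the three digits, here the '+'.
any_prefix = re.findall(r".\d\d\d\s\d\d\d\d\d\d\d", string)
print(any_prefix)  # ['+230 5764321', '+230 6712345']
```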

Slide 17

Slide 17 text

{n} repeat exactly n times

pattern = r"\d{3}\D\d{3}\D\d{4}"
string = '''Personal: +230 576-4321, Office: +230 671 2345'''
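The slide shows no output, so here is a short check that the {n} form is just shorthand for repeating \d by hand. The variable names are only for illustration.

```python
import re

string = "Personal: +230 576-4321, Office: +230 671 2345"

# Written out digit by digit, as on the previous slide.
long_form = re.findall(r"\d\d\d\D\d\d\d\D\d\d\d\d", string)

# Same pattern with counted repetition.
short_form = re.findall(r"\d{3}\D\d{3}\D\d{4}", string)

print(short_form)               # ['230 576-4321', '230 671 2345']
print(long_form == short_form)  # True
```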

Slide 18

Slide 18 text

\w, \W

pattern = r"\w"
string = '''abc123_!^%£&£%$'''
# findall result: ['a', 'b', 'c', '1', '2', '3', '_']
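The slide names \W but only runs \w. A minimal sketch of the complement on the same string: \W picks up everything \w skipped.

```python
import re

string = "abc123_!^%£&£%$"

# \w: letters, digits and underscore.
word_chars = re.findall(r"\w", string)
print(word_chars)  # ['a', 'b', 'c', '1', '2', '3', '_']

# \W: every character that \w does not match.
non_word = re.findall(r"\W", string)
print(non_word)  # ['!', '^', '%', '£', '&', '£', '%', '$']
```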

Slide 19

Slide 19 text

^ start, $ end

pattern = r"^\d$"
string = '''234234234234234234'''
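No output is shown for this slide. Running it gives an empty list, because ^\d$ asks for a string that is exactly one digit from start to end. A small sketch; the ^\d+$ variant and the "7" input are additions for contrast, not from the slides.

```python
import re

string = "234234234234234234"

# Exactly one digit between start and end: an 18-digit string fails.
single = re.findall(r"^\d$", string)
print(single)  # []

# One or more digits between start and end: the whole string matches.
whole = re.findall(r"^\d+$", string)
print(whole)  # ['234234234234234234']

# The original pattern does match a one-character string.
one_digit = re.findall(r"^\d$", "7")
print(one_digit)  # ['7']
```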

Slide 20

Slide 20 text

[abcd] set of chars

pattern = r"l[aeiou]ce"
string = '''lice lace leece lyce lvce'''
# findall result: ['lice', 'lace']

try: [a-z] [A-Z] [0-9] [a-zA-Z]
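A quick run of the "try" ranges on a made-up string ("Py3_ok" is not from the slides). The last line shows a common slip: writing A-z instead of A-Z also lets through the ASCII punctuation that sits between 'Z' and 'a', including the underscore.

```python
import re

# Hypothetical input chosen to contain one of each kind of character.
string = "Py3_ok"

lower = re.findall(r"[a-z]", string)
upper = re.findall(r"[A-Z]", string)
digits = re.findall(r"[0-9]", string)
letters = re.findall(r"[a-zA-Z]", string)
print(lower, upper, digits)  # ['y', 'o', 'k'] ['P'] ['3']
print(letters)               # ['P', 'y', 'o', 'k']

# A-z spans ASCII 65..122, which includes [ \ ] ^ _ and the backtick.
sloppy = re.findall(r"[a-zA-z]", string)
print(sloppy)  # ['P', 'y', '_', 'o', 'k']
```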

Slide 21

Slide 21 text

[^exclude]

pattern = r"l[^aeiou]ce"
string = '''lice lace leece lyce lvce'''
# findall result: ['lyce', 'lvce']

Slide 22

Slide 22 text

Repeat x{1,3}

pattern = r"lo{2,5} and behold"
string = '''loo and behold
lo and behold
looooo and behold
looo and behold
loooooo and behold'''
# findall result: ['loo and behold', 'looooo and behold', 'looo and behold']
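One detail worth checking when typing the braces: in Python's re module a space inside them stops the braces from being read as a repeat count, and the text is matched literally instead. A minimal sketch with a shortened, made-up input; the behaviour shown is what CPython's re does as far as I know, so treat it as an observation rather than a guarantee.

```python
import re

# Shortened hypothetical input: 2, 1 and 5 o's.
string = "loo lo looooo"

# No space: between 2 and 5 o's.
bounded = re.findall(r"lo{2,5}", string)
print(bounded)  # ['loo', 'looooo']

# With a space the braces are not a quantifier, so nothing matches here...
spaced = re.findall(r"lo{2, 5}", string)
print(spaced)  # []

# ...but the same pattern matches the brace text itself.
literal = re.findall(r"lo{2, 5}", "lo{2, 5}")
print(literal)  # ['lo{2, 5}']
```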

Slide 23

Slide 23 text

* 0 or more

pattern = r"lo* and behold"
string = '''
loo and behold
lo and behold
looooo and behold
looo and behold
loooooo and behold
l and behold'''
# findall result: ['loo and behold', 'lo and behold', 'looooo and behold',
#                  'looo and behold', 'loooooo and behold', 'l and behold']

Slide 24

Slide 24 text

+ at least once

pattern = r"lo+ and behold"
string = '''
loo and behold
lo and behold
looooo and behold
looo and behold
loooooo and behold
l and behold'''
# findall result: ['loo and behold', 'lo and behold', 'looooo and behold',
#                  'looo and behold', 'loooooo and behold']

Slide 25

Slide 25 text

? 0 or 1

pattern = r"lo? and behold"
string = '''
loo and behold
lo and behold
looooo and behold
looo and behold
loooooo and behold
l and behold'''
# findall result: ['lo and behold', 'l and behold']

Slide 26

Slide 26 text

Capturing

pattern = re.compile(r"i bought (cat)?\s?(dog)?\s?(mouse)?")
string = '''
i bought cat
i bought dog
i bought mouse
i bought cat dog
i bought cat mouse
'''
# findall result: [('cat', '', ''), ('', 'dog', ''), ('', '', 'mouse'),
#                  ('cat', 'dog', ''), ('cat', '', 'mouse')]

Slide 27

Slide 27 text

Non-capturing (?:)

pattern = re.compile(r"i bought (?:cat)?\s?(?:dog)?\s?(?:mouse)?")
# the first line of the string ends with a space after 'cat'
string = '''
i bought cat 
i bought dog
i bought mouse
i bought cat dog
i bought cat mouse
'''
# findall result: ['i bought cat \n', 'i bought dog\n', 'i bought mouse',
#                  'i bought cat dog\n', 'i bought cat mouse']

Slide 28

Slide 28 text

Or

pattern = re.compile(r"(?:Mr|Mrs){1}\.\s\w*")
string = '''
Mrs. sam
Dr. sam
Mr. Sam
Miss. Sam
'''
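The slide stops before the result. A small sketch of what findall returns for it, assuming one name per line in the string. The second pattern drops {1}, which repeats the group exactly once and so changes nothing.

```python
import re

string = "\nMrs. sam\nDr. sam\nMr. Sam\nMiss. Sam\n"

# Pattern as on the slide: Mr or Mrs, a literal dot, whitespace, then a word.
pattern = re.compile(r"(?:Mr|Mrs){1}\.\s\w*")
titles = pattern.findall(string)
print(titles)  # ['Mrs. sam', 'Mr. Sam']

# Same pattern without the redundant {1}.
simpler = re.findall(r"(?:Mr|Mrs)\.\s\w*", string)
print(simpler == titles)  # True
```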

Slide 29

Slide 29 text

Capturing and backref

import re

pattern = re.compile(r"(a)l\1")
string = '''
ala
alo
ali
'''
result = re.search(pattern, string)
print(result.group())
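For reference, a sketch of what the backreference does here: \1 must repeat whatever group 1 captured, so only 'ala' qualifies. The (\w) variant and its "ala olo ali" input are made up to show that the backreference follows the captured text rather than a fixed letter.

```python
import re

string = "\nala\nalo\nali\n"

# (a) captures 'a'; \1 then requires the same text again after 'l'.
result = re.compile(r"(a)l\1").search(string)
whole = result.group()      # the full match
captured = result.group(1)  # what group 1 holds
print(whole, captured)  # ala a

# Hypothetical generalisation: any word character, 'l', then that same character.
general = re.compile(r"(\w)l\1")
mirrored = [m.group() for m in general.finditer("ala olo ali")]
print(mirrored)  # ['ala', 'olo']
```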

Slide 30

Slide 30 text

Look ahead (?=...): match only if followed by the given pattern; the looked-ahead text is not part of the returned match

pattern = re.compile(r"c(?=[aeiou])")
string = '''
coo
ca
ci
cw
'''
# findall result: ['c', 'c', 'c']

?! not followed by
?<= look behind
?
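The slide lists ?! and ?<= without examples. A minimal sketch of all three on the same words; the second and third patterns are illustrations, not taken from the talk.

```python
import re

string = "coo ca ci cw"

# Positive lookahead: 'c' only when a vowel follows; the vowel is not consumed.
followed = re.findall(r"c(?=[aeiou])", string)
print(followed)  # ['c', 'c', 'c']

# Negative lookahead: 'c' NOT followed by a vowel, then one word character.
not_followed = re.findall(r"c(?![aeiou])\w", string)
print(not_followed)  # ['cw']

# Lookbehind: a word character that comes right after 'c'.
after_c = re.findall(r"(?<=c)\w", string)
print(after_c)  # ['o', 'a', 'i', 'w']
```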

Slide 31

Slide 31 text

Misc: \b (word boundary)
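A short sketch of \b, which matches the empty position between a word character and a non-word character. The input string is made up. Note the raw string: in a normal Python string "\b" is a backspace character.

```python
import re

# Hypothetical input with 'cat' standing alone and inside other words.
string = "cat concat cats category"

# Plain substring search finds 'cat' everywhere.
anywhere = re.findall(r"cat", string)
print(len(anywhere))  # 4

# Boundaries on both sides: only the standalone word.
whole_word = re.findall(r"\bcat\b", string)
print(whole_word)  # ['cat']

# Boundary on the left only: words that start with 'cat'.
starts_with = re.findall(r"\bcat\w*", string)
print(starts_with)  # ['cat', 'cats', 'category']
```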