游騰林 TENG-LIN YU | Mail: [email protected]
NCCU - Data Visualization Workshop
Character Classes
• \b
Meaning: Matches the empty string, but only at the beginning or end of a word.
• \s
Meaning: Matches Unicode whitespace characters.
• \w
Meaning: Matches Unicode word characters.
• \d
Meaning: Matches any decimal digit.
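The four classes above can be tried out with a short sketch like the following. The sample strings and variable names are illustrative assumptions, not taken from the workshop material.

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "Room 42 opens at 9am, not 10:30."

# \d matches one decimal digit; \d+ matches a run of consecutive digits.
digits = re.findall(r"\d+", text)    # ['42', '9', '10', '30']

# \w matches word characters (letters, digits, underscore).
words = re.findall(r"\w+", text)     # ['Room', '42', 'opens', 'at', '9am', 'not', '10', '30']

# \s matches whitespace, so splitting on \s+ cuts the text at runs of spaces.
tokens = re.split(r"\s+", text)      # ['Room', '42', 'opens', 'at', '9am,', 'not', '10:30.']

# \b matches the empty string at a word boundary, so \bcat\b only
# matches "cat" as a whole word, not inside "concatenate" or "cats".
sample = "cat concatenate cats"
whole_word = re.findall(r"\bcat\b", sample)   # ['cat']
anywhere = re.findall(r"cat", sample)         # ['cat', 'cat', 'cat']
```

Raw strings (the `r"..."` prefix) are used so that Python does not interpret the backslashes before the regular expression engine sees them.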
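Basic concepts – Symbols
• ^:
Matches the beginning of the string (or of each line in multiline mode).
• ?:
Matches zero or one occurrence of the preceding item.
• |:
Logical OR: matches either the pattern on its left or the pattern on its right.
• $:
Matches the end of the string (or of each line in multiline mode).
• +:
Matches one or more occurrences of the preceding item.
• \:
Escapes the following character so that it loses its special meaning, or introduces a special sequence such as \d.
• []:
Matches any single character from the set specified inside the brackets.
• *:
Matches zero or more occurrences of the preceding item.
• !:
Logical NOT. In Python's re module, ! is not an operator on its own: it appears in the negative lookahead (?!...), and a character set is negated with [^...].
• ():
Groups a subpattern. The text matched by the group is captured and can be retrieved later.
• .:
Matches any single character except a newline.
• {}:
Repeats the preceding item a specified number of times, for example {m} or {m,n}.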
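As a rough illustration of these symbols in Python's `re` module, the sketch below runs one small pattern per symbol. All sample strings and variable names are assumptions made up for this example.

```python
import re

# ^ and $ anchor the pattern to the start and end of the string.
starts = bool(re.search(r"^data", "data workshop"))    # True
ends = bool(re.search(r"shop$", "data workshop"))      # True

# ?, + and * quantify the preceding item.
zero_or_one = re.findall(r"colou?r", "color colour")   # ['color', 'colour']
one_or_more = re.findall(r"go+d", "gd god good")       # ['god', 'good']
zero_or_more = re.findall(r"go*d", "gd god good")      # ['gd', 'god', 'good']

# {m,n} repeats the preceding item between m and n times.
two_or_three = re.findall(r"\b\d{2,3}\b", "7 42 365 2024")   # ['42', '365']

# | is alternation, [] is a character set, [^...] negates the set.
either = re.findall(r"cat|dog", "cat, dog, bird")      # ['cat', 'dog']
vowels = re.findall(r"[aeiou]", "regex")               # ['e', 'e']
non_digits = re.findall(r"[^0-9]+", "a1b22c")          # ['a', 'b', 'c']

# . matches any character except a newline; \. matches a literal dot.
any_char = re.findall(r"a.c", "abc a-c a\nc")          # ['abc', 'a-c']
literal_dot = re.findall(r"\.", "v3.11.2")             # ['.', '.']

# () captures the text matched by each group.
match = re.search(r"(\d{4})-(\d{2})", "Date: 2023-03")
year, month = match.groups()                           # ('2023', '03')

# (?!...) is a negative lookahead: "foo" not followed by "bar".
not_before_bar = re.findall(r"foo(?!bar)", "foobar foobaz")  # ['foo']
```

Note that the quantifiers apply to the preceding item, which can be a single character, a set in `[]`, or a group in `()`.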
Regular Expressions in Python
• re.findall:
Return all non-overlapping matches of pattern in string, as a list of strings or tuples.
• re.search:
Scan through string looking for the first location where the regular expression pattern produces a match, and return a corresponding match object.
• re.split:
Split string by the occurrences of pattern.
• re.sub:
Return the string obtained by replacing the leftmost non-overlapping occurrences of pattern in string by the replacement repl.
Ref: re — Regular expression operations — Python 3.11.2 documentation
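A minimal sketch of the four functions described on this slide, applied to one string. The `log` text and the variable names are hypothetical and serve only as an illustration.

```python
import re

# Hypothetical input, made up for this example.
log = "2023-03-01 error; 2023-03-02 ok; 2023-03-05 error"
date = r"\d{4}-\d{2}-\d{2}"

# re.findall: every non-overlapping match, as a list of strings ...
dates = re.findall(date, log)
# ['2023-03-01', '2023-03-02', '2023-03-05']

# ... or as a list of tuples when the pattern has more than one group.
pairs = re.findall(r"(\d{4}-\d{2}-\d{2}) (\w+)", log)
# [('2023-03-01', 'error'), ('2023-03-02', 'ok'), ('2023-03-05', 'error')]

# re.search: a match object for the first match only, or None if none is found.
first = re.search(r"(\d{4})-(\d{2})-(\d{2})", log)
first_date = first.group(0)      # '2023-03-01'
first_parts = first.groups()     # ('2023', '03', '01')

# re.split: cut the string wherever the pattern matches.
entries = re.split(r";\s*", log)
# ['2023-03-01 error', '2023-03-02 ok', '2023-03-05 error']

# re.sub: replace matches; count limits the number of replacements.
masked = re.sub(date, "<DATE>", log)
# '<DATE> error; <DATE> ok; <DATE> error'
masked_once = re.sub(date, "<DATE>", log, count=1)
# '<DATE> error; 2023-03-02 ok; 2023-03-05 error'
```

Because `re.search` returns `None` when nothing matches, real code would normally check the result before calling `.group()`.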
Regular Expressions in Practice