Slide 1

Slide 1 text

DEALING WITH /REGEXION/ self.conference Mike Schutte June 8, 2019 1 — 82 @tmikeschu

Slide 2

Slide 2 text

> @tmikeschu

Slide 3

Slide 3 text


Slide 4

Slide 4 text

https://www.epi.org/publication/the-color-of-law-a-forgotten-history-of-how-our-government-segregated-america/

Slide 5

Slide 5 text

VICTORY CONDITIONS
> Feel at peace with the craziness of regex
> Have a few strategies for deciding when to use regex
> Be like "wow capture groups are amazing"

Slide 6

Slide 6 text

ROADMAP
> My soapbox: why regex should be messy
> Brief overview of regular expressions
> When to use regex
> Capture groups

Slide 7

Slide 7 text

Disclaimer

Slide 8

Slide 8 text

me using regex

Slide 9

Slide 9 text

HOW TO DEAL WITH /REGEXION/:

Slide 10

Slide 10 text

STRINGS AND LANGUAGE
> Typos
> Duplication
> Patterns
> Formats
> Repetition

Slide 11

Slide 11 text

ACCEPT IT
> Regex is messy
> because string data is messy
> because language is messy

Slide 12

Slide 12 text

REGULAR EXPRESSIONS: AN OVERVIEW
> Born in 1951
> Popularized in 1968
> Text editor search
> Lexical analysis
> POSIX, Perl, PCRE
> Finite state machine

Slide 13

Slide 13 text

https://www.youtube.com/watch?v=hprXxJHQVfQ

Slide 14

Slide 14 text

CODE THAT WANTS TO BE REGEXED

Slide 15

Slide 15 text

IF YOU ASK MORE THAN ONE QUESTION ABOUT A STRING...

Slide 16

Slide 16 text

DITCH && AND || FOR //

Slide 17

Slide 17 text

- someString.startsWith(":") &&
-   someString.split("").some(char => Boolean(Number(char)))
+ /^:.*\d/.test(someString)

- someString.includes("someWord") || someString.includes("someOtherWord");
+ /someWord|someOtherWord/.test(someString)
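A minimal sketch for checking that the regex versions ask the same questions as the chained string methods. The sample inputs and the helper names (`withMethods`, `withRegex`, `hasWordMethods`, `hasWordRegex`) are made up for illustration and are not from the talk. One difference shows up: `Boolean(Number("0"))` is `false`, so the method chain misses a lone `0`, while `\d` matches it.

```javascript
// Hypothetical helpers wrapping the two styles from the slide.
const withMethods = (s) =>
  s.startsWith(":") && s.split("").some((char) => Boolean(Number(char)));
const withRegex = (s) => /^:.*\d/.test(s);

const hasWordMethods = (s) =>
  s.includes("someWord") || s.includes("someOtherWord");
const hasWordRegex = (s) => /someWord|someOtherWord/.test(s);

// Hypothetical sample inputs.
for (const s of [":abc1", ":abc", "abc1", ":0"]) {
  console.log(JSON.stringify(s), withMethods(s), withRegex(s));
}
// ":0" is the one case where the two disagree: "0" is a digit,
// but Number("0") is 0, which Boolean() turns into false.
```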

Slide 18

Slide 18 text

(Array of characters).context < (string).context

Slide 19

Slide 19 text

Regex forces you to consider the string as (more of) a whole

Slide 20

Slide 20 text

METHODS TO USE

Slide 21

Slide 21 text

> Change format
  > String.prototype.replace (=> String)
> Get substring(s)
  > String.prototype.match (=> Array, or null when nothing matches)
> Assert string qualities
  > RegExp.prototype.test (=> Boolean)
> Stateful search
  > RegExp.prototype.exec (=> Array, or null when nothing matches)
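The four methods above can be sketched side by side. The date string and variable names here are assumptions chosen for illustration, not examples from the talk.

```javascript
// Hypothetical input: two ISO-style dates in one string.
const text = "2019-06-08 and 2019-06-09";
const reDate = /(\d{4})-(\d{2})-(\d{2})/;

// Change format: replace returns a new string.
// Without the g flag, only the first match is rewritten.
const us = text.replace(reDate, "$2/$3/$1");

// Get substrings: match with the g flag returns every full match.
const all = text.match(/\d{4}-\d{2}-\d{2}/g);

// Assert string qualities: test returns a boolean.
const hasDate = reDate.test(text);

// Stateful search: a g-flagged regex remembers lastIndex between
// exec calls, so a loop walks through the matches one at a time.
const reGlobal = /(\d{4})-(\d{2})-(\d{2})/g;
const days = [];
let m;
while ((m = reGlobal.exec(text)) !== null) {
  days.push(m[3]); // third capture group: the day
}

console.log(us, all, hasDate, days);
```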

Slide 22

Slide 22 text

You know what those parentheses in regular expressions are, right?
/(\d+)/;

Slide 23

Slide 23 text

CAPTURE GROUPS: KEEP IT TOGETHER /()/
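As a small illustration of what the parentheses do: each pair captures the text it matched, and `String.prototype.match` exposes those captures by position. The input string is an assumption made up for this sketch.

```javascript
// Hypothetical input with two numbers we want to pull out.
const match = "Room 42, floor 7".match(/Room (\d+), floor (\d+)/);

// Index 0 is the full match; 1 and 2 are the capture groups,
// numbered by the order of their opening parentheses.
const [full, room, floor] = match;

console.log(full);  // "Room 42, floor 7"
console.log(room);  // "42"
console.log(floor); // "7"
```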

Slide 24

Slide 24 text

> Is familiarity worth rigidity?
> Is difficulty worth flexibility?

Slide 25

Slide 25 text

Is difficulty worth flexibility?

Slide 26

Slide 26 text

TASK
CREATE A FUNCTION THAT
> Takes in a name in First Last format
> And returns the name in Last, First format

Slide 27

Slide 27 text

const albus = "Albus Dumbledore";

function lastFirst(name) {
  // TODO
}

console.log(lastFirst(albus));
// => "Dumbledore, Albus"

Slide 28

Slide 28 text

APPROACH #1: SPLIT
function lastFirst(name) {
  return name
    .split(" ")
    .reverse()
    .join(", ");
}

console.log(lastFirst(albus));
// => "Dumbledore, Albus"

Slide 29

Slide 29 text

APPROACH #2: REGEX
function lastFirst(name) {
  const reFirstLast = /(\w+)\s(\w+)/;
  return name.replace(reFirstLast, "$2, $1");
}

console.log(lastFirst(albus));
// => "Dumbledore, Albus"

Slide 30

Slide 30 text

someString.replace(/(cats)(dogs)/, (full, group1, group2) => {
  // do stuff with the groups
});

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace#Specifying_a_function_as_a_parameter
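A short sketch of the function form of `replace`, applied to the name task. The function name `lastFirstUpper` and the upper-casing step are assumptions added for illustration; the point is that the callback receives the full match followed by each capture group, and its return value becomes the replacement.

```javascript
// Hypothetical variant: same pattern as the talk's first regex,
// but the replacement is computed by a callback.
function lastFirstUpper(name) {
  return name.replace(
    /(\w+)\s(\w+)/,
    // full = whole match, first = group 1, last = group 2
    (full, first, last) => `${last.toUpperCase()}, ${first}`
  );
}

console.log(lastFirstUpper("Albus Dumbledore"));
// => "DUMBLEDORE, Albus"
```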

Slide 31

Slide 31 text

⚠ CHANGE ALERT

Slide 32

Slide 32 text

> ...middle names too.

const albus = "Albus Percival Dumbledore";
lastFirst(albus);
// => "Dumbledore, Albus Percival"

Slide 33

Slide 33 text

APPROACH #1: SPLIT
function lastFirst(name) {
  return name
    .split(" ")
    .reverse()
    .join(", ");
}

console.log(lastFirst(albus));
// => "Dumbledore, Percival, Albus"

Slide 34

Slide 34 text


Slide 35

Slide 35 text

function lastFirst(rawName) {
  const names = rawName.split(" ");
  const maxIndex = names.length - 1;
  const last = names[maxIndex];
  const rest = names.slice(0, maxIndex);
  return `${last}, ${rest.join(" ")}`;
}

console.log(lastFirst(albus));
// => "Dumbledore, Albus Percival"

Slide 36

Slide 36 text

APPROACH #2: REGEX
function lastFirst(name) {
  const reFirstLast = /(\w+)\s(\w+)/;
  return name.replace(reFirstLast, "$2, $1");
}

console.log(lastFirst(albus));
// => "Percival, Albus Dumbledore"

Slide 37

Slide 37 text


Slide 38

Slide 38 text

function lastFirst(name) {
  const reFirstLast = /(\w+\s*\w*)\s(\w+)/;
  return name.replace(reFirstLast, "$2, $1");
}

console.log(lastFirst(albus));
// => "Dumbledore, Albus Percival"

Slide 39

Slide 39 text

- /(\w+)\s(\w+)/;
+ /(\w+\s*\w*)\s(\w+)/;

Slide 40

Slide 40 text

COMPARISON: SPLIT
> Calculating indices
> Accommodating for zero-based counting
> Array.prototype methods
> String interpolation

Slide 41

Slide 41 text

COMPARISON: REGEX
> Patterns
  > There is a first bit
  > and a last bit
  > and sometimes extra middle bits in the first bit

Slide 42

Slide 42 text

THAT IS MESSY

Slide 43

Slide 43 text

THAT IS OKAY

Slide 44

Slide 44 text

THAT IS AWESOME

Slide 45

Slide 45 text

⚠ CHANGE ALERT

Slide 46

Slide 46 text

> ...middle names too
> ...multiple middle names

const albus = "Albus Percival Wulfric Brian Dumbledore";
lastFirst(albus);
// => "Dumbledore, Albus Percival Wulfric Brian"

Slide 47

Slide 47 text

APPROACH #1: SPLIT
function lastFirst(rawName) {
  const names = rawName.split(" ");
  const maxIndex = names.length - 1;
  const last = names[maxIndex];
  const rest = names.slice(0, maxIndex);
  return `${last}, ${rest.join(" ")}`;
}

console.log(lastFirst(albus));
// => "Dumbledore, Albus Percival Wulfric Brian"

Slide 48

Slide 48 text


Slide 49

Slide 49 text

APPROACH #2: REGEX
function lastFirst(name) {
  const reFirstLast = /(\w+\s*\w*)\s(\w+)/;
  return name.replace(reFirstLast, "$2, $1");
}

console.log(lastFirst(albus));
// => "Wulfric, Albus Percival Brian Dumbledore"

Slide 50

Slide 50 text


Slide 51

Slide 51 text

- const reFirstLast = /(\w+\s*\w*)\s(\w+)/;
+ const reFirstLast = /(\w+(\s\w+)*)\s(\w+)/;
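A sketch of what the new nested group does to the capture numbering, which is why the replacement string has to move from `$2` to `$3`. The non-capturing variant at the end is an alternative offered here as an assumption, not something shown in the talk.

```javascript
const name = "Albus Percival Wulfric Brian Dumbledore";

// Groups are numbered by their opening parenthesis, so the inner
// (\s\w+) becomes group 2 and the last name moves to group 3.
const nested = name.match(/(\w+(\s\w+)*)\s(\w+)/);
console.log(nested[1]); // "Albus Percival Wulfric Brian"
console.log(nested[2]); // " Brian" (a repeated group keeps its last iteration)
console.log(nested[3]); // "Dumbledore"

// Alternative (not from the talk): a non-capturing group (?:...)
// repeats the same way without taking a number, so "$2, $1" still works.
const flat = name.replace(/(\w+(?:\s\w+)*)\s(\w+)/, "$2, $1");
console.log(flat); // "Dumbledore, Albus Percival Wulfric Brian"
```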

Slide 52

Slide 52 text

function lastFirst(name) {
  const reFirstLast = /(\w+(\s\w+)*)\s(\w+)/;
  return name.replace(reFirstLast, "$3, $1");
}

console.log(lastFirst(albus));
// => "Dumbledore, Albus Percival Wulfric Brian"

Slide 53

Slide 53 text

⚠ CHANGE ALERT

Slide 54

Slide 54 text

> ...middle names too
> ...multiple middle names
> ...suffixes too

const albus = "Albus Percival Wulfric Brian Dumbledore, Jr.";
lastFirst(albus);
// => "Dumbledore, Albus Percival Wulfric Brian, Jr."

Slide 55

Slide 55 text


Slide 56

Slide 56 text

APPROACH #1: SPLIT
function lastFirst(rawName) {
  const names = rawName.split(" ");
  const maxIndex = names.length - 1;
  const last = names[maxIndex];
  const rest = names.slice(0, maxIndex);
  return `${last}, ${rest.join(" ")}`;
}

console.log(lastFirst(albus));
// => "Jr., Albus Percival Wulfric Brian Dumbledore,"

Slide 57

Slide 57 text


Slide 58

Slide 58 text

function lastFirst(rawName) {
  const [name, suffix] = rawName.split(", ");
  const names = name.split(" ");
  const maxIndex = names.length - 1;
  const last = names[maxIndex];
  const rest = names.slice(0, maxIndex);
  const output = `${last}, ${rest.join(" ")}`;
  if (suffix) {
    return `${output}, ${suffix}`;
  }
  return output;
}

console.log(lastFirst(albus));
// => "Dumbledore, Albus Percival Wulfric Brian, Jr."

Slide 59

Slide 59 text

APPROACH #2: REGEX
function lastFirst(name) {
  const reFirstLast = /(\w+(\s\w+)*)\s(\w+)/;
  return name.replace(reFirstLast, "$3, $1");
}

console.log(lastFirst(albus));
// => "Dumbledore, Albus Percival Wulfric Brian, Jr."

Slide 60

Slide 60 text


Slide 61

Slide 61 text


Slide 62

Slide 62 text

⚠ CHANGE ALERT

Slide 63

Slide 63 text

> ...middle names too
> ...multiple middle names
> ...suffixes too
> ...just first name is okay ¯\_(ツ)_/¯

const albus = "Albus";
lastFirst(albus);
// => "Albus"

Slide 64

Slide 64 text

APPROACH #1: SPLIT
function lastFirst(rawName) {
  const [name, suffix] = rawName.split(", ");
  const names = name.split(" ");
  const maxIndex = names.length - 1;
  const last = names[maxIndex];
  const rest = names.slice(0, maxIndex);
  const output = `${last}, ${rest.join(" ")}`;
  if (suffix) {
    return `${output}, ${suffix}`;
  }
  return output;
}

console.log(lastFirst(albus));
// => "Albus, " (trailing comma and space)

Slide 65

Slide 65 text


Slide 66

Slide 66 text

function lastFirst(rawName) {
  const [name, suffix] = rawName.split(", ");
  const names = name.split(" ");
  const maxIndex = names.length - 1;
  const last = names[maxIndex];
  const rest = names.slice(0, maxIndex);
  const output = `${last}, ${rest.join(" ")}`;
  if (suffix) {
    return `${output}, ${suffix}`;
  }
  // With only one name, output is "Albus, " (comma plus space).
  if (output.endsWith(", ")) {
    return output.slice(0, output.length - 2);
  }
  return output;
}

Slide 67

Slide 67 text

  const output = `${last}, ${rest.join(" ")}`;
  if (suffix) {
    return `${output}, ${suffix}`;
  }
+ if (output.endsWith(", ")) {
+   return output.slice(0, output.length - 2);
+ }
  return output;

Slide 68

Slide 68 text

function lastFirst(rawName) {
  const [name, suffix] = rawName.split(", ");
  const names = name.split(" ");
  const maxIndex = names.length - 1;
  const last = names[maxIndex];
  const rest = names.slice(0, maxIndex).join(" ");
  let output = last;
  if (Boolean(rest)) {
    output += `, ${rest}`;
  }
  if (suffix) {
    output += `, ${suffix}`;
  }
  return output;
}

Slide 69

Slide 69 text

- const output = `${last}, ${rest.join(" ")}`;
- if (suffix) {
-   return `${output}, ${suffix}`;
- }
- if (output.endsWith(", ")) {
-   return output.slice(0, output.length - 2);
- }
+ let output = last;
+ if (Boolean(rest)) {
+   output += `, ${rest}`;
+ }
+ if (suffix) {
+   output += `, ${suffix}`;
+ }
  return output;

Slide 70

Slide 70 text

APPROACH #2: REGEX
function lastFirst(name) {
  const reFirstLast = /(\w+(\s\w+)*)\s(\w+)/;
  return name.replace(reFirstLast, "$3, $1");
}

console.log(lastFirst(albus));
// => "Albus"
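To see that this one pattern survives every change alert so far, the same function can be run against each version of the name. The `cases` array is an assumption assembled here from the inputs shown on earlier slides.

```javascript
function lastFirst(name) {
  const reFirstLast = /(\w+(\s\w+)*)\s(\w+)/;
  // When the pattern does not match (a single name), replace
  // returns the input unchanged.
  return name.replace(reFirstLast, "$3, $1");
}

// Inputs collected from the earlier slides, paired with the wanted output.
const cases = [
  ["Albus Dumbledore", "Dumbledore, Albus"],
  ["Albus Percival Dumbledore", "Dumbledore, Albus Percival"],
  [
    "Albus Percival Wulfric Brian Dumbledore",
    "Dumbledore, Albus Percival Wulfric Brian",
  ],
  [
    "Albus Percival Wulfric Brian Dumbledore, Jr.",
    "Dumbledore, Albus Percival Wulfric Brian, Jr.",
  ],
  ["Albus", "Albus"],
];

for (const [input, expected] of cases) {
  console.log(lastFirst(input) === expected, lastFirst(input));
}
```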

Slide 71

Slide 71 text


Slide 72

Slide 72 text


Slide 73

Slide 73 text

const reFirstLast = /(\w+(\s\w+)*)\s(\w+)/;

Slide 74

Slide 74 text

FLEXIBILITY > READABILITY

Slide 75

Slide 75 text

...ABOUT "READABILITY"
> Is German readable?

Slide 76

Slide 76 text

REVIEW
> Regex has a rich history and was designed specifically for analyzing and searching text
> Regex is messy because string data is messy because language is messy
> If you have more than one question about your string...
> Capture groups are great for manipulating substrings

Slide 77

Slide 77 text

Go forth and parse your strings!

Slide 78

Slide 78 text

Embrace the /pain/!

Slide 79

Slide 79 text

Don't fear /regexion/!

Slide 80

Slide 80 text

Respond /gracefully/ to change

Slide 81

Slide 81 text

THANK YOU!

Slide 82

Slide 82 text

RESOURCES
> Repl-ish tool: https://regexr.com/
> Cheat sheet: ://www.rexegg.com/regex-quickstart.html
> Wiki: https://en.wikipedia.org/wiki/Regular_expression
> Named capture groups: https://github.com/tc39/proposal-regexp-named-groups
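For the named capture groups resource, here is a minimal sketch of how the name task might look with them. This assumes a runtime that supports named groups (ES2018 and later); the group names `rest` and `last` and the function name `lastFirstNamed` are made up for illustration.

```javascript
// Hypothetical rewrite of lastFirst using named groups.
function lastFirstNamed(name) {
  // (?<rest>...) and (?<last>...) label the groups; (?:...) repeats
  // the middle names without creating a numbered group of its own.
  const reFirstLast = /(?<rest>\w+(?:\s\w+)*)\s(?<last>\w+)/;
  // In a replacement string, $<name> refers to a named group.
  return name.replace(reFirstLast, "$<last>, $<rest>");
}

console.log(lastFirstNamed("Albus Percival Wulfric Brian Dumbledore, Jr."));
// => "Dumbledore, Albus Percival Wulfric Brian, Jr."
```

Naming the groups means the replacement string no longer depends on counting parentheses, so adding a group later does not shift `$2` to `$3`.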