Bruce Gray
June 24, 2020

 Refactoring_and_Readability_-_CRHS_-_Perl_Conference_20200624.pdf


Transcript

  1. Or

  2. Fun for the whole family
     • Regular Expressions (Regex)
     • Multi-dimensional Data Structures
     • Trees
  6. Fun for the whole family
     • Regular Expressions (Regex)
     • Multi-dimensional Data Structures
     • Trees
     • Q & A + Bonus!!!
  7. Sinkholes in the space/code continuum
     or, how do you decide what to learn, when you cannot judge its value until *after* you learn it?
  8. Forces Surrounding Code
     [diagram: "Code", surrounded by: Easy to write, Standard form, Ease of change, Easy to read, Performance, Boundaries of Responsibility]
  10. “Some people, when confronted with a problem, think ‘I know, I'll use regular expressions.’ Now they have two problems.”
      – Jamie Zawinski
  11.          File Globbing             Regex
      .        Literal dot               Any single character
      ?        Any single character      Quantifier: Once or none
      *        Any string without '/'    Quantifier: Zero or more
      +                                  Quantifier: One or more
      {2,4}    Alternation: '2' or '4'   Quantifier: Two to four times
      (2|4)                              Alternation: '2' or '4'
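The Regex column of that comparison can be checked directly in Perl. The following is a minimal sketch; the test strings and the `check_cases` helper are made up for illustration and are not from the talk.

```perl
use strict;
use warnings;

# Each row: [ regex, string, whether a match is expected ].
my @cases = (
    [ qr/\Aa.c\z/,      'abc',   1 ],  # .      matches any single character
    [ qr/\Aa\.c\z/,     'abc',   0 ],  # \.     matches only a literal dot
    [ qr/\Aab?c\z/,     'ac',    1 ],  # ?      makes the 'b' optional
    [ qr/\Aab*c\z/,     'abbbc', 1 ],  # *      allows zero or more 'b'
    [ qr/\Aab+c\z/,     'ac',    0 ],  # +      requires at least one 'b'
    [ qr/\Aab{2,4}c\z/, 'abbc',  1 ],  # {2,4}  allows two to four 'b'
    [ qr/\Aa(2|4)c\z/,  'a3c',   0 ],  # (2|4)  is alternation: '2' or '4'
);

# Hypothetical helper: report 'ok' for every row that behaves as expected.
sub check_cases {
    my @results;
    for my $case (@cases) {
        my ( $re, $string, $expected ) = @$case;
        my $matched = $string =~ $re ? 1 : 0;
        push @results, $matched == $expected ? 'ok' : 'not ok';
    }
    return @results;
}

print join( ' ', check_cases() ), "\n";
```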
  12. $ find . -not -type d
      ./SS_20200503/Screen Shot 2020-05-01 at 4.45.30 PM.png
      ./SS_20200503/Screen Shot 2020-04-28 at 1.29.14 PM.png
      ./SS_20200503/Screen Shot 2020-04-24 at 8.08.38 PM.png
      ./SS_20200503/Screen Shot 2020-04-28 at 2.35.18 PM.png
      ./SS_20200527/Screen Shot 2020-05-03 at 7.40.14 PM.png
      …
  15. $ find . -not -type d | ack -v \
          'Screen Shot 2020-\d\d-\d\d at \d\.\d\d\.\d\d PM\.png$'
  22. $ find . -not -type d | ack -v \
          'Screen Shot 2020-\d\d-\d\d at \d\.\d\d\.\d\d PM\.png$'
      ./SS_20190828/Screen Shot 2019-08-10 at 9.28.52 PM.png
      ./SS_20190828/Screen Shot 2019-08-15 at 8.59.19 PM.png
      …
  25. $ find . -not -type d | ack -v \
          'Screen Shot 20\d\d-\d\d-\d\d at \d\.\d\d\.\d\d PM\.png$'
  26. $ find . -not -type d | ack -v \
          'Screen Shot 20\d\d-\d\d-\d\d at \d\.\d\d\.\d\d PM\.png$'
      ./SS_20200503/Screen Shot 2020-04-27 at 7.14.42 AM.png
      ./SS_20190828/Screen Shot 2019-08-21 at 8.04.16 AM.png
      …
  29. $ find . -not -type d | ack -v \
          'Screen Shot 20\d\d-\d\d-\d\d at \d\.\d\d\.\d\d [AP]M\.png$'
      ./SS_20200503/Screen Shot 2020-04-27 at 7.14.42 AM.png
      ./SS_20190828/Screen Shot 2019-08-21 at 8.04.16 AM.png
      …
  30. $ find . -not -type d | ack -v \
          'Screen Shot 20\d\d-\d\d-\d\d at \d\.\d\d\.\d\d [AP]M\.png$'
  31. $ find . -not -type d | ack -v \
          'Screen Shot 20\d\d-\d\d-\d\d at \d\.\d\d\.\d\d [AP]M\.png$'
      ./SS_20200503/Screen Shot 2020-04-29 at 11.52.35 PM.png
      ./SS_20190828/Screen Shot 2019-07-31 at 10.14.57 AM.png
  34. $ find . -not -type d | ack -v \
          'Screen Shot 20\d\d-\d\d-\d\d at \d{}\.\d\d\.\d\d [AP]M\.png$'
      ./SS_20200503/Screen Shot 2020-04-29 at 11.52.35 PM.png
      ./SS_20190828/Screen Shot 2019-07-31 at 10.14.57 AM.png
  36. $ find . -not -type d | ack -v \
          'Screen Shot 20\d\d-\d\d-\d\d at \d{1,2}\.\d\d\.\d\d [AP]M\.png$'
      ./SS_20200503/Screen Shot 2020-04-29 at 11.52.35 PM.png
      ./SS_20190828/Screen Shot 2019-07-31 at 10.14.57 AM.png
  37. $ find . -not -type d | ack -v \
          'Screen Shot 20\d\d-\d\d-\d\d at \d{1,2}\.\d\d\.\d\d [AP]M\.png$'
  38. $ find . -not -type d | ack -v \
          'Screen Shot 20\d\d-\d\d-\d\d at \d{1,2}\.\d\d\.\d\d [AP]M\.png$'
      ./SS_20200111/color_wheel_2.numbers
      ./SS_20200111/color_wheel_1.numbers
      ./SS_20200131/RenFlorenceWalk.mp3
      ./SS_20200131/spa-barcelona-city.mp3
      ./SS_20200131/six_flags_20190112.txt
      ./SS_20190716/c_04.png
      ./SS_20190716/c_77_edge.png
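The same filter can be tried inside Perl rather than through `ack -v`. This is a rough sketch only: the pattern is the final one from the slides, while the `non_screenshots` helper and the short sample list (paths taken from the listings above) are assumptions for illustration.

```perl
use strict;
use warnings;

# Final pattern from the slides: 20xx date, one- or two-digit hour, AM or PM.
my $screenshot_re =
    qr/Screen Shot 20\d\d-\d\d-\d\d at \d{1,2}\.\d\d\.\d\d [AP]M\.png$/;

# Hypothetical helper: keep only the paths that do NOT match,
# which is what `ack -v` prints in the shell pipeline.
sub non_screenshots {
    my @paths = @_;
    return grep { $_ !~ $screenshot_re } @paths;
}

my @sample = (
    './SS_20200503/Screen Shot 2020-05-01 at 4.45.30 PM.png',
    './SS_20190828/Screen Shot 2019-08-21 at 8.04.16 AM.png',
    './SS_20200503/Screen Shot 2020-04-29 at 11.52.35 PM.png',
    './SS_20200131/RenFlorenceWalk.mp3',
    './SS_20190716/c_04.png',
);

print "$_\n" for non_screenshots(@sample);
```

With this sample, only the `.mp3` file and `c_04.png` should remain, since every "Screen Shot" name fits the pattern.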
  40. my $end_of_prefix = -1;
      for my $scheme ( qw<https http ftp> ) {
          my $prefix = $scheme . '://';
          my $p_len = length $prefix;
          if ( substr( $url, 0, $p_len ) eq $prefix ) {
              $end_of_prefix = $p_len;
              last;
          }
      }
      if ( $end_of_prefix == -1 ) {
          warn "Unexpected URL format: $url";
          next;
      }
      my $slash_pos = index $url, '/', $end_of_prefix;
      if ( $slash_pos == -1 ) {
          warn "Unexpected URL format: $url";
          next;
      }
      my $dot_pos = rindex $url, '.', $slash_pos;
      if ( $dot_pos == -1 ) {
          warn "Unexpected URL format: $url";
          next;
      }
      my $start = $end_of_prefix;
      if ( my $i = index substr( $url, $start, $dot_pos - $start ), '.' ) {
          $start += $i + 1 if $i != -1;
      }
      say substr $url, $start, $dot_pos - $start;
  41. my $url_re = qr{
          ^
          (https?|ftp) ://
          ([^/]+ \.)?
          ([^/]+) \.
          ([^/\.]+) /
      }msx;
      $url =~ /$url_re/
          or warn "Unexpected URL format: $url" and next;
      say $3;
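To see what the third capture group holds for a few inputs, the pattern can be wrapped in a small function. This is a sketch under stated assumptions: `domain_name` and the example URLs are hypothetical, and only the regex itself comes from the slide.

```perl
use strict;
use warnings;

# Same pattern as the slide: scheme, optional subdomain(s), name, TLD, slash.
my $url_re = qr{
    ^
    (https?|ftp) ://
    ([^/]+ \.)?
    ([^/]+) \.
    ([^/\.]+) /
}msx;

# Hypothetical wrapper: returns the third capture group, or undef on no match.
sub domain_name {
    my ($url) = @_;
    return $url =~ $url_re ? $3 : undef;
}

for my $url ( 'https://www.example.com/index.html',
              'ftp://example.org/',
              'mailto:someone@example.com' ) {
    my $name = domain_name($url);
    print defined $name ? "$name\n" : "Unexpected URL format: $url\n";
}
```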
  42. $ find . -ls
      48707625 0 drwxr-xr-x 38 bruce_pro staff 1216 Jun 11 09:32 ./SS_20200503
      37742599 0 drwxr-xr-x 73 bruce_pro staff 2336 Jul 31 2019 ./SS_20190828 - HPMoR
  43. 48707625 0 drwxr-xr-x 38 bruce_pro staff 1216 Jun 11 09:32 ./SS_20200503
      37742599 0 drwxr-xr-x 73 bruce_pro staff 2336 Jul 31 2019 ./SS_20190828 - HPMoR

      my ( $inode, $blocks, $perms, $links, $owner,
           $group, $size, $mod_time, $path ) = split /\s+/, $line;
  44. 48707625 0 drwxr-xr-x 38 bruce_pro staff 1216 Jun 11 09:32 ./SS_20200503
      37742599 0 drwxr-xr-x 73 bruce_pro staff 2336 Jul 31 2019 ./SS_20190828 - HPMoR
      ^^ ^^

      my ( $inode, $blocks, $perms, $links, $owner,
           $group, $size, $mod_time, $path ) = split /\s+/, $line;
  45. 48707625 0 drwxr-xr-x 38 bruce_pro staff 1216 Jun 11 09:32 ./SS_20200503
      37742599 0 drwxr-xr-x 73 bruce_pro staff 2336 Jul 31 2019 ./SS_20190828 - HPMoR

      my ( $inode, $blocks, $perms, $links, $owner,
           $group, $size, $mod_time, $path ) = split /\s+/, $line, 9;
  47. my @F = split /\s+/, $line, 9;
      shift @F if $F[0] eq '';
      my ( $inode, $blocks, $perms, $links, $owner,
           $group, $size, $mod_time, $path ) = @F;
  48. my @F = split /\s+/, $line, 9;
      if ( $F[0] eq '' ) {
          ( my $empty, @F ) = split /\s+/, $line, 10;
      }
      my ( $inode, $blocks, $perms, $links, $owner,
           $group, $size, $mod_time, $path ) = @F;
  49. $line =~ s{^\s+}{};
      my ( $inode, $blocks, $perms, $links, $owner,
           $group, $size, $mod_time, $path ) = split /\s+/, $line, 9;
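A quick way to see what the leading-whitespace strip and the LIMIT argument do is to run them on one of the sample lines. This is a minimal sketch: `split_find_ls` is a hypothetical helper, and the leading space in the sample line is an assumption, since the transcript has collapsed the original column spacing.

```perl
use strict;
use warnings;

# Hypothetical helper: strip leading padding, then split into at most nine fields.
sub split_find_ls {
    my ($line) = @_;
    $line =~ s{^\s+}{};              # otherwise the first field would be ''
    return split /\s+/, $line, 9;    # LIMIT 9: the ninth field keeps the rest intact
}

# Sample line from the slides; the leading space is assumed.
my $line = ' 37742599 0 drwxr-xr-x 73 bruce_pro staff 2336 Jul 31 2019 ./SS_20190828 - HPMoR';

my @fields = split_find_ls($line);
print scalar(@fields), " fields\n";
print "field 8: $fields[7]\n";
print "field 9: $fields[8]\n";
```

Because the modification date is itself three whitespace-separated tokens, the eighth field here is only `Jul`, and the ninth holds everything from the day number through the end of the path. The limit protects the spaces inside the path, but the date still needs separate handling.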
  51. 48707625 0 drwxr-xr-x 38 bruce_pro staff 1216 Jun 11 09:32 ./SS_20200503
      37742599 0 drwxr-xr-x 73 bruce_pro staff 2336 Jul 31 2019 ./SS_20190828 - HPMoR

      $line =~ s{^\s+}{};
      my ( $inode, $blocks, $perms, $links, $owner,
           $group, $size, $mod_time, $path ) = split /\s+/, $line, 9;
  52. $line =~ s{^\s+}{};
      my ( $inode, $blocks, $perms, $links, $owner,
           $group, $size, $mod_time, $path ) = split /\s+/, $line, 9;
      my ( $year, $hour, $minute );
      if ( substr($mod_time, 2, 1) eq ':' ) {
          ($hour, $minute) = split ':', $mod_time;
      }
      else {
          $year = $mod_time
      }
  54. 48707625 0 drwxr-xr-x 38 bruce_pro staff 1216 Jun 11 09:32 ./SS_20200503
      37742599 0 drwxr-xr-x 73 bruce_pro staff also 2336 Jul 31 2019 ./SS_20190828 - HPMoR

      $line =~ s{^\s+}{};
      my ( $inode, $blocks, $perms, $links, $owner,
           $group, $size, $mod_time, $path ) = split /\s+/, $line, 9;
      my ( $year, $hour, $minute );
      if ( substr($mod_time, 2, 1) eq ':' ) {
          ($hour, $minute) = split ':', $mod_time;
      }
      else {
          $year = $mod_time
      }
  55. 48707625 0 drwxr-xr-x 38 bruce_pro staff 1216 Jun 11 09:32 ./SS_20200503
      37742599 0 drwxr-xr-x 73 bruce_pro staff 2336 Jul 31 2019 ./SS_20190828 - HPMoR

      my $mod_re = qr{
          (?<MOD_TIME_MONTH> Jan|Feb|Mar|Apr|May|Jun|
                             Jul|Aug|Sep|Oct|Nov|Dec)
          [ ]{1,2}
          (?<MOD_TIME_DAY> [1-3]?\d)
          [ ]
          (?:      (?<MOD_TIME_HHMM> [0-2]\d:[0-5]\d)
            | [ ](?<MOD_TIME_YEAR> \d\d\d\d)
          )
      }msx;
  61. my $re = qr{
          \A
          \s* (?<INODE>    \d+ )
          \s+ (?<BLOCKS>   \d+ )
          \s+ (?<PERMS>    [-dlcbsp] [-r] [-w] [-xs] [-r] [-w] [-xsS] [-r] [-w] [-xtT] )
          \s+ (?<LINKS>    \d+ )
          \s+ (?<OWNER>    \w+ )
          \s+ (?<GROUP>    \w+ )
          \s+ (?<SIZE>     \d+ )
          \s+ (?<MOD_TIME> $mod_re )
          \s+ (?<PATH>     .+? )
          \s*
          \Z
      }msx;
  74. grammar find_dash_ls {
          rule TOP {
              \s* <inode> <blocks> <perms> <links> <owner> <group> <size> <modified> <path>
          }
          token inode  { \d+ }
          token blocks { \d+ }
          token perms  {
              <[\-dlcbsp]>
              <[\-r]> <[\-w]> <[\-xs]>
              <[\-r]> <[\-w]> <[\-xsS]>
              <[\-r]> <[\-w]> <[\-xtT]>
          }
          token links  { \d+ }
          token owner  { \w+ }
          token group  { \w+ }
          token size   { \d+ }
          token path   { \S .* }
          constant @days   = 1 .. 31;
          constant @months = <Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec>;
          token month  { @months }
          token day    { @days }
          token hour   { <[0..2]> \d }
          token minute { <[0..5]> \d }
          token year   { \d\d\d\d }
          rule modified {
              <month> <day> [ <year> || <hour>':'<minute> ]
          }
      }
  76. 48707625 0 drwxr-xr-x 38 bruce_pro staff 1216 Jun 11 09:32 ./SS_20200503
      37742599 0 drwxr-xr-x 73 bruce_pro staff also 2336 Jul 31 2019 ./SS_20190828 - HPMoR
  77. 48707625 0 drwxr-xr-x 38 bruce_pro staff 1216 Jun 11 09:32 ./SS_20200503
      37742599 0 drwxr-xr-x 73 bruce_pro staff also 2336 Jul 31 2019 ./SS_20190828 - HPMoR

      $line =~ /$find_ls_re/ or die "Failed to match '$line'";
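Once the match succeeds, the named captures are available through `%+`. The sketch below combines the two patterns from the earlier slides into one runnable piece. Assumptions: `parse_find_ls` is a hypothetical helper, the full pattern is named `$find_ls_re` as in the match line above, and the sample lines use two spaces before the year (the year branch of the date pattern requires that, but the transcript has collapsed the original spacing).

```perl
use strict;
use warnings;

my $mod_re = qr{
    (?<MOD_TIME_MONTH> Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec )
    [ ]{1,2}
    (?<MOD_TIME_DAY> [1-3]?\d )
    [ ]
    (?:     (?<MOD_TIME_HHMM> [0-2]\d:[0-5]\d )
      | [ ] (?<MOD_TIME_YEAR> \d\d\d\d )
    )
}msx;

my $find_ls_re = qr{
    \A
    \s* (?<INODE>    \d+ )
    \s+ (?<BLOCKS>   \d+ )
    \s+ (?<PERMS>    [-dlcbsp] [-r] [-w] [-xs] [-r] [-w] [-xsS] [-r] [-w] [-xtT] )
    \s+ (?<LINKS>    \d+ )
    \s+ (?<OWNER>    \w+ )
    \s+ (?<GROUP>    \w+ )
    \s+ (?<SIZE>     \d+ )
    \s+ (?<MOD_TIME> $mod_re )
    \s+ (?<PATH>     .+? )
    \s*
    \Z
}msx;

# Hypothetical helper: parse one line and return a copy of the named captures.
sub parse_find_ls {
    my ($line) = @_;
    $line =~ $find_ls_re or die "Failed to match '$line'";
    my %fields = %+;    # only the groups that took part in the match
    return %fields;
}

my %dir = parse_find_ls(
    '37742599 0 drwxr-xr-x 73 bruce_pro staff 2336 Jul 31  2019 ./SS_20190828 - HPMoR'
);
print "$dir{MOD_TIME_MONTH} $dir{MOD_TIME_DAY}, $dir{MOD_TIME_YEAR}: $dir{PATH}\n";
```

A line with a clock time instead of a year fills `MOD_TIME_HHMM` and leaves `MOD_TIME_YEAR` out of the hash, so the caller can tell the two date forms apart without inspecting the string.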
  78. 48707625 0 drwxr-xr-x 38 bruce_pro staff 1216 Jun 11 09:32 ./SS_20200503
      37742599 0 drwxr-xr-x 73 bruce_pro staff also 2336 Jul 31 2019 ./SS_20190828 - HPMoR

      #use re 'debug'; # Uncomment to get a trace.
      $line =~ /$find_ls_re/ or die "Failed to match '$line'";

      18 < 0 > <drwxr-xr-x> | 5| 20:OPEN3 'PERMS'(22)
      18 < 0 > <drwxr-xr-x> | 5| 22:ANYOF[\-bcdlps](33)
      19 < 0 d> <rwxr-xr-x > | 5| 33:ANYOF[\-r](44)
  79. Regex Resources
      • http://regex.info/book.html
        The O'Reilly book "Mastering Regular Expressions" - Perl, Python, and more…even covers .NET !
      • https://www.rubyguides.com/2015/06/ruby-regex/
        Ruby-oriented
      • https://towardsdatascience.com/regular-expressions-explained-c9bce508e672
        Python-oriented
      • https://learning.oreilly.com/videos/understanding-regular-expressions/9781491996300
        Damian Conway's 5-hour video tutorial (subscriber-only)
  80. AoWhat?
      • AoA      Array of Arrays (or List of Lists)
      • AoH      Array of Hashes
      • HoA      Hash of Arrays
      • HoH      Hash of Hashes
      • AoHoHoA  Array of Hashes of Hashes of Arrays
  84. AoHoHoA
      • A Hospital
      • has multiple numbered floors
      • with multiple wards/units per floor
      • with multiple nurses per ward
      • with a list of patients, in room-number order
      • $hospital->[3]{'MedSurg'}{'Sarah'}[4];
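One way to picture the AoHoHoA is to build a tiny one and walk it. This is a sketch with made-up data: the patient names, the second nurse, and the `patients_on_floor` helper are all hypothetical; only the access expression `$hospital->[3]{'MedSurg'}{'Sarah'}[4]` comes from the slide.

```perl
use 5.010;
use strict;
use warnings;

# Floors (array) -> wards (hash) -> nurses (hash) -> patients (array).
# Autovivification creates the intermediate levels on first assignment.
my $hospital = [];
$hospital->[3]{'MedSurg'}{'Sarah'} = [ 'Ann', 'Bob', 'Cy', 'Dee', 'Eve' ];
$hospital->[3]{'MedSurg'}{'Tom'}   = [ 'Flo' ];

# Hypothetical helper: count every patient on one floor, across wards and nurses.
sub patients_on_floor {
    my ( $hosp, $floor ) = @_;
    my $count = 0;
    for my $ward ( values %{ $hosp->[$floor] // {} } ) {
        $count += @$_ for values %$ward;    # array in numeric context = its length
    }
    return $count;
}

say $hospital->[3]{'MedSurg'}{'Sarah'}[4];    # fifth patient in Sarah's list
say patients_on_floor( $hospital, 3 );
```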
  85. for my $i ( keys @name ) {
          say join ':', $name[$i], $addr[$i], $city[$i];
      }
  86. for my $i ( reverse keys @name ) {
          next if $name[$i] !~ /herman/i;
          splice @name, $i, 1;
          splice @addr, $i, 1;
          splice @city, $i, 1;
      }
  88. my @to_keep = grep { $name[$_] !~ /herman/i } keys @name;
      @name = @name[@to_keep];
      @addr = @addr[@to_keep];
      @city = @city[@to_keep];
  90. https://perldoc.perl.org/functions/keys.html
      keys HASH
      keys ARRAY
      Called in list context, returns a list consisting of all the keys of the named hash, or in Perl 5.12 or later only, the indices of an array.
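A short illustration of `keys` on an array, as described in that documentation excerpt. The sample names are placeholders chosen for this sketch.

```perl
use 5.012;    # keys on an array needs Perl 5.12 or later
use strict;
use warnings;

my @name = ( 'Herman Munster', 'Sherlock Holmes' );

my @indices   = keys @name;            # the indices, lowest first
my @backwards = reverse keys @name;    # highest first, safe for splicing while looping

say "@indices";
say "@backwards";
```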
  91. my @people_AoH = (
          {
              NAME => 'Herman Munster',
              ADDR => '1313 Mockingbird Lane',
              CITY => 'Mockingbird Heights',
          },
          {
              NAME => 'Sherlock Holmes',
              ADDR => '221b Baker Street',
              CITY => 'London',
          },
          # … many more people loaded from the file …
      );
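With each person held in one hash, removing the Hermans no longer needs index bookkeeping across three arrays. A possible sketch, using the two records from the slide; the `without_name` helper is hypothetical and not taken from the talk.

```perl
use strict;
use warnings;

my @people_AoH = (
    { NAME => 'Herman Munster',  ADDR => '1313 Mockingbird Lane', CITY => 'Mockingbird Heights' },
    { NAME => 'Sherlock Holmes', ADDR => '221b Baker Street',     CITY => 'London' },
);

# Hypothetical helper: drop every record whose NAME matches the pattern.
# Each record travels as a unit, so name, address, and city cannot drift apart.
sub without_name {
    my ( $re, @people ) = @_;
    return grep { $_->{NAME} !~ $re } @people;
}

@people_AoH = without_name( qr/herman/i, @people_AoH );
print "$_->{NAME}, $_->{CITY}\n" for @people_AoH;
```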
  95. half-blood prince
      Harry Potter
      "He accused me of being Dumbledore's man through and through.
      ... I told him I was."

      HPMoR
      Narrator
      And from that day onward, no matter what Hermione tried to tell anyone,
      it would be an accepted legend of Hogwarts that Harry Potter could make absolutely anything happen by snapping his fingers.

      HPMoR
      Narrator
      Either Harry Potter had thought of a lot of very good ideas very fast,
      or for some unimaginable reason he'd already spent a lot of time working out how to fight underwater.

      HPMoR
      Harry Potter
      Harry's Internal Critic promptly awarded him the All-Time Award
      for the Worst Acting in the History of Ever.

      wossnames thingummy
      The Potter
      M:"...they called me...do you know what they called me?"
      Flame and air? "The Sorting Hat called you brilliant," he said. "That's a Ravenclaw tie you're wearing. Where's your wand, Myrtle Smith?"

      wossnames thingummy
      The Potter
      "90% of a cup of coffee is the smell. And this is 200% coffee, Jamaican Blue Mountain."
      And it was working: Myrtle Smith was now available in colour.

      seventh horcrux
      Potter
      In retrospect, I have absolutely no idea how Horcruxes work.

      seventh horcrux
      Potter
      "Gryffindor."
      S.Hat: "But you would do so well in Slytherin" "I've already done well in Slytherin. Now I want to do well, in *Gryffindor*."

      seventh horcrux
      Potter
      "Hermione," I said sweetly, "Do you want to be friends?"
      Merlin bless the simple interactions of children.

      HPMoR
      Harry Potter (to Umbridge)
      "I make you this one offer," said the Boy-Who-Lived.
      "I never learn that you've been interfering with me or any of mine. And you never find out why the unkillable soul-eating monster is scared of me. Now sit down and shut up."

      HPMoR
      Greengrass
      The enemy is attacking Hogwarts students...
      And Hogwarts, is going to *fight* *back*.”
  96. HPMoR
      Harry Potter
      Harry's Internal Critic promptly awarded him the All-Time Award
      for the Worst Acting in the History of Ever.
  98. HPMoR                Greengrass                  The enemy is attacking Hogwarts students...
      HPMoR                Harry Potter                Harry's Internal Critic promptly awarded him the All-Time Award
      HPMoR                Harry Potter (to Umbridge)  "I make you this one offer," said the Boy-Who-Lived.
      HPMoR                Narrator                    And from that day onward, no matter what Hermione tried to tell anyone,
      HPMoR                Narrator                    Either Harry Potter had thought of a lot of very good ideas very fast,
      half-blood prince    Harry Potter                "He accused me of being Dumbledore's man through and through.
      seventh horcrux      Potter                      In retrospect, I have absolutely no idea how Horcruxes work.
      seventh horcrux      Potter                      "Gryffindor."
      seventh horcrux      Potter                      "Hermione," I said sweetly, "Do you want to be friends?"
      wossnames thingummy  The Potter                  M:"...they called me...do you know what they called me?"
      wossnames thingummy  The Potter                  "90% of a cup of coffee is the smell. And this is 200% coffee, Jamaican Blue Mountain."
  99. HPMoR
      Harry Potter
      Harry's Internal Critic promptly awarded him the All-Time Award
      for the Worst Acting in the History of Ever.

      Book Name
      Character Name
      One or more lines describing the Moment
  100. Book    Character       First line that described the Moment
       (Tabs)
       HPMoR   Harry Potter    Harry's Internal Critic promptly awarded h…
  102. # After writing @lines out to tempfile:
       @lines = `sort -nr -k 5,7 tempfile`;
       # …versus keeping it in Perl:
       @lines = sort { $b->[4] <=> $a->[4] or $b->[6] <=> $a->[6] } @lines;
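The in-Perl comparison can be exercised on a few rows to confirm the ordering. This is a sketch with invented data: each row is an array reference whose elements 4 and 6 are numeric, and `sort_desc` is a hypothetical wrapper around the same comparison block.

```perl
use strict;
use warnings;

# Hypothetical rows; only elements 4 and 6 matter to the comparison.
my @lines = (
    [ 'a', 'b', 'c', 'd', 10, 'x', 2 ],
    [ 'a', 'b', 'c', 'd', 30, 'x', 1 ],
    [ 'a', 'b', 'c', 'd', 10, 'x', 7 ],
);

# Descending by element 4; ties broken by element 6, also descending.
# Call in list context only.
sub sort_desc {
    return sort { $b->[4] <=> $a->[4] or $b->[6] <=> $a->[6] } @_;
}

print join( ' ', map { "$_->[4]/$_->[6]" } sort_desc(@lines) ), "\n";
```

The `or` only evaluates the second comparison when the first returns 0, which is what makes it act as a tie-breaker.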
  103. Remaining data looks like…
       •
       •
       • multiple (kept-in-order) Moments
       • multiple (kept-in-order) lines of text per Moment.
  104. HoHoAoA
       • clustered by Book
       • clustered by Character
       • multiple (kept-in-order) Moments
       • multiple (kept-in-order) lines of text per Moment.
  105. use 5.010;
       $/ = ''; # Paragraph mode
       my %book_char_lines_HoHoAoA;
       while (<>) {
           chomp;
           my ( $book, $character, @lines ) = split "\n";
           push @{ $book_char_lines_HoHoAoA{$book}{$character} }, [@lines];
       }
       for my $book ( sort keys %book_char_lines_HoHoAoA ) {
           for my $char ( sort keys %{ $book_char_lines_HoHoAoA{$book} } ) {
               for my $aref ( @{ $book_char_lines_HoHoAoA{$book}{$char} } ) {
                   say join "\t", $book, $char, $aref->[0];
               }
           }
           say '';
       }
  106. use 5.024;
       $/ = ''; # Paragraph mode
       my %book_char_lines_HoHoAoA;
       while (<>) {
           chomp;
           my ( $book, $character, @lines ) = split "\n";
           push $book_char_lines_HoHoAoA{$book}{$character}->@*, [@lines];
       }
       for my $book ( sort keys %book_char_lines_HoHoAoA ) {
           for my $char ( sort keys $book_char_lines_HoHoAoA{$book}->%* ) {
               for my $aref ( $book_char_lines_HoHoAoA{$book}{$char}->@* ) {
                   say join "\t", $book, $char, $aref->[0];
               }
           }
           say '';
       }
  107. my %book_char_lines_HoHoAoA;
       for 'cmoa.txt'.IO.slurp.split("\n\n") {
           my ( $book, $character, @lines ) = .split: "\n";
           push %book_char_lines_HoHoAoA{$book}{$character}, [@lines];
       }
       for %book_char_lines_HoHoAoA .keys.sort -> $book {
           for %book_char_lines_HoHoAoA{$book} .keys.sort -> $char {
               for %book_char_lines_HoHoAoA{$book}{$char}.list -> @lines {
                   say join "\t", $book, $char, @lines[0];
               }
           }
           say '';
       }
  108. my %book_char_lines_HoHoAoA;
       for 'cmoa.txt'.IO.slurp.split("\n\n") {
           my ( $book, $character, @lines ) = .split: "\n";
           push %book_char_lines_HoHoAoA{$book}{$character}, [@lines];
       }
       # More efficient. Less readable, or more readable?
       for %book_char_lines_HoHoAoA.sort -> (:key($book), :value(%book_hash )) {
           for %book_hash\ .sort -> (:key($char), :value(@char_array)) {
               for @char_array -> @lines {
                   say join "\t", $book, $char, @lines[0];
               }
           }
           say '';
       }
  109. Deep DS Resources
       • https://perldoc.perl.org/perldsc.html
         Perl: see also: perllol, perlref, and perlreftut.html
       • https://www.perlmonks.org/?node_id=172833
         Auto-Vivification Explanation
       • https://web.stanford.edu/class/archive/cs/cs106ap/cs106ap.1198/lectures/15-NestedCollections/15-Nested_Data_Structures.pdf
         Python: Nested DS starts at page 70
  110. Trees in the wild
       • Objects (Nodes)
       • Direction (Edges)
       • Hierarchy (Parent/Child)
       • Order (Siblings)
       • Navigation (Methods)
  113. <html><head><title>Doc 1</title></head>
       <body>
       Stuff <hr> 2000-08-17
       </body></html>

       • html
         • head
           • title
             • "Doc 1"
         • body
           • "Stuff"
           • hr
           • "2000-08-17"

                 html
                /    \
             head    body
              /     /  |  \
          title "Stuff" hr "2000-08-17"
            |
         "Doc 1"

       <html>
         <head>
           <title>
             Doc 1
           </title>
         </head>
         <body>
           Stuff
           <hr>
           2000-08-17
         </body>
       </html>
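The same tree can be written down as a nested Perl data structure and walked recursively. This is an illustrative sketch only: the `name`/`children` node layout and the `outline` function are assumptions chosen for the example, not the representation any particular HTML parser uses.

```perl
use 5.010;
use strict;
use warnings;

# Each node: a name plus an ordered list of child nodes (siblings keep their order).
my $tree = {
    name     => 'html',
    children => [
        {   name     => 'head',
            children => [
                {   name     => 'title',
                    children => [ { name => '"Doc 1"', children => [] } ],
                },
            ],
        },
        {   name     => 'body',
            children => [
                { name => '"Stuff"',      children => [] },
                { name => 'hr',           children => [] },
                { name => '"2000-08-17"', children => [] },
            ],
        },
    ],
};

# Depth-first walk: one indented line per node, parent before its children.
sub outline {
    my ( $node, $depth ) = @_;
    $depth //= 0;
    my @out = ( ( '  ' x $depth ) . $node->{name} );
    push @out, outline( $_, $depth + 1 ) for @{ $node->{children} };
    return @out;
}

say for outline($tree);
```

The output reproduces the bulleted outline: parent/child hierarchy becomes indentation, and sibling order is the order of the `children` array.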
  114.           html
                /    \
             head    body
              /     /  |  \
          title "Stuff" hr "2000-08-17"
            |
         "Doc 1"
  116. <div class="condition-wind small-6 medium-12 columns">
         <div class="wind-compass-wrap">
           <div class="wind-compass" style="transform:rotate(313deg);">
             <div class="dial">
               <div class="arrow-direction"> </div>
             </div>
           </div>
           <div class="wind-north"> N </div>
           <strong> 0 </strong>
         </div>
         <p class="ng-star-inserted">
           Gusts
           <strong>
             <span class="test-false wu-unit wu-unit-speed">
               <span class="wu-value wu-value-to" style="color:;"> 2 </span>
               <span class="wu-label">
                 <span class="ng-star-inserted"> mph </span>
               </span>
             </span>
           </strong>
         </div>
       </div>
  117. <div class="condition-wind small-6 medium-12 columns">
         <div class="wind-compass-wrap">
           <div class="wind-compass" style="transform:rotate(313deg);">
             <div class="dial">
               <div class="arrow-direction"> </div>
             </div>
           </div>
           <div class="wind-north"> N </div>
           <strong> 0 </strong>
         </div>
         <p class="ng-star-inserted">
           Gusts
           <strong>
             <span class="test-false wu-unit wu-unit-speed">
               <span class="wu-value wu-value-to" style="color:;"> 2 </span>
               <span class="wu-label">
                 <span class="ng-star-inserted"> mph </span>
               </span>
             </span>
           </strong>
         </div> <!-- XXX should close Paragraph not Div -->
       </div>
  120. <div class="condition-wind small-6 medium-12 columns">
         <div class="wind-compass-wrap">
           <div class="wind-compass" style="transform:rotate(313deg);">
             <div class="dial">
               <div class="arrow-direction"> </div>
             </div>
           </div>
           <div class="wind-north"> N </div>
           <strong> 0 </strong>
         </div>
         <p class="ng-star-inserted">
           Gusts
           <strong>
             <span class="test-false wu-unit wu-unit-speed">
               <span class="wu-value wu-value-to" style="color:;"> 2 </span>
               <span class="wu-label">
                 <span class="ng-star-inserted"> mph </span>
               </span>
             </span>
           </strong>
         </div>
  121. my $from_div_to_strong = /
           '<div ' <-[>]>* 'class="condition-wind' <-[>]>* '>'
           [ .+? '</div>' ] ** 6
       /;
       my $tag_or_nbsp = / '<' '/'? <-[>]>+ '>' | '&nbsp;' /;
       say 'weather.html'.IO.slurp\
           .match( $from_div_to_strong )\
           .subst( $tag_or_nbsp, ' ', :global )\
           .trans( ' ' => ' ', :squash )\
           .trim;
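For comparison with the Perl slides that follow, here is a rough Perl 5 rendering of the same regex approach. It is a sketch and not from the talk: it runs on a small inline string (the hypothetical `$html`) instead of weather.html, and it counts 2 closing `</div>` tags rather than 6 because the sample is shallower.

```perl
use strict;
use warnings;

# Hypothetical, shortened stand-in for the wind block in weather.html.
my $html = '<div class="condition-wind x"><div class="wind-north">N</div>'
         . '<strong>0</strong>&nbsp;<p>Gusts <strong>2 mph</strong></div></div>'
         . '<div class="other">ignored</div>';

# From the opening wind div through the 2nd closing </div>.
my ($fragment) = $html =~ m{
    ( <div \s [^>]* class="condition-wind [^>]* >
      (?: .+? </div> ){2}
    )
}xs or die "wind block not found";

# Replace every tag and &nbsp; with a space, squash runs of spaces, trim.
( my $text = $fragment ) =~ s{ </? [^>]+ > | &nbsp; }{ }gx;
$text =~ tr/ //s;
$text =~ s/^\s+|\s+$//g;

print "$text\n";    # N 0 Gusts 2 mph
```

The fragile part is the hard-coded count of closing tags: if the page gains or loses one nested div, the match silently covers the wrong span.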
  122. use Modern::Perl;
       use HTML::TokeParser;
       my $p = HTML::TokeParser->new('weather.html')
           or die "Can't open: $!";
       $p->empty_element_tags(1);
       my $in_wind = 0;
       my $level   = 0;
       my @texts;
       # this_loop_is_on_the_next_slide();
       s/&nbsp;/ /g for @texts;
       say @texts;
  123. while ( my $t = $p->get_token ) {
           my $type = $t->[0];
           if ( $type eq 'S' ) {
               my ( $tag, $attr ) = @{$t}[1,2];
               $in_wind = 1
                   if  $tag eq 'div'
                   and ($attr->{class} // '') =~ /^condition-wind /;
               $level++ if $in_wind;
           }
           elsif ( $type eq 'E' and $in_wind ) {
               my $tag = $t->[1];
               $level--;
               $in_wind = 0 if $level == 0;
           }
           elsif ( $type eq 'T' and $in_wind ) {
               push @texts, $t->[1];
           }
       }
  124. while ( my $t = $p->get_token ) {
           last if  $t->[0] eq 'S'
                and $t->[1] eq 'div'
                and ($t->[2]{class} // '') =~ /^condition-wind /;
       }
       my ( $level, @texts ) = (1);
       while ( my $t = $p->get_token ) {
           if    ( $t->[0] eq 'S' ) { $level++ }
           elsif ( $t->[0] eq 'E' ) { $level--; last if !$level; }
           elsif ( $t->[0] eq 'T' ) { push @texts, $t->[1]; }
       }
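The depth counting in the second loop can be tried without any HTML file. The sketch below is an illustration only: the token list is hand-written in the same `[ type, tag-or-text ]` shape the slide indexes into, not produced by HTML::TokeParser, and `collect_texts` is a hypothetical helper name. It assumes the first loop has already consumed the opening wind div, so the level starts at 1.

```perl
use strict;
use warnings;

# Collect text tokens until the element we are already inside is closed.
sub collect_texts {
    my (@tokens) = @_;
    my ( $level, @texts ) = (1);    # 1: already inside the wind div
    for my $t (@tokens) {
        if    ( $t->[0] eq 'S' ) { $level++ }
        elsif ( $t->[0] eq 'E' ) { $level--; last if !$level; }
        elsif ( $t->[0] eq 'T' ) { push @texts, $t->[1]; }
    }
    return @texts;
}

# Hand-written tokens (an assumption, not real parser output).
my @tokens = (
    [ S => 'div' ],    [ T => 'N' ], [ E => 'div' ],
    [ S => 'strong' ], [ T => '0' ], [ E => 'strong' ],
    [ E => 'div' ],                  # closes the wind div: level hits 0
    [ T => 'text after the wind block' ],
);

my @texts = collect_texts(@tokens);
print "@texts\n";    # N 0
```

Because only the nesting depth is tracked, the loop stops at whichever end tag brings the level to zero; it does not check that the tag names match.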
  138. use Modern::Perl;
       use HTML::TreeBuilder;
       my $tree = HTML::TreeBuilder->new_from_file('weather.html') or die;
       my $wind = $tree->look_down(
           _tag  => 'div',
           class => qr/^condition-wind /,
       );
  139. use Modern::Perl;
       use HTML::TreeBuilder;
       my $tree = HTML::TreeBuilder->new_from_file('weather.html') or die;
       my $wind = $tree->look_down(
           _tag  => 'div',
           class => qr/^condition-wind /,
       );
       $wind
  140. use Modern::Perl;
       use HTML::TreeBuilder;
       my $tree = HTML::TreeBuilder->new_from_file('weather.html') or die;
       my $wind = $tree->look_down(
           _tag  => 'div',
           class => qr/^condition-wind /,
       );
       $wind->as_trimmed_text
  141. use Modern::Perl;
       use HTML::TreeBuilder;
       my $tree = HTML::TreeBuilder->new_from_file('weather.html') or die;
       my $wind = $tree->look_down(
           _tag  => 'div',
           class => qr/^condition-wind /,
       );
       $wind->as_trimmed_text( extra_chars => '\xA0' );
  142. use Modern::Perl;
       use HTML::TreeBuilder;
       my $tree = HTML::TreeBuilder->new_from_file('weather.html') or die;
       my $wind = $tree->look_down(
           _tag  => 'div',
           class => qr/^condition-wind /,
       );
       say $wind->as_trimmed_text( extra_chars => '\xA0' );
  145. Tree Resources

    • https://metacpan.org/pod/HTML::Tree::Scanning
    • https://www.perlmonks.org/?node_id=153259
      Introduction to the more generic Tree::DAG_Node
    • https://perlhacks.com/2014/04/data-munging-perl/
      Book: Data Munging with Perl
      Now freely available for download!
    • https://docs.python-guide.org/scenarios/scrape/
    • https://stackoverflow.com/questions/14172028/html-parse-tree-using-python-2-7
      Answers also for Javascript
  146. Q&A

  147. Static Analysis

    • Lint (and other linters)
    • Scanners for style issues and common bugs:
      • Perl::Critic
      • RubyCritic
      • Rubocop
      • Prospector
    • Scanners for security issues:
      • Brakeman
      • Bandit
    • Assisted Refactoring
      • PyCharm
      • ReSharper
  148. Static Analysis

    • Lint (and other linters)
    • Scanners for style issues and common bugs:
      • Perl::Critic
      • RubyCritic
      • Rubocop
      • Prospector
    • Scanners for security issues:
      • Brakeman
      • Bandit
    • Assisted Refactoring
      • PyCharm
      • ReSharper
    • Automated Refactoring
      • Blue Tiger
  149. Copyright Information: Images

    • Camelia
      • (c) 2009 by Larry Wall
        http://github.com/perl6/mu/raw/master/misc/camelia.txt
    • Regular Expressions
      • © Randall Munroe
        https://xkcd.com/208/
    • Brantley Grayson Wolters
      • (c) 2011 by his mother, Amy Wolters
        Util 1.1.1 is my eldest grandchild
        (in RCS numbering)
  150. Copyright Information: This Talk

    This work is licensed under a Creative Commons Attribution 4.0 International License.
    CC BY https://creativecommons.org/licenses/by/4.0/
    (email me for the original Apple Keynote .key file)
  151. History

    • v 1.01 2020-06-12
      Presented at Southeast LinuxFest
      60 minutes with Q&A
    • v 1.02 2020-06-24
      Presented at Perl & Raku Conference in the Cloud
      50 minutes with Q&A
  152. Key knowledge (Perl)

    • List::Util       first max min sum all any none uniq shuffle
    • List::UtilsBy    max_by min_by count_by
    • perlfunc         Perl Functions by Category
    • Test::Tutorial   Intro to Automated Testing
    • Devel::Cover     Shows code that lacks testing
    • Benchmark        Performance comparisons
    • Devel::NYTProf   Performance profiler
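As a quick taste of the List::Util functions named above, the sketch below applies several of them to one small list. The data (`@gusts`) is made up for illustration, and `uniq` assumes List::Util 1.45 or later, which ships with Perl 5.26 and newer.

```perl
use strict;
use warnings;
use List::Util qw(first max min sum any all none uniq);

my @gusts = ( 2, 7, 4, 7, 0 );    # hypothetical mph readings

my $first_strong = first { $_ > 5 } @gusts;     # first element passing the test
my $highest      = max @gusts;
my $lowest       = min @gusts;
my $total        = sum @gusts;
my $has_calm     = any  { $_ == 0 } @gusts;     # true if at least one matches
my $all_mild     = all  { $_ < 10 } @gusts;     # true if every one matches
my $no_negative  = none { $_ < 0 }  @gusts;     # true if no element matches
my @distinct     = uniq @gusts;                 # duplicates dropped, order kept

print "first > 5: $first_strong, max: $highest, min: $lowest, sum: $total\n";
print "distinct: @distinct\n";
```

Each of these replaces a hand-written loop with a name that states the intent, which is most of the readability gain.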
  153. my @to_keep = grep { $name[$_] !~ /herman/i } keys @name;
       @name = @name[@to_keep];
       @addr = @addr[@to_keep];
       @city = @city[@to_keep];
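The slide above keeps three parallel arrays in step by slicing each one with the same index list. One alternative, sketched here as an assumption rather than as the talk's own code, is to hold each record in a single hash so that one `grep` filters everything at once. The sample names and addresses are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical parallel arrays, as on the slide.
my @name = ( 'Herman', 'Lily', 'Eddie' );
my @addr = ( '1313 Mockingbird Ln', '1313 Mockingbird Ln', '12 Oak St' );
my @city = ( 'Mockingbird Heights', 'Mockingbird Heights', 'Springfield' );

# Zip them into one array of hashes: one record per person.
my @people = map {
    +{ name => $name[$_], addr => $addr[$_], city => $city[$_] }
} 0 .. $#name;

# A single grep now removes a whole record; nothing can drift out of step.
my @kept = grep { $_->{name} !~ /herman/i } @people;

print "$_->{name}, $_->{city}\n" for @kept;
```

The leading `+` inside the `map` block tells Perl that the braces build an anonymous hash instead of opening a nested block.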
  154. <div class="condition-wind small-6 medium-12 columns">
         <div class="wind-compass-wrap">
           <div class="wind-compass" style="transform:rotate(313deg);">
             <div class="dial">
               <div class="arrow-direction">
           <div class="wind-north">
             "N"
           <strong>
             "0"
         <p class="ng-star-inserted">
           " Gusts "
           <strong>
             <span class="test-false wu-unit wu-unit-speed ng-star-inserted">
               <span class="wu-value wu-value-to" style="color:;">
                 "2"
               <span class="wu-label">
                 <span class="ng-star-inserted">
                   "mph"
  156. • TODO:
         • Add Regex history
         • Credit Damian on P6 Regex
         • Review all long slide text, and split