Refactoring_and_Readability_-_CRHS_-_Perl_Conference_20200624.pdf

Bruce Gray
June 24, 2020

Transcript

  1. Refactoring and Readability:
 Crouching Regex, Hidden Structures
 Perl & Raku

    Conference in the Cloud
 
 2020-06-24
  2. Refactoring and Readability 1.5:
 The Pre-Sequel
 http://speakerdeck.com/util
 <bruce.gray@acm.org>
  3. Refactoring and Readability 2:
 Crouching Regex, Hidden Structures

  4. Or

  5. Refactoring and Readability 1.5 :
 The Pre-Sequel

  6. /me
 
 Bruce Gray
 
 'Util'

  7. Perl 6 ==> Raku

  8. Face Off ( Rubin's Vase )

  9. Fun for the whole family • Regular Expressions (Regex) •

    Multi-dimensional Data Structures • Trees •
  13. Fun for the whole family • Regular Expressions (Regex) •

    Multi-dimensional Data Structures • Trees • Q & A + Bonus!!!
  14. Sinkholes in the space/code continuum

    or, how do you decide what to learn, when you cannot judge its value until *after* you learn it?
  15. Last time, on Dragonball Z

  16. Modes Fix Add Refactor

  17. Forces Surrounding Code

    [Diagram: Code at the center, surrounded by: Easy to write • Standard form • Ease of change • Easy to read • Performance • Boundaries of Responsibility]
  19. Regular Expressions
 Regex

  20. “Some people, when confronted with a problem, think

    ‘I know, I'll use regular expressions.’
    Now they have two problems.”
    –Jamie Zawinski
  21. Symbol   File Globbing              Regex

    .        Literal dot                Any single character
    ?        Any single character       Quantifier: Once or none
    *        Any string without '/'     Quantifier: Zero or more
    +                                   Quantifier: One or more
    {2,4}    Alternation: '2' or '4'    Quantifier: Two to four times
    (2|4)                               Alternation: '2' or '4'
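
The regex column of that comparison can be checked directly in Perl. This is a minimal sketch, assuming made-up file names; only the metacharacter meanings come from the slide.

```perl
use strict;
use warnings;

# Hypothetical file names, used only to illustrate the regex column.
my @names = ( 'a.png', 'ab.png', 'aXpng', 'aaa' );

# Regex reading: '.' is any single character, so 'aXpng' matches too.
my @dot_any     = grep { /\Aa.png\z/  } @names;
# Escaped dot: only a literal '.' matches, which is what a glob's '.' means.
my @dot_literal = grep { /\Aa\.png\z/ } @names;
# In a regex, '{2,4}' is a quantifier: two to four 'a' characters.
my @quantified  = grep { /\Aa{2,4}\z/ } @names;

print "dot as any char: @dot_any\n";
print "dot as literal:  @dot_literal\n";
print "a{2,4}:          @quantified\n";
```

The unescaped dot quietly accepts more than intended, which is why the screenshot pattern later in the deck escapes each dot.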
  22. ack https://beyondgrep.com/
    ag  https://github.com/ggreer/the_silver_searcher
    rg  https://blog.burntsushi.net/ripgrep/

  23. $ find . -not -type d

    ./SS_20200503/Screen Shot 2020-05-01 at 4.45.30 PM.png
    ./SS_20200503/Screen Shot 2020-04-28 at 1.29.14 PM.png
    ./SS_20200503/Screen Shot 2020-04-24 at 8.08.38 PM.png
    ./SS_20200503/Screen Shot 2020-04-28 at 2.35.18 PM.png
    ./SS_20200527/Screen Shot 2020-05-03 at 7.40.14 PM.png
    …
  26. ./SS_20200503/Screen Shot 2020-05-01 at 4.45.30 PM.png

  27. ./SS_20200503/Screen Shot 2020-05-01 at 4.45.30 PM.png

  28. ./SS_20200503/Screen Shot 2020-05-01 at 4.45.30 PM.png

  29. Screen Shot 2020-05-01 at 4.45.30 PM.png

  30. Screen Shot 2020-05-01 at 4.45.30 PM.png

  31. Screen Shot 2020-05-01 at 4.45.30 PM.png

  32. Screen Shot 2020-05-01 at 4.45.30 PM.png ^ ^ ^

  33. Screen Shot 2020-05-01 at 4.45.30 PM.png

  34. Screen Shot 2020-\d5-01 at 4.45.30 PM.png

  35. Screen Shot 2020-\d\d-01 at 4.45.30 PM.png

  36. Screen Shot 2020-\d\d-\d1 at 4.45.30 PM.png

  37. Screen Shot 2020-\d\d-\d\d at 4.45.30 PM.png

  38. Screen Shot 2020-\d\d-\d\d at \d.45.30 PM.png

  39. Screen Shot 2020-\d\d-\d\d at \d\.45.30 PM.png

  40. Screen Shot 2020-\d\d-\d\d at \d\.\d5.30 PM.png

  41. Screen Shot 2020-\d\d-\d\d at \d\.\d\d.30 PM.png

  42. Screen Shot 2020-\d\d-\d\d at \d\.\d\d\.30 PM.png

  43. Screen Shot 2020-\d\d-\d\d at \d\.\d\d\.\d0 PM.png

  44. Screen Shot 2020-\d\d-\d\d at \d\.\d\d\.\d\d PM.png

  45. Screen Shot 2020-\d\d-\d\d at \d\.\d\d\.\d\d PM\.png

  46. Screen Shot 2020-\d\d-\d\d at \d\.\d\d\.\d\d PM\.png$

  47. Screen Shot 2020-\d\d-\d\d at \d\.\d\d\.\d\d PM\.png$

  48. Screen Shot 2020-\d\d-\d\d at \d\.\d\d\.\d\d PM\.png$

  49. Screen Shot 2020-\d\d-\d\d at \d\.\d\d\.\d\d PM\.png$

  50. 'Screen Shot 2020-\d\d-\d\d at \d\.\d\d\.\d\d PM\.png$'

  51. $ find . -not -type d | ack -v \

    'Screen Shot 2020-\d\d-\d\d at \d\.\d\d\.\d\d PM\.png$'
  58. $ find . -not -type d | ack -v \

    'Screen Shot 2020-\d\d-\d\d at \d\.\d\d\.\d\d PM\.png$'
    ./SS_20190828/Screen Shot 2019-08-10 at 9.28.52 PM.png
    ./SS_20190828/Screen Shot 2019-08-15 at 8.59.19 PM.png
    …
  61. $ find . -not -type d | ack -v \

    'Screen Shot 20\d\d-\d\d-\d\d at \d\.\d\d\.\d\d PM\.png$'
  62. $ find . -not -type d | ack -v \

    'Screen Shot 20\d\d-\d\d-\d\d at \d\.\d\d\.\d\d PM\.png$'
    ./SS_20200503/Screen Shot 2020-04-27 at 7.14.42 AM.png
    ./SS_20190828/Screen Shot 2019-08-21 at 8.04.16 AM.png
    …
  65. $ find . -not -type d | ack -v \

    'Screen Shot 20\d\d-\d\d-\d\d at \d\.\d\d\.\d\d [AP]M\.png$'
    ./SS_20200503/Screen Shot 2020-04-27 at 7.14.42 AM.png
    ./SS_20190828/Screen Shot 2019-08-21 at 8.04.16 AM.png
    …
  66. $ find . -not -type d | ack -v \

    'Screen Shot 20\d\d-\d\d-\d\d at \d\.\d\d\.\d\d [AP]M\.png$'
  67. $ find . -not -type d | ack -v \

    'Screen Shot 20\d\d-\d\d-\d\d at \d\.\d\d\.\d\d [AP]M\.png$'
    ./SS_20200503/Screen Shot 2020-04-29 at 11.52.35 PM.png
    ./SS_20190828/Screen Shot 2019-07-31 at 10.14.57 AM.png
  70. $ find . -not -type d | ack -v \

    'Screen Shot 20\d\d-\d\d-\d\d at \d{}\.\d\d\.\d\d [AP]M\.png$'
    ./SS_20200503/Screen Shot 2020-04-29 at 11.52.35 PM.png
    ./SS_20190828/Screen Shot 2019-07-31 at 10.14.57 AM.png
  72. $ find . -not -type d | ack -v \

    'Screen Shot 20\d\d-\d\d-\d\d at \d{1,2}\.\d\d\.\d\d [AP]M\.png$'
    ./SS_20200503/Screen Shot 2020-04-29 at 11.52.35 PM.png
    ./SS_20190828/Screen Shot 2019-07-31 at 10.14.57 AM.png
  73. $ find . -not -type d | ack -v \

    'Screen Shot 20\d\d-\d\d-\d\d at \d{1,2}\.\d\d\.\d\d [AP]M\.png$'
  74. $ find . -not -type d | ack -v \

    'Screen Shot 20\d\d-\d\d-\d\d at \d{1,2}\.\d\d\.\d\d [AP]M\.png$'
    ./SS_20200111/color_wheel_2.numbers
    ./SS_20200111/color_wheel_1.numbers
    ./SS_20200131/RenFlorenceWalk.mp3
    ./SS_20200131/spa-barcelona-city.mp3
    ./SS_20200131/six_flags_20190112.txt
    ./SS_20190716/c_04.png
    ./SS_20190716/c_77_edge.png
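
The same filter can be written in Perl with the /x flag, which lets the finished pattern carry its own commentary. This is only a sketch: the helper name `non_screenshots` and the four sample paths are assumptions, and `\z` stands in for the `$` anchor used with ack.

```perl
use strict;
use warnings;

# The final pattern from the slides, spread out with /x for readability.
# Under /x a literal space must be written as [ ] (or escaped).
my $screenshot_re = qr{
    Screen [ ] Shot [ ]
    20\d\d - \d\d - \d\d      # date: YYYY-MM-DD
    [ ] at [ ]
    \d{1,2} \. \d\d \. \d\d   # time: H.MM.SS or HH.MM.SS
    [ ] [AP]M
    \.png \z
}x;

# Hypothetical helper: return the paths that do NOT look like screenshots,
# which is the job `ack -v` does in the pipeline.
sub non_screenshots {
    my @paths = @_;
    return grep { $_ !~ $screenshot_re } @paths;
}

my @sample = (
    './SS_20200503/Screen Shot 2020-05-01 at 4.45.30 PM.png',
    './SS_20190828/Screen Shot 2019-07-31 at 10.14.57 AM.png',
    './SS_20200131/RenFlorenceWalk.mp3',
    './SS_20190716/c_04.png',
);
print "$_\n" for non_screenshots(@sample);
```

Only the .mp3 and the plain .png survive the filter, matching the kind of leftovers shown in the final listing.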
  75. https://xkcd.com/208/

  76. None
  77. None
  78. Wait, forgot to escape a space. Wheeeeee[taptaptap]eeeeee.

  79. http://regex101.com/

  80. https://regexper.com/

  81. $ find . -not -type d | ack -v \

    'Screen Shot 20\d\d-\d\d-\d\d at \d{1,2}\.\d\d\.\d\d [AP]M\.png$'
    ./SS_20200111/color_wheel_2.numbers
    ./SS_20200111/color_wheel_1.numbers
    ./SS_20200131/RenFlorenceWalk.mp3
    ./SS_20200131/spa-barcelona-city.mp3
    ./SS_20200131/six_flags_20190112.txt
    ./SS_20190716/c_04.png
    ./SS_20190716/c_77_edge.png
  82. • Navigation • Extraction • Exploration • Validation •

  83. ftp://www13.warehouse.org/path/file-listing http://www.lexcorp.com/ http://www.umbrella.com/hive/plan_for_reopening

  84. my $end_of_prefix = -1;
    for my $scheme ( qw<https http ftp> ) {
        my $prefix = $scheme . '://';
        my $p_len  = length $prefix;
        if ( substr( $url, 0, $p_len ) eq $prefix ) {
            $end_of_prefix = $p_len;
            last;
        }
    }
    if ( $end_of_prefix == -1 ) {
        warn "Unexpected URL format: $url";
        next;
    }
    my $slash_pos = index $url, '/', $end_of_prefix;
    if ( $slash_pos == -1 ) {
        warn "Unexpected URL format: $url";
        next;
    }
    my $dot_pos = rindex $url, '.', $slash_pos;
    if ( $dot_pos == -1 ) {
        warn "Unexpected URL format: $url";
        next;
    }
    my $start = $end_of_prefix;
    if ( my $i = index substr( $url, $start, $dot_pos - $start ), '.' ) {
        $start += $i + 1 if $i != -1;
    }
    say substr $url, $start, $dot_pos - $start;
  85. my $url_re = qr{
        ^
        (https?|ftp) ://
        ([^/]+ \.)?
        ([^/]+)
        \.
        ([^/\.]+)
        /
    }msx;
    $url =~ /$url_re/
        or warn "Unexpected URL format: $url" and next;
    say $3;
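
A variation on that regex, sketched with named captures so the wanted piece is read as `$+{DOMAIN}` rather than `$3`. The capture names and the helper `domain_of` are assumptions for illustration, not part of the talk.

```perl
use strict;
use warnings;

# Same shape as the slide's regex, with names instead of numbers.
my $url_re = qr{
    \A
    (?<SCHEME> https? | ftp ) ://
    (?: [^/]+ \. )?            # optional leading host labels, e.g. "www."
    (?<DOMAIN> [^/.]+ )        # the part we want, e.g. "lexcorp"
    \.
    (?<TLD> [^/.]+ )           # e.g. "com"
    /
}x;

# Hypothetical helper: the domain name, or undef if the URL does not fit.
sub domain_of {
    my ($url) = @_;
    return $url =~ $url_re ? $+{DOMAIN} : undef;
}

for my $url ( 'ftp://www13.warehouse.org/path/file-listing',
              'http://www.lexcorp.com/' ) {
    print domain_of($url), "\n";
}
```

Adding or reordering groups no longer shifts the numbers that later code depends on.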
  86. • Navigation • Extraction • Exploration • Validation •

  87. • Navigation • Extraction • Exploration • Validation • Understanding

  88. find . -ls

  89. $ find . -ls

    48707625 0 drwxr-xr-x 38 bruce_pro staff 1216 Jun 11 09:32 ./SS_20200503
    37742599 0 drwxr-xr-x 73 bruce_pro staff 2336 Jul 31 2019 ./SS_20190828 - HPMoR
  90. 48707625 0 drwxr-xr-x 38 bruce_pro staff 1216 Jun 11 09:32 ./SS_20200503
    37742599 0 drwxr-xr-x 73 bruce_pro staff 2336 Jul 31 2019 ./SS_20190828 - HPMoR

    my ( $inode, $blocks, $perms, $links, $owner,
         $group, $size, $mod_time, $path ) = split /\s+/, $line;
  91. 48707625 0 drwxr-xr-x 38 bruce_pro staff 1216 Jun 11 09:32

    ./SS_20200503 37742599 0 drwxr-xr-x 73 bruce_pro staff 2336 Jul 31 2019 ./SS_20190828 - HPMoR ^^ ^^ my ( $inode, $blocks, $perms, $links, $owner, $group, $size, $mod_time, $path ) = split /\s+/, $line;
  92. 48707625 0 drwxr-xr-x 38 bruce_pro staff 1216 Jun 11 09:32 ./SS_20200503
    37742599 0 drwxr-xr-x 73 bruce_pro staff 2336 Jul 31 2019 ./SS_20190828 - HPMoR

    my ( $inode, $blocks, $perms, $links, $owner,
         $group, $size, $mod_time, $path ) = split /\s+/, $line, 9;
  93. 48707625 0 drwxr-xr-x 38 bruce_pro staff 1216 Jun 11 09:32

    ./SS_20200503 37742599 0 drwxr-xr-x 73 bruce_pro staff 2336 Jul 31 2019 ./SS_20190828 - HPMoR my ( $inode, $blocks, $perms, $links, $owner, $group, $size, $mod_time, $path ) = split /\s+/, $line;
  94. my @F = split /\s+/, $line, 9;
    shift @F if $F[0] eq '';
    my ( $inode, $blocks, $perms, $links, $owner,
         $group, $size, $mod_time, $path ) = @F;
  95. my @F = split /\s+/, $line, 9;
    if ( $F[0] eq '' ) {
        ( my $empty, @F ) = split /\s+/, $line, 10;
    }
    my ( $inode, $blocks, $perms, $links, $owner,
         $group, $size, $mod_time, $path ) = @F;
  96. $line =~ s{^\s+}{};
    my ( $inode, $blocks, $perms, $links, $owner,
         $group, $size, $mod_time, $path ) = split /\s+/, $line, 9;
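
The two behaviours of `split` that these slides lean on (an empty leading field when the line starts with whitespace, and a limit that keeps the remainder intact) can be seen in a small sketch. The helper `split_find_ls` is hypothetical, and it assumes the modification time occupies three whitespace-separated fields, so it uses a limit of 11 rather than the 9 shown on the slide.

```perl
use strict;
use warnings;

# Hypothetical helper: split one `find . -ls` line into fields.
# Assumption: month, day, and HH:MM-or-year are three separate fields,
# which makes the path field number 11.
sub split_find_ls {
    my ($line) = @_;
    $line =~ s{^\s+}{};                 # avoid an empty leading field
    return split /\s+/, $line, 11;      # the limit keeps spaces inside the path
}

my $line = ' 37742599 0 drwxr-xr-x 73 bruce_pro staff 2336 Jul 31  2019 ./SS_20190828 - HPMoR';
my @f = split_find_ls($line);
print scalar(@f), " fields; path is '$f[10]'\n";
```

Without the limit, the path `./SS_20190828 - HPMoR` would be broken into three pieces at its own spaces.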
  98. 48707625 0 drwxr-xr-x 38 bruce_pro staff 1216 Jun 11 09:32

    ./SS_20200503 37742599 0 drwxr-xr-x 73 bruce_pro staff 2336 Jul 31 2019 ./SS_20190828 - HPMoR $line =~ s{^\s+}{}; my ( $inode, $blocks, $perms, $links, $owner, $group, $size, $mod_time, $path ) = split /\s+/, $line, 9;
  99. $line =~ s{^\s+}{};
    my ( $inode, $blocks, $perms, $links, $owner,
         $group, $size, $mod_time, $path ) = split /\s+/, $line, 9;
    my ( $year, $hour, $minute );
    if ( substr($mod_time, 2, 1) eq ':' ) {
        ($hour, $minute) = split ':', $mod_time;
    }
    else {
        $year = $mod_time;
    }
  101. 48707625 0 drwxr-xr-x 38 bruce_pro staff 1216 Jun 11 09:32

    ./SS_20200503 37742599 0 drwxr-xr-x 73 bruce_pro staff also 2336 Jul 31 2019 ./SS_20190828 - HPMoR $line =~ s{^\s+}{}; my ( $inode, $blocks, $perms, $links, $owner, $group, $size, $mod_time, $path ) = split /\s+/, $line, 9; my ( $year, $hour, $minute ); if ( substr($mod_time, 2, 1) eq ':' ) { ($hour, $minute) = split ':', $mod_time; } else { $year = $mod_time }
  102. 48707625 0 drwxr-xr-x 38 bruce_pro staff 1216 Jun 11 09:32 ./SS_20200503
    37742599 0 drwxr-xr-x 73 bruce_pro staff 2336 Jul 31 2019 ./SS_20190828 - HPMoR

    my $mod_re = qr{
        (?<MOD_TIME_MONTH> Jan|Feb|Mar|Apr|May|Jun|
                           Jul|Aug|Sep|Oct|Nov|Dec)
        [ ]{1,2}
        (?<MOD_TIME_DAY>   [1-3]?\d)
        [ ]
        (?:
            (?<MOD_TIME_HHMM> [0-2]\d:[0-5]\d)
          | [ ](?<MOD_TIME_YEAR> \d\d\d\d)
        )
    }msx;
  108. my $re = qr{
        \A
        \s*
        (?<INODE>    \d+ ) \s+
        (?<BLOCKS>   \d+ ) \s+
        (?<PERMS>    [-dlcbsp] [-r] [-w] [-xs] [-r] [-w] [-xsS] [-r] [-w] [-xtT] ) \s+
        (?<LINKS>    \d+ ) \s+
        (?<OWNER>    \w+ ) \s+
        (?<GROUP>    \w+ ) \s+
        (?<SIZE>     \d+ ) \s+
        (?<MOD_TIME> $mod_re ) \s+
        (?<PATH>     .+? )
        \s*
        \Z
    }msx;
  120. $line =~ /$find_ls_re/
        or die "Failed to match '$line'";
    print $line if $+{SIZE} > 1024;
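
To see the named captures at work end to end, here is a self-contained sketch. It reuses the date sub-pattern from the slides, but simplifies PERMS to `\S+`; the helper `parse_find_ls` is hypothetical, and the second sample line assumes that `find -ls` pads the year with an extra space, which the year branch of the pattern requires.

```perl
use strict;
use warnings;

my $mod_re = qr{
    (?<MOD_TIME_MONTH> Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec )
    [ ]{1,2}
    (?<MOD_TIME_DAY> [1-3]?\d )
    [ ]
    (?: (?<MOD_TIME_HHMM> [0-2]\d:[0-5]\d )
      | [ ] (?<MOD_TIME_YEAR> \d\d\d\d )
    )
}x;

# Simplified for the sketch: PERMS is just \S+ here.
my $find_ls_re = qr{
    \A \s*
    (?<INODE>    \d+ ) \s+
    (?<BLOCKS>   \d+ ) \s+
    (?<PERMS>    \S+ ) \s+
    (?<LINKS>    \d+ ) \s+
    (?<OWNER>    \w+ ) \s+
    (?<GROUP>    \w+ ) \s+
    (?<SIZE>     \d+ ) \s+
    (?<MOD_TIME> $mod_re ) \s+
    (?<PATH>     .+? )
    \s* \z
}sx;

# Hypothetical helper: a hash ref of the named captures, or nothing.
sub parse_find_ls {
    my ($line) = @_;
    $line =~ $find_ls_re or return;
    my %captures = %+;       # copy before another match overwrites %+
    return \%captures;
}

my $rec = parse_find_ls(
    '37742599 0 drwxr-xr-x 73 bruce_pro staff 2336 Jul 31  2019 ./SS_20190828 - HPMoR'
);
print "$rec->{PATH} is $rec->{SIZE} bytes, modified in $rec->{MOD_TIME_YEAR}\n";
```

Only the alternative that actually matched shows up in `%+`, so a line with a time has `MOD_TIME_HHMM` and no `MOD_TIME_YEAR`.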
  121. my $re = qr{ \A \s* (?<INODE> \d+ ) \s+

    (?<BLOCKS> \d+ ) \s+ (?<PERMS> [-dlcbsp] [-r] [-w] [-xs] [-r] [-w] [-xsS] [-r] [-w] [-xtT] ) \s+ (?<LINKS> \d+ ) \s+ (?<OWNER> \w+ ) \s+ (?<GROUP> \w+ ) \s+ (?<SIZE> \d+ ) \s+ (?<MOD_TIME> $mod_re ) \s+ (?<PATH> .+? ) \s* \Z }msx;
  122. grammar find_dash_ls {
        rule  TOP    { \s* <inode> <blocks> <perms> <links> <owner> <group> <size> <modified> <path> }
        token inode  { \d+ }
        token blocks { \d+ }
        token perms  {
            <[\-dlcbsp]>
            <[\-r]> <[\-w]> <[\-xs]>
            <[\-r]> <[\-w]> <[\-xsS]>
            <[\-r]> <[\-w]> <[\-xtT]>
        }
        token links  { \d+ }
        token owner  { \w+ }
        token group  { \w+ }
        token size   { \d+ }
        token path   { \S .* }
        constant @days   = 1 .. 31;
        constant @months = <Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec>;
        token month  { @months }
        token day    { @days }
        token hour   { <[0..2]> \d }
        token minute { <[0..5]> \d }
        token year   { \d\d\d\d }
        rule modified { <month> <day> [ <year> || <hour>':'<minute> ] }
    }
  123. my $re = qr{ \A \s* (?<INODE> \d+ ) \s+

    (?<BLOCKS> \d+ ) \s+ (?<PERMS> [-dlcbsp] [-r] [-w] [-xs] [-r] [-w] [-xsS] [-r] [-w] [-xtT] ) \s+ (?<LINKS> \d+ ) \s+ (?<OWNER> \w+ ) \s+ (?<GROUP> \w+ ) \s+ (?<SIZE> \d+ ) \s+ (?<MOD_TIME> $mod_re ) \s+ (?<PATH> .+? ) \s* \Z }msx;
  124. 48707625 0 drwxr-xr-x 38 bruce_pro staff 1216 Jun 11 09:32

    ./SS_20200503 37742599 0 drwxr-xr-x 73 bruce_pro staff also 2336 Jul 31 2019 ./SS_20190828 - HPMoR
  125. 48707625 0 drwxr-xr-x 38 bruce_pro staff 1216 Jun 11 09:32

    ./SS_20200503 37742599 0 drwxr-xr-x 73 bruce_pro staff also 2336 Jul 31 2019 ./SS_20190828 - HPMoR $line =~ /$find_ls_re/ or die "Failed to match '$line'";
  126. 48707625 0 drwxr-xr-x 38 bruce_pro staff 1216 Jun 11 09:32 ./SS_20200503
    37742599 0 drwxr-xr-x 73 bruce_pro staff also 2336 Jul 31 2019 ./SS_20190828 - HPMoR

    #use re 'debug'; # Uncomment to get a trace.
    $line =~ /$find_ls_re/
        or die "Failed to match '$line'";

    18 < 0 > <drwxr-xr-x> | 5| 20:OPEN3 'PERMS'(22)
    18 < 0 > <drwxr-xr-x> | 5| 22:ANYOF[\-bcdlps](33)
    19 < 0 d> <rwxr-xr-x > | 5| 33:ANYOF[\-r](44)
  127. • Navigation • Extraction • Exploration • Validation • Understanding

  128. None
  129. None
  130. Regex Resources

    • http://regex.info/book.html
      The O'Reilly book "Mastering Regular Expressions" - Perl, Python, and more…even covers .NET !
    • https://www.rubyguides.com/2015/06/ruby-regex/
      Ruby-oriented
    • https://towardsdatascience.com/regular-expressions-explained-c9bce508e672
      Python-oriented
    • https://learning.oreilly.com/videos/understanding-regular-expressions/9781491996300
      Damian Conway's 5-hour video tutorial (subscriber-only)
  131. Multi-Dimensional Data Structures

  132. Single-Dimensional Data Structures

    • Array: List, Vector, Stack, Queue…
    • Hash: Map, Record, Dictionary…
  133. Single-Dimensional Data Structures

    • Array; List, Vector, Stack, Queue…
    • Hash; Map, Record, Dictionary…
  134. AoWhat?

    • AoA      Array of Arrays (or List of Lists)
    • AoH      Array of Hashes
    • HoA      Hash of Arrays
    • HoH      Hash of Hashes
    • AoHoHoA  Array of Hashes of Hashes of Arrays
  135. Atlanta.pm says: • Who could conceptualize more than 3 levels?

  138. Util 1.1.1

  139. AoHoHoA • A Hospital •

  140. AoHoHoA • A Hospital • has multiple numbered floors •

  141. AoHoHoA • A Hospital • has multiple numbered floors •

    with multiple wards/units per floor •
  142. AoHoHoA • A Hospital • has multiple numbered floors •

    with multiple wards/units per floor • with multiple nurses per ward •
  143. AoHoHoA • A Hospital • has multiple numbered floors •

    with multiple wards/units per floor • with multiple nurses per ward • with a list of patients, in room-number order •
  144. AoHoHoA • A Hospital • has multiple numbered floors •

    with multiple wards/units per floor • with multiple nurses per ward • with a list of patients, in room-number order • $hospital->[3]{'MedSurg'}{'Sarah'}[4];
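
The one-line access above reads more easily once the structure has been built by hand. A minimal sketch, in which the second nurse and every room label are made up for illustration:

```perl
use strict;
use warnings;

# All nurse names and room labels below are invented for this sketch.
my $hospital = [];                       # Array: floors, by number
$hospital->[3]{'MedSurg'}{'Sarah'} = [   # Hash: ward, Hash: nurse, Array: patients
    'room 301', 'room 302', 'room 305', 'room 307', 'room 310',
];
$hospital->[3]{'MedSurg'}{'Omar'} = [ 'room 303' ];

# The fifth patient (index 4) of nurse Sarah, in ward MedSurg, on floor 3.
my $patient = $hospital->[3]{'MedSurg'}{'Sarah'}[4];
print "$patient\n";

# Walking one level at a time makes each layer's type visible.
for my $ward ( sort keys %{ $hospital->[3] } ) {
    for my $nurse ( sort keys %{ $hospital->[3]{$ward} } ) {
        my $count = @{ $hospital->[3]{$ward}{$nurse} };
        print "floor 3, $ward, $nurse has $count patient(s)\n";
    }
}
```

Each bracket pair in the access corresponds to one letter of AoHoHoA, read left to right.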
  145. Herman Munster
    1313 Mockingbird Lane
    Mockingbird Heights
    Sherlock Holmes
    221b Baker Street
    London
    …
  146. @name = (
        'Herman Munster',
        'Sherlock Holmes',
        # …
    );
    @addr = …
    @city = …
  147. for my $i ( keys @name ) {
        say join ':', $name[$i], $addr[$i], $city[$i];
    }
  148. for my $i ( reverse keys @name ) {
        next if $name[$i] !~ /herman/i;
        splice @name, $i, 1;
        splice @addr, $i, 1;
        splice @city, $i, 1;
    }
  150. my @to_keep = grep { $name[$_] !~ /herman/i } keys @name;
    @name = @name[@to_keep];
    @addr = @addr[@to_keep];
    @city = @city[@to_keep];
  152. https://perldoc.perl.org/functions/keys.html

    keys HASH
    keys ARRAY

    Called in list context, returns a list consisting of all the keys of the named hash, or in Perl 5.12 or later only, the indices of an array.
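
A short sketch of `keys` on an array, using the two people from the earlier slide. It shows the indices that `keys` returns and then uses them to slice the three parallel arrays in step.

```perl
use strict;
use warnings;
use 5.012;    # `keys` on an array needs Perl 5.12 or later

my @name = ( 'Herman Munster', 'Sherlock Holmes' );
my @addr = ( '1313 Mockingbird Lane', '221b Baker Street' );
my @city = ( 'Mockingbird Heights', 'London' );

# keys @name yields the indices 0 .. $#name.
my @indices = keys @name;

# Keep only the indices whose name does not match, then slice each array.
my @to_keep = grep { $name[$_] !~ /herman/i } keys @name;
@name = @name[@to_keep];
@addr = @addr[@to_keep];
@city = @city[@to_keep];

say "indices: @indices";
say join ':', $name[0], $addr[0], $city[0];
```

Inside `grep` the current index is `$_`, so the test has to read `$name[$_]`.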
  153. @name = ( 'Herman Munster', 'Sherlock Holmes', # … );

    @addr = … @city = …
  154. my @people_AoH = (
        {
            NAME => 'Herman Munster',
            ADDR => '1313 Mockingbird Lane',
            CITY => 'Mockingbird Heights',
        },
        {
            NAME => 'Sherlock Holmes',
            ADDR => '221b Baker Street',
            CITY => 'London',
        },
        # … many more people loaded from the file …
    );
  157. my @to_keep = grep { $name[$_] !~ /herman/i } keys @name;
    @name = @name[@to_keep];
    @addr = @addr[@to_keep];
    @city = @city[@to_keep];
  158. @people_AoH = grep { $_->{NAME} !~ /herman/i } @people_AoH;

  159. @people_AoH .= grep: *.<NAME> !~~ m:i/herman/;
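
Put together as a runnable Perl sketch, with only the two records shown on the earlier slide, the array-of-hashes version looks like this. The hash slice in the last line is an addition for printing, not something from the talk.

```perl
use strict;
use warnings;

my @people_AoH = (
    { NAME => 'Herman Munster',  ADDR => '1313 Mockingbird Lane', CITY => 'Mockingbird Heights' },
    { NAME => 'Sherlock Holmes', ADDR => '221b Baker Street',     CITY => 'London' },
);

# One grep removes the whole record; there are no parallel arrays to keep in step.
@people_AoH = grep { $_->{NAME} !~ /herman/i } @people_AoH;

# A hash slice on each record pulls the fields out in a chosen order.
print join( ':', @{$_}{qw<NAME ADDR CITY>} ), "\n" for @people_AoH;
```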

  160. Harry Potter

  161. HPMoR: Harry Potter and the Methods of Rationality

  162. Seventh Horcrux

  163. Potter Who and the Wossname's Thingummy

  164. half-blood prince Harry Potter "He accused me of being Dumbledore's

    man through and through. ... I told him I was." HPMoR Narrator And from that day onward, no matter what Hermione tried to tell anyone, it would be an accepted legend of Hogwarts that Harry Potter could make absolutely anything happen by snapping his fingers. HPMoR Narrator Either Harry Potter had thought of a lot of very good ideas very fast, or for some unimaginable reason he'd already spent a lot of time working out how to fight underwater. HPMoR Harry Potter Harry's Internal Critic promptly awarded him the All-Time Award for the Worst Acting in the History of Ever. wossnames thingummy The Potter M:"...they called me...do you know what they called me?" Flame and air? "The Sorting Hat called you brilliant," he said. "That's a Ravenclaw tie you're wearing. Where's your wand, Myrtle Smith?" wossnames thingummy The Potter "90% of a cup of coffee is the smell. And this is 200% coffee, Jamaican Blue Mountain." And it was working: Myrtle Smith was now available in colour. seventh horcrux Potter In retrospect, I have absolutely no idea how Horcruxes work. seventh horcrux Potter "Gryffindor." S.Hat: "But you would do so well in Slytherin" "I've already done well in Slytherin. Now I want to do well, in *Gryffindor*." seventh horcrux Potter "Hermione," I said sweetly, "Do you want to be friends?" Merlin bless the simple interactions of children. HPMoR Harry Potter (to Umbridge) "I make you this one offer," said the Boy-Who-Lived. "I never learn that you've been interfering with me or any of mine. And you never find out why the unkillable soul-eating monster is scared of me. Now sit down and shut up." HPMoR Greengrass The enemy is attacking Hogwarts students... And Hogwarts, is going to *fight* *back*.”
  165. HPMoR Harry Potter Harry's Internal Critic promptly awarded him the

    All-Time Award for the Worst Acting in the History of Ever.
  166. half-blood prince Harry Potter "He accused me of being Dumbledore's

    man through and through. ... I told him I was." HPMoR Narrator And from that day onward, no matter what Hermione tried to tell anyone, it would be an accepted legend of Hogwarts that Harry Potter could make absolutely anything happen by snapping his fingers. HPMoR Narrator Either Harry Potter had thought of a lot of very good ideas very fast, or for some unimaginable reason he'd already spent a lot of time working out how to fight underwater. HPMoR Harry Potter Harry's Internal Critic promptly awarded him the All-Time Award for the Worst Acting in the History of Ever. wossnames thingummy The Potter M:"...they called me...do you know what they called me?" Flame and air? "The Sorting Hat called you brilliant," he said. "That's a Ravenclaw tie you're wearing. Where's your wand, Myrtle Smith?" wossnames thingummy The Potter "90% of a cup of coffee is the smell. And this is 200% coffee, Jamaican Blue Mountain." And it was working: Myrtle Smith was now available in colour. seventh horcrux Potter In retrospect, I have absolutely no idea how Horcruxes work. seventh horcrux Potter "Gryffindor." S.Hat: "But you would do so well in Slytherin" "I've already done well in Slytherin. Now I want to do well, in *Gryffindor*." seventh horcrux Potter "Hermione," I said sweetly, "Do you want to be friends?" Merlin bless the simple interactions of children. HPMoR Harry Potter (to Umbridge) "I make you this one offer," said the Boy-Who-Lived. "I never learn that you've been interfering with me or any of mine. And you never find out why the unkillable soul-eating monster is scared of me. Now sit down and shut up." HPMoR Greengrass The enemy is attacking Hogwarts students... And Hogwarts, is going to *fight* *back*.”
  167. HPMoR Greengrass The enemy is attacking Hogwarts students... HPMoR Harry

    Potter Harry's Internal Critic promptly awarded him the All-Time Award HPMoR Harry Potter (to Umbridge) "I make you this one offer," said the Boy-Who-Lived. HPMoR Narrator And from that day onward, no matter what Hermione tried to tell anyone, HPMoR Narrator Either Harry Potter had thought of a lot of very good ideas very fast, half-blood prince Harry Potter "He accused me of being Dumbledore's man through and through. seventh horcrux Potter In retrospect, I have absolutely no idea how Horcruxes work. seventh horcrux Potter "Gryffindor." seventh horcrux Potter "Hermione," I said sweetly, "Do you want to be friends?" wossnames thingummyThe Potter M:"...they called me...do you know what they called me?" wossnames thingummyThe Potter "90% of a cup of coffee is the smell. And this is 200% coffee, Jamaican Blue Mountain."
  168. HPMoR Harry Potter Harry's Internal Critic promptly awarded him the

    All-Time Award for the Worst Acting in the History of Ever. Book Name Character Name One or more lines describing the Moment
  169. Book Character First line that described the Moment Tabs HPMoR

    Harry Potter Harry's Internal Critic promptly awarded h…
  170. HPMoR Harry Potter Harry's Internal Critic promptly awarded him the

    All-Time Award for the Worst Acting in the History of Ever. Book Name Character Name One or more lines describing the Moment Book Character First line that described the Moment Tabs HPMoR Harry Potter Harry's Internal Critic promptly awarded h…
  171. idea from
 
 TVTropes.org

  172. SQLite FTW?

  173. GROUP BY book, character ORDER BY book, character

  174. # After writing @lines out to tempfile:
    @lines = `sort -nr -k 5,7 tempfile`;

    # …versus keeping it in Perl:
    @lines = sort {
           $b->[4] <=> $a->[4]
        or $b->[6] <=> $a->[6]
    } @lines;
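
A runnable sketch of the in-Perl multi-key sort, assuming rows that are array refs with numeric values at indices 4 and 6. The rows are invented, and the sketch does not claim to reproduce exactly what the external `sort` command would do with the same key range.

```perl
use strict;
use warnings;

# Hypothetical rows: each is an array ref with numeric values at
# indices 4 and 6 (the fifth and seventh fields).
my @lines = (
    [ 'a', 'x', 'x', 'x', 10, 'x', 1 ],
    [ 'b', 'x', 'x', 'x', 20, 'x', 5 ],
    [ 'c', 'x', 'x', 'x', 10, 'x', 9 ],
);

# Descending on the fifth field; ties broken by descending seventh field.
my @sorted = sort {
       $b->[4] <=> $a->[4]
    or $b->[6] <=> $a->[6]
} @lines;

print join( ' ', map { $_->[0] } @sorted ), "\n";
```

The `or` only evaluates the second comparison when the first returns 0, which is what makes it a tie-breaker.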
  175. Summary should be… • clustered by Book • clustered by

    Character • •
  176. Remaining data looks like… • • • multiple (kept-in-order) Moments

    • multiple (kept-in-order) lines of text per Moment.
  177. HoHoAoA • clustered by Book • clustered by Character •

    multiple (kept-in-order) Moments • multiple (kept-in-order) lines of text per Moment.
  178. %main{book}{char}[moment_num][line_num]

  179. %main{book}{char}[moment_num][line_num]

  180. Took a Level in Badass

  181. use 5.010;
    $/ = ''; # Paragraph mode
    my %book_char_lines_HoHoAoA;
    while (<>) {
        chomp;
        my ( $book, $character, @lines ) = split "\n";
        push @{ $book_char_lines_HoHoAoA{$book}{$character} }, [@lines];
    }
    for my $book ( sort keys %book_char_lines_HoHoAoA ) {
        for my $char ( sort keys %{ $book_char_lines_HoHoAoA{$book} } ) {
            for my $aref ( @{ $book_char_lines_HoHoAoA{$book}{$char} } ) {
                say join "\t", $book, $char, $aref->[0];
            }
        }
        say '';
    }
  182. use 5.024;
       $/ = '';    # Paragraph mode
       my %book_char_lines_HoHoAoA;
       while (<>) {
           chomp;
           my ( $book, $character, @lines ) = split "\n";
           push $book_char_lines_HoHoAoA{$book}{$character}->@*, [@lines];
       }
       for my $book ( sort keys %book_char_lines_HoHoAoA ) {
           for my $char ( sort keys $book_char_lines_HoHoAoA{$book}->%* ) {
               for my $aref ( $book_char_lines_HoHoAoA{$book}{$char}->@* ) {
                   say join "\t", $book, $char, $aref->[0];
               }
           }
           say '';
       }
  183. my %book_char_lines_HoHoAoA;
       for 'cmoa.txt'.IO.slurp.split("\n\n") {
           my ( $book, $character, @lines ) = .split: "\n";
           push %book_char_lines_HoHoAoA{$book}{$character}, [@lines];
       }
       for %book_char_lines_HoHoAoA .keys.sort -> $book {
           for %book_char_lines_HoHoAoA{$book} .keys.sort -> $char {
               for %book_char_lines_HoHoAoA{$book}{$char}.list -> @lines {
                   say join "\t", $book, $char, @lines[0];
               }
           }
           say '';
       }
  184. my %book_char_lines_HoHoAoA;
       for 'cmoa.txt'.IO.slurp.split("\n\n") {
           my ( $book, $character, @lines ) = .split: "\n";
           push %book_char_lines_HoHoAoA{$book}{$character}, [@lines];
       }
       # More efficient. Less readable, or more readable?
       for %book_char_lines_HoHoAoA.sort -> (:key($book), :value(%book_hash )) {
           for %book_hash\ .sort -> (:key($char), :value(@char_array)) {
               for @char_array -> @lines {
                   say join "\t", $book, $char, @lines[0];
               }
           }
           say '';
       }
  185. If that sounds good…

  186. Deep DS Resources
       • https://perldoc.perl.org/perldsc.html
         Perl: see also: perllol, perlref, and perlreftut.html
       • https://www.perlmonks.org/?node_id=172833
         Auto-Vivification Explanation
       • https://web.stanford.edu/class/archive/cs/cs106ap/cs106ap.1198/lectures/15-NestedCollections/15-Nested_Data_Structures.pdf
         Python: Nested DS starts at page 70
  187. Trees

  188. %main{book}{char}[moment_num][line_num]

  189. Trees in the wild • Objects (Nodes) • Direction (Edges)

    • Hierarchy (Parent/Child) • Order (Siblings) • Navigation (Methods)
  190. <html><head><title>Doc 1</title></head> <body> Stuff <hr> 2000-08-17 </body></html>

  191. <html><head><title>Doc 1</title></head> <body> Stuff <hr> 2000-08-17 </body></html>

       <html>
         <head>
           <title>
             Doc 1
           </title>
         </head>
         <body>
           Stuff
           <hr>
           2000-08-17
         </body>
       </html>
  192. <html><head><title>Doc 1</title></head> <body> Stuff <hr> 2000-08-17 </body></html>

       • html
         • head
           • title
             • "Doc 1"
         • body
           • "Stuff"
           • hr
           • "2000-08-17"

       <html>
         <head>
           <title>
             Doc 1
           </title>
         </head>
         <body>
           Stuff
           <hr>
           2000-08-17
         </body>
       </html>
  193. <html><head><title>Doc 1</title></head> <body> Stuff <hr> 2000-08-17 </body></html>

       • html
         • head
           • title
             • "Doc 1"
         • body
           • "Stuff"
           • hr
           • "2000-08-17"

                    html
                   /    \
               head      body
                /       /  |  \
            title "Stuff"  hr  "2000-08-17"
              |
           "Doc 1"

       <html>
         <head>
           <title>
             Doc 1
           </title>
         </head>
         <body>
           Stuff
           <hr>
           2000-08-17
         </body>
       </html>
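The outline and the tree above are the same structure, and it can be held in plain Perl. The sketch below uses hand-built hashes with made-up keys (`tag`, `text`, `children`) and hypothetical helpers `el`, `text`, and `walk`; it is not how any HTML parsing module stores its nodes.

```perl
use strict;
use warnings;

# Element nodes have a tag and an ordered list of children;
# text nodes carry only a string.
sub el   { my ( $tag, @kids ) = @_; return { tag => $tag, children => [@kids] } }
sub text { my ($string) = @_;       return { text => $string } }

my $tree = el( 'html',
    el( 'head', el( 'title', text('Doc 1') ) ),
    el( 'body', text('Stuff'), el('hr'), text('2000-08-17') ),
);

# Depth-first, pre-order walk: record the parent, then its children in order.
my @outline;
sub walk {
    my ( $node, $depth ) = @_;
    my $label = exists $node->{tag} ? $node->{tag} : qq{"$node->{text}"};
    push @outline, ( '  ' x $depth ) . $label;
    walk( $_, $depth + 1 ) for @{ $node->{children} || [] };
}
walk( $tree, 0 );

print "$_\n" for @outline;
```

The walk reproduces the bulleted outline: hierarchy comes from nesting, sibling order from array order, and navigation from the recursive call.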
  194.              html
                   /    \
               head      body
                /       /  |  \
            title "Stuff"  hr  "2000-08-17"
              |
           "Doc 1"
  196. <html><head><title>Doc 1</title></head> <body> Stuff <hr> 2000-08-17 </body></html>

  204. curl -o weather.html 'https://www.wunderground.com/weather/us/al/auburn'

  206. <div class="condition-wind small-6 medium-12 columns">
         <div class="wind-compass-wrap">
           <div class="wind-compass" style="transform:rotate(313deg);">
             <div class="dial">
               <div class="arrow-direction"> </div>
             </div>
           </div>
           <div class="wind-north"> N </div>
           <strong> 0 </strong>
         </div>
         <p class="ng-star-inserted"> Gusts
           <strong>
             <span class="test-false wu-unit wu-unit-speed">
               <span class="wu-value wu-value-to" style="color:;"> 2 </span>
               <span class="wu-label">
                 <span class="ng-star-inserted"> mph </span>
               </span>
             </span>
           </strong>
         </div>
       </div>
  207. <div class="condition-wind small-6 medium-12 columns">
         <div class="wind-compass-wrap">
           <div class="wind-compass" style="transform:rotate(313deg);">
             <div class="dial">
               <div class="arrow-direction"> </div>
             </div>
           </div>
           <div class="wind-north"> N </div>
           <strong> 0 </strong>
         </div>
         <p class="ng-star-inserted"> Gusts
           <strong>
             <span class="test-false wu-unit wu-unit-speed">
               <span class="wu-value wu-value-to" style="color:;"> 2 </span>
               <span class="wu-label">
                 <span class="ng-star-inserted"> mph </span>
               </span>
             </span>
           </strong>
         </div> <!-- XXX should close Paragraph not Div -->
       </div>
  210. <div class="condition-wind small-6 medium-12 columns">
         <div class="wind-compass-wrap">
           <div class="wind-compass" style="transform:rotate(313deg);">
             <div class="dial">
               <div class="arrow-direction"> </div>
             </div>
           </div>
           <div class="wind-north"> N </div>
           <strong> 0 </strong>
         </div>
         <p class="ng-star-inserted"> Gusts
           <strong>
             <span class="test-false wu-unit wu-unit-speed">
               <span class="wu-value wu-value-to" style="color:;"> 2 </span>
               <span class="wu-label">
                 <span class="ng-star-inserted"> mph </span>
               </span>
             </span>
           </strong>
         </div>
  211. use Modern::Perl;
       use File::Slurp;
       $_ = read_file('weather.html');
       s{\A.+<div [^>]*class="condition-wind[^>]*>((?:.+?</div>){6}).+\z}{$1}ms;
       s{</?[^>]+>}{ }msg;
       s{&nbsp;} { }msg;
       tr{ }{}s;
       s{^\s+}{};
       s{\s+$}{};
       say;
  217. N 0 Gusts 2 mph
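The clean-up steps that follow the first substitution can be tried on a small inline fragment. This is a minimal sketch, assuming the tag replacement is a single space (the transcript does not make that certain) and using a made-up fragment rather than the real page.

```perl
use strict;
use warnings;

# Invented fragment, loosely shaped like the wind block.
my $html = '<div class="wind-north"> N </div> <strong> 0 </strong>'
         . '<p> Gusts&nbsp;<strong><span> 2 </span><span> mph </span></strong></p>';

for ($html) {            # alias $_ to $html so the bare s/// and tr/// edit it
    s{</?[^>]+>}{ }g;    # every opening or closing tag becomes one space
    s{&nbsp;}{ }g;       # the non-breaking-space entity becomes a plain space
    tr{ }{}s;            # squeeze runs of spaces down to a single space
    s{^\s+}{};           # trim leading whitespace
    s{\s+$}{};           # trim trailing whitespace
}

print "$html\n";
```

None of these steps know anything about nesting; they only work because the first substitution has already cut the input down to the right region.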

  218. my $from_div_to_strong = /
           '<div ' <-[>]>* 'class="condition-wind' <-[>]>* '>'
           [ .+? '</div>' ] ** 6
       /;
       my $tag_or_nbsp = / '<' '/'? <-[>]>+ '>' | '&nbsp;' /;
       say 'weather.html'.IO.slurp\
           .match( $from_div_to_strong )\
           .subst( $tag_or_nbsp, ' ', :global )\
           .trans( ' ' => ' ', :squash )\
           .trim;
  219. use HTML::TokeParser;

  220. <strong class="Humongulus" > primary clank in this location </strong>

  224. <strong class="Humongulus"> primary clank in this location </strong>

  225. use Modern::Perl;
       use HTML::TokeParser;
       my $p = HTML::TokeParser->new('weather.html')
           or die "Can't open: $!";
       $p->empty_element_tags(1);
       my $in_wind = 0;
       my $level = 0;
       my @texts;
       # this_loop_is_on_the_next_slide();
       s/&nbsp;/ /g for @texts;
       say @texts;
  226. while ( my $t = $p->get_token ) {
           my $type = $t->[0];
           if ( $type eq 'S' ) {
               my ( $tag, $attr ) = @{$t}[1,2];
               $in_wind = 1 if $tag eq 'div'
                           and ($attr->{class} // '') =~ /^condition-wind /;
               $level++ if $in_wind;
           }
           elsif ( $type eq 'E' and $in_wind ) {
               my $tag = $t->[1];
               $level--;
               $in_wind = 0 if $level == 0;
           }
           elsif ( $type eq 'T' and $in_wind ) {
               push @texts, $t->[1];
           }
       }
  227. while ( my $t = $p->get_token ) {
           last if  $t->[0] eq 'S'
                and $t->[1] eq 'div'
                and ($t->[2]{class} // '') =~ /^condition-wind /;
       }
       my ( $level, @texts ) = (1);
       while ( my $t = $p->get_token ) {
           if    ( $t->[0] eq 'S' ) { $level++ }
           elsif ( $t->[0] eq 'E' ) { $level--; last if !$level; }
           elsif ( $t->[0] eq 'T' ) { push @texts, $t->[1]; }
       }
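The two-phase loop can be traced without the real page or a parser. The sketch below feeds a hand-written token list through the same logic; the tokens are invented and shortened to the fields the loops read (`['S', $tag, $attr]`, `['E', $tag]`, `['T', $text]`), and `shift @tokens` stands in for `$p->get_token`.

```perl
use strict;
use warnings;

# Hand-written stand-in for the token stream; real tokens carry extra fields.
my @tokens = (
    [ 'S', 'div', { class => 'other' } ],
    [ 'T', 'skipped' ],
    [ 'E', 'div' ],
    [ 'S', 'div', { class => 'condition-wind small-6' } ],
    [ 'S', 'div', {} ],
    [ 'T', 'N' ],
    [ 'E', 'div' ],
    [ 'S', 'strong', {} ],
    [ 'T', '0' ],
    [ 'E', 'strong' ],
    [ 'E', 'div' ],    # closes the condition-wind div: level returns to 0
    [ 'T', 'after' ],
);

# Phase 1: discard tokens up to and including the opening target div.
while ( my $t = shift @tokens ) {
    last if  $t->[0] eq 'S'
         and $t->[1] eq 'div'
         and ( $t->[2]{class} // '' ) =~ /^condition-wind /;
}

# Phase 2: collect text until the matching end tag brings the level back to 0.
my ( $level, @texts ) = (1);
while ( my $t = shift @tokens ) {
    if    ( $t->[0] eq 'S' ) { $level++ }
    elsif ( $t->[0] eq 'E' ) { $level--; last if !$level }
    elsif ( $t->[0] eq 'T' ) { push @texts, $t->[1] }
}

print "@texts\n";
```

Splitting the work into "find the start" and "collect until balanced" removes the `$in_wind` flag from the earlier version, at the cost of a second loop.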
  241. use HTML::TreeBuilder;

  242. use Modern::Perl;
       use HTML::TreeBuilder;
       my $tree = HTML::TreeBuilder->new_from_file('weather.html') or die;
  243. use Modern::Perl;
       use HTML::TreeBuilder;
       my $tree = HTML::TreeBuilder->new_from_file('weather.html') or die;
       my $wind = $tree->look_down(
  244. use Modern::Perl;
       use HTML::TreeBuilder;
       my $tree = HTML::TreeBuilder->new_from_file('weather.html') or die;
       my $wind = $tree->look_down(
           _tag => 'div',
  245. use Modern::Perl;
       use HTML::TreeBuilder;
       my $tree = HTML::TreeBuilder->new_from_file('weather.html') or die;
       my $wind = $tree->look_down(
           _tag  => 'div',
           class => qr/^condition-wind /,
       );
  246. use Modern::Perl;
       use HTML::TreeBuilder;
       my $tree = HTML::TreeBuilder->new_from_file('weather.html') or die;
       my $wind = $tree->look_down(
           _tag  => 'div',
           class => qr/^condition-wind /,
       );
       $wind
  247. use Modern::Perl;
       use HTML::TreeBuilder;
       my $tree = HTML::TreeBuilder->new_from_file('weather.html') or die;
       my $wind = $tree->look_down(
           _tag  => 'div',
           class => qr/^condition-wind /,
       );
       $wind->as_trimmed_text
  248. use Modern::Perl;
       use HTML::TreeBuilder;
       my $tree = HTML::TreeBuilder->new_from_file('weather.html') or die;
       my $wind = $tree->look_down(
           _tag  => 'div',
           class => qr/^condition-wind /,
       );
       $wind->as_trimmed_text( extra_chars => '\xA0' );
  249. use Modern::Perl;
       use HTML::TreeBuilder;
       my $tree = HTML::TreeBuilder->new_from_file('weather.html') or die;
       my $wind = $tree->look_down(
           _tag  => 'div',
           class => qr/^condition-wind /,
       );
       say $wind->as_trimmed_text( extra_chars => '\xA0' );
  250. N 0 Gusts 2 mph

  253. Micro-languages • Strings • Regex • Trees • XPath •

    JSON • jq • JSONPath
  254. Tree Resources
       • https://metacpan.org/pod/HTML::Tree::Scanning
       • https://www.perlmonks.org/?node_id=153259
         Introduction to the more generic Tree::DAG_Node
       • https://perlhacks.com/2014/04/data-munging-perl/
         Book: Data Munging with Perl
         Now freely available for download!
       • https://docs.python-guide.org/scenarios/scrape/
       • https://stackoverflow.com/questions/14172028/html-parse-tree-using-python-2-7
         Answers also for Javascript
  255. Q&A

  256. Refactoring and Readability 1.5:
 The Pre-Sequel http://speakerdeck.com/util
 
 <bruce.gray@acm.org> <<<

    >>> >>> <<<
  257. Bonus Round

  258. Static Analysis
       • Lint (and other linters)
       • Scanners for style issues and common bugs:
         • Perl::Critic
         • RubyCritic
         • Rubocop
         • Prospector
       • Scanners for security issues:
         • Brakeman
         • Bandit
       • Assisted Refactoring
         • PyCharm
         • ReSharper
  259. Static Analysis
       • Lint (and other linters)
       • Scanners for style issues and common bugs:
         • Perl::Critic
         • RubyCritic
         • Rubocop
         • Prospector
       • Scanners for security issues:
         • Brakeman
         • Bandit
       • Assisted Refactoring
         • PyCharm
         • ReSharper
       • Automated Refactoring
         • Blue Tiger
  261. Thanks!

  262. Copyrights

  263. Copyright Information: Images
       • Camelia
         • (c) 2009 by Larry Wall
           http://github.com/perl6/mu/raw/master/misc/camelia.txt
       • Regular Expressions
         • © Randall Munroe
           https://xkcd.com/208/
       • Brantley Grayson Wolters
         • (c) 2011 by his mother, Amy Wolters
           Util 1.1.1 is my eldest grandchild
           (in RCS numbering)
  264. Copyright Information: This Talk This work is licensed under a

    Creative Commons Attribution 4.0 International License. CC BY https://creativecommons.org/licenses/by/4.0/ (email me for the original Apple Keynote .key file)
  265. History
       • v 1.01 2020-06-12
         Presented at Southeast LinuxFest
         60 minutes with Q&A
       • v 1.02 2020-06-24
         Presented at Perl & Raku Conference in the Cloud
         50 minutes with Q&A
  266. Removed
 (Not presented, 
 but maybe worth reading)

  267. Key knowledge (Perl)
       • List::Util: first max min sum all any none uniq shuffle
       • List::UtilsBy: max_by min_by count_by
       • perlfunc: Perl Functions by Category
       • Test::Tutorial: Intro to Automated Testing
       • Devel::Cover: Shows code that lacks testing
       • Benchmark: Performance comparisons
       • Devel::NYTProf: Performance profiler
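A few of the List::Util functions from that list, shown on invented numbers. This is only a taste, assuming a List::Util recent enough to export `any` (bundled with Perl 5.20 and later).

```perl
use strict;
use warnings;
use List::Util qw(first max min sum any);

my @n = ( 3, 8, 1, 8, 5 );    # made-up sample data

my $first_big = first { $_ > 4 } @n;        # first element passing the test
my $max       = max @n;                     # largest value
my $min       = min @n;                     # smallest value
my $sum       = sum @n;                     # total of all values
my $has_even  = any { $_ % 2 == 0 } @n;     # true if any element passes

print "first > 4: $first_big, max: $max, min: $min, sum: $sum\n";
print "has an even number\n" if $has_even;
```

Each call replaces a hand-written loop with a name that states the intent.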
  268. my @to_keep = grep { $name[$_] !~ /herman/i } keys @name;
       @name = @name[@to_keep];
       @addr = @addr[@to_keep];
       @city = @city[@to_keep];
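A worked run of this index-filter idiom on invented data. The sketch assumes three parallel arrays of equal length and uses `$_`, which `grep` sets to each index in turn; `keys` on an array needs Perl 5.12 or later.

```perl
use strict;
use warnings;
use 5.012;    # keys on an array

# Made-up parallel arrays: row i is ( $name[i], $addr[i], $city[i] ).
my @name = ( 'Alice',    'Herman',   'Bob' );
my @addr = ( '1 Elm St', '2 Oak St', '3 Pine St' );
my @city = ( 'Auburn',   'Boston',   'Chicago' );

# Indices of the rows to keep; inside grep, $_ is the current index.
my @to_keep = grep { $name[$_] !~ /herman/i } keys @name;

# Array slices apply the same index list to every parallel array,
# so the rows stay lined up.
@name = @name[@to_keep];
@addr = @addr[@to_keep];
@city = @city[@to_keep];

print join( ', ', $name[$_], $addr[$_], $city[$_] ), "\n" for keys @name;
```

Filtering indices once and slicing each array keeps the three lists in step, which separate `grep` calls on each array could not do.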
  269. <div class="condition-wind small-6 medium-12 columns">
         <div class="wind-compass-wrap">
           <div class="wind-compass" style="transform:rotate(313deg);">
             <div class="dial">
               <div class="arrow-direction">
           <div class="wind-north">
             "N"
           <strong>
             "0"
         <p class="ng-star-inserted">
           " Gusts "
           <strong>
             <span class="test-false wu-unit wu-unit-speed ng-star-inserted">
               <span class="wu-value wu-value-to" style="color:;">
                 "2"
               <span class="wu-label">
                 <span class="ng-star-inserted">
                   "mph"
  270. XXX If that sounds good…

  272. • TODO: • Add Regex history • Credit Damian on

    P6 Regex • Review all long slide text, and split
  273. Raku == 
 Perl 5 minus Warts plus Awesome