Slide 1

Slide 1 text

Sorting Whatever* in $LANG
 The Perl and Raku Conference
 
 2023-07-11

Slide 2

Slide 2 text

Sorting Whatever* in $LANG PLEASE DOWNLOAD THESE SLIDES! 
 http://speakerdeck.com/util
 <<< >>> >>> <<<

Slide 3

Slide 3 text

/me
 
 Bruce Gray


Slide 4

Slide 4 text

/me
 
 Bruce Gray
 
 'Util'

Slide 5

Slide 5 text

Sorting Whatever* in $LANG

Slide 6

Slide 6 text

Bead sort Bogo sort Bubble sort Circle sort Cocktail sort Comb sort Counting sort Cycle sort Gnome sort Heap sort Insertion sort Merge sort Pancake sort Patience sort Permutation sort Quick sort Radix sort Selection sort Shell sort Sleep sort Stooge sort Strand sort

Slide 7

Slide 7 text

Bead sort Bogo sort Bubble sort Circle sort Cocktail sort Comb sort Counting sort Cycle sort Gnome sort Heap sort Insertion sort Merge sort Pancake sort Patience sort Permutation sort Quick sort Radix sort Selection sort Shell sort Sleep sort Stooge sort Strand sort

Slide 8

Slide 8 text

https://rosettacode.org/wiki/Category:Sorting_Algorithms

Slide 9

Slide 9 text

No locale / collation

Slide 10

Slide 10 text

Order of Battle
• Intro and Summary
• Comparators (This is everything)
• Pitfalls (Foot-guns)
• Perl sorting (Clearer/Faster)
• Raku sorting (Turned up to 11)
• Q&A (Bonus: when not to sort)

Slide 11

Slide 11 text

Order of Battle
• Intro and Summary
• Comparators (This is everything)
• Pitfalls (Foot-guns)
• Perl sorting (Clearer/Faster)
• Raku sorting (Turned up to 11)
• Q&A (Bonus: when not to sort)
Story Story Story

Slide 12

Slide 12 text

Long, Long Ago in a $LANGUAGE far far away

Slide 13

Slide 13 text

void qsort( void * address_of_first_element, size_t number_of_elements, size_t width_of_each_element, int (*compar)(const void *, const void *) );

Slide 14

Slide 14 text

void qsort( void * address_of_first_element, size_t number_of_elements, size_t width_of_each_element, int (*compar)(const void *, const void *) );

Slide 15

Slide 15 text

void qsort( void * address_of_first_element, size_t number_of_elements, size_t width_of_each_element, int (*compar)(const void *, const void *) );

Slide 16

Slide 16 text

void qsort( void * address_of_first_element, size_t number_of_elements, size_t width_of_each_element, int (*compar)(const void *, const void *) );

Slide 17

Slide 17 text

What is first?
green,blue
blue,green

Slide 18

Slide 18 text

Answer the Q
• How will the Q be asked
• How will we determine the answer
• How we will speak (return) the answer

Slide 19

Slide 19 text

-1 0 1 Worldwide Standard

Slide 20

Slide 20 text

$a , $b Fast, but weird

Slide 21

Slide 21 text

@evens = map { $_ * 2 } 0 .. 9;
@odds = grep { $_ % 2 } 0 .. 20;
# One alias: $_
@sorted = sort { $a <=> $b } (4,4,8,5,2);
# Two aliases: $a and $b

Slide 22

Slide 22 text

A comes before B   Less   <    lt
A comes after B    More   >    gt
A is same as B     Same   ==   eq

Slide 23

Slide 23 text

A comes before B   Less   <    lt
A is same as B     Same   ==   eq
A comes after B    More   >    gt

Slide 24

Slide 24 text

<      ==     >
lt     eq     gt
Less   Same   More
-1     0      1

Slide 25

Slide 25 text

die unless '7' lt '8';
die unless '8' lt '9';
die unless '9' lt '10'; # Died
7 8 9 10

Slide 26

Slide 26 text

@sorted = sort { ($a lt $b) ? -1 : ($a gt $b) ? 1 : 0 } @names;

Slide 27

Slide 27 text

@sorted = sort { # 1 or 2 compares
    ($a lt $b) ? -1 :
    ($a gt $b) ?  1 :
                  0
} @names;

Slide 28

Slide 28 text

@sorted = sort { # 1 or 2 compares
    ($a lt $b) ? -1 :
    ($a gt $b) ?  1 :
                  0
} @names;
@sorted = sort { $a cmp $b } @names;

Slide 29

Slide 29 text

@sorted = sort { # 1 or 2 compares
    ($a lt $b) ? -1 :
    ($a gt $b) ?  1 :
                  0
} @names;
# only 1 compare!
@sorted = sort { $a cmp $b } @names;

Slide 30

Slide 30 text

<      ==     >      <=>
lt     eq     gt     cmp
Less   Same   More
-1     0      1      -1,0,1

Slide 31

Slide 31 text

@sorted = sort { $b <=> $a } @input;
@sorted = sort { $a <=> $b } @input;
@sorted = sort { $b cmp $a } @input;
@sorted = sort { $a cmp $b } @input;

Slide 32

Slide 32 text

@sorted = sort { $b <=> $a } @input;
@sorted = sort { $a <=> $b } @input;
@sorted = sort { $b cmp $a } @input;
@sorted = sort { $a cmp $b } @input;
@sorted = sort @input;
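A minimal sketch of the sort directions above, run on made-up input (the values in `@input` are this example's assumption, not from the talk). It suggests that a bare `sort` behaves like `{ $a cmp $b }`, and that swapping `$a` and `$b` reverses the order.

```perl
use strict;
use warnings;

# Hypothetical input, chosen so string order and numeric order differ.
my @input = (10, 9, 100, 1);

my @default  = sort @input;                 # no block: string comparison
my @str_asc  = sort { $a cmp $b } @input;   # explicit string comparison
my @num_asc  = sort { $a <=> $b } @input;   # numeric, ascending
my @num_desc = sort { $b <=> $a } @input;   # numeric, descending ($a and $b swapped)

print "default : @default\n";    # default : 1 10 100 9
print "str asc : @str_asc\n";    # str asc : 1 10 100 9
print "num asc : @num_asc\n";    # num asc : 1 9 10 100
print "num desc: @num_desc\n";   # num desc: 100 10 9 1
```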

Slide 33

Slide 33 text

Manufactured Columns Grouping, and Name Badges

Slide 34

Slide 34 text

my %colors = ( green => 1, blue => 2 );
@sorted = sort {
    $colors{$a} <=> $colors{$b}
        or $a cmp $b
} @input;

Slide 35

Slide 35 text

my %colors = ( green => 1, blue => 2 );
@sorted = sort {
    ($colors{$a} // 0) <=> ($colors{$b} // 0)
        or $a cmp $b
} @input;

Slide 36

Slide 36 text

my %colors = ( green => 1, blue => 2 );
@sorted = sort {
    ($colors{$a} // 99) <=> ($colors{$b} // 99)
        or $a cmp $b
} @input;
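A small runnable sketch of the manufactured-column idea, assuming a hypothetical `@input` that mixes known and unknown colors. Known colors sort by their rank in `%colors`. Unknown ones fall back to 99, so they land at the end and are ordered alphabetically by the `cmp` tie-breaker.

```perl
use strict;
use warnings;

my %colors = ( green => 1, blue => 2 );

# Hypothetical input: "red" and "amber" have no rank in %colors.
my @input = qw( red blue green amber blue );

my @sorted = sort {
    ( $colors{$a} // 99 ) <=> ( $colors{$b} // 99 )   # manufactured column first
        or $a cmp $b                                  # then the name itself
} @input;

print "@sorted\n";   # green blue blue amber red
```

Defaulting to 0 instead of 99 would move the unknown colors to the front rather than the back.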

Slide 37

Slide 37 text

In-place vs Streaming
Efficient vs Convenient

Slide 38

Slide 38 text

• Python
  • … 2.3: In-place `sort()`
  • 2.4+: Streaming `sorted()`
• Perl and Raku
  • All streaming, all the time

Slide 39

Slide 39 text

Stability

Slide 40

Slide 40 text

say for sort { substr($a,-1,1) cmp substr($b,-1,1) } qw;

Slide 41

Slide 41 text

@n = sort { ($a % 2) <=> ($b % 2) or $a <=> $b } @n;
@n = sort { ($a % 2) <=> ($b % 2) } sort { $a <=> $b } @n;
2 4 6 8 1 3 5 7 9
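A sketch comparing the two forms on shuffled input (the starting order of `@n` here is an assumption for illustration). The two-pass version only matches the one-pass version if the outer sort is stable, that is, if it keeps the numeric order from the inner sort among elements that tie on even/odd. That is an assumption about the perl in use.

```perl
use strict;
use warnings;

# Hypothetical shuffled input.
my @n = (9, 4, 7, 2, 5, 8, 1, 6, 3);

# One pass: evens before odds, numeric order as the tie-breaker.
my @one_pass = sort { ($a % 2) <=> ($b % 2) or $a <=> $b } @n;

# Two passes: numeric sort first, then group by even/odd.
# Correct only when the second sort preserves the order of ties.
my @two_pass = sort { ($a % 2) <=> ($b % 2) }
               sort { $a <=> $b } @n;

print "@one_pass\n";   # 2 4 6 8 1 3 5 7 9
print "@two_pass\n";   # 2 4 6 8 1 3 5 7 9 (given a stable sort)
```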

Slide 42

Slide 42 text

perl -MO=Concise -e … | grep sort
One sort:
7 <@> sort lKS*/INPLACE ->8
Two sorts:
9 <@> sort lKS* ->a
8 <@> sort lKM/NUM ->9

Slide 43

Slide 43 text

perl -MO=Concise -e 'print 42 - 1;'
6 <@> leave[1 ref] vKP/REFC ->(end)
1 <0> enter v ->2
2 <;> nextstate(main 1 -e:1) v:{ ->3
5 <@> print vK ->6
3 <0> pushmark s ->4
4 <$> const(IV 41) s/FOLD ->5
-e syntax OK

Slide 44

Slide 44 text

perl -MO=Concise -e 'print 42 - 1;'
6 <@> leave[1 ref] vKP/REFC ->(end)
1 <0> enter v ->2
2 <;> nextstate(main 1 -e:1) v:{ ->3
5 <@> print vK ->6
3 <0> pushmark s ->4
4 <$> const(IV 41) s/FOLD ->5
-e syntax OK
https://perldoc.perl.org/B::Concise

Slide 45

Slide 45 text

Orcish Maneuver

Slide 46

Slide 46 text

Orcish Maneuver Or - Cache

Slide 47

Slide 47 text

my @sorted = sort { expensive($a) <=> expensive($b) } @stuff;

Slide 48

Slide 48 text

my %h;
my @sorted = sort {
    ( $h{$a} //= expensive($a) )
        <=>
    ( $h{$b} //= expensive($b) )
} @stuff;
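A runnable sketch of the Orcish Maneuver with a call counter, to make the saving visible. The `expensive` function here is a hypothetical stand-in (it just returns the string length), and `@stuff` is made-up data. A `cmp` tie-breaker is added so the output order is fully determined.

```perl
use strict;
use warnings;

my $calls = 0;

# Hypothetical stand-in for a costly key computation.
sub expensive { my ($s) = @_; $calls++; return length $s }

my @stuff = qw( pear fig banana kiwi apple fig );

my %h;   # cache: value => computed key
my @sorted = sort {
    ( $h{$a} //= expensive($a) ) <=> ( $h{$b} //= expensive($b) )
        or $a cmp $b
} @stuff;

print "@sorted\n";         # fig fig kiwi pear apple banana
print "calls: $calls\n";   # calls: 5, one per distinct value
```

Using `//=` rather than `||=` means a cached key of 0 is not recomputed on every comparison.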

Slide 49

Slide 49 text

time perl -wE '
    my @L = `head -1000000 find_20230614.txt`;
    chomp @L;
    my %h;
    say( ( sort {
        ($h{$b} //= (-s($b) // 0)) <=> ($h{$a} //= (-s($a) // 0))
    } @L )[0] );
'
1m54s ==> 21s
21M `stat` ==> 1M

Slide 50

Slide 50 text

time perl -wE '
    my @L = `head -1000000 find_20230614.txt`;
    chomp @L;
    my %h;
    say( ( sort { -s($b) <=> -s($a) } @L )[0] );
'

Slide 51

Slide 51 text

time perl -wE '
    my @L = `head -1000000 find_20230614.txt`;
    chomp @L;
    my %h;
    say( ( sort { (-s($b) // 0) <=> (-s($a) // 0) } @L )[0] );
'

Slide 52

Slide 52 text

time perl -wE '
    my @L = `head -1000000 find_20230614.txt`;
    chomp @L;
    my %h;
    say( ( sort {
        ($h{$b} //= (-s($b) // 0)) <=> ($h{$a} //= (-s($a) // 0))
    } @L )[0] );
'

Slide 53

Slide 53 text

time perl -wE '
    my @L = `head -1000000 find_20230614.txt`;
    chomp @L;
    say $L[-1];
'
0.801s

Slide 54

Slide 54 text

There are only two hard problems in computer science: 
 cache expiration, naming things, and off-by-one errors.
 https://www.martinfowler.com/bliki/TwoHardThings.html

Slide 55

Slide 55 text

S. T.

Slide 56

Slide 56 text

S. T. Schwartzian Transform

Slide 57

Slide 57 text

Decorate - Sort - Undecorate

Slide 58

Slide 58 text

Decorate - Sort - Undecorate Schwartzian Transform

Slide 59

Slide 59 text

( Value9, Value5, … )
( [Calc9, Value9], [Calc5, Value5], … )
( [Calc1, Value1], [Calc2, Value2], … )
( Value1, Value2, … )

Slide 60

Slide 60 text

( Value9, Value5, … )                       Each element in the original list
( [Calc9, Value9], [Calc5, Value5], … )
( [Calc1, Value1], [Calc2, Value2], … )
( Value1, Value2, … )

Slide 61

Slide 61 text

( Value9, Value5, … )                       Each element in the original list
( [Calc9, Value9], [Calc5, Value5], … )     becomes a 2-element array
( [Calc1, Value1], [Calc2, Value2], … )
( Value1, Value2, … )

Slide 62

Slide 62 text

( Value9, Value5, … )                       Each element in the original list
( [Calc9, Value9], [Calc5, Value5], … )     becomes a 2-element array
( [Calc1, Value1], [Calc2, Value2], … )     that gets sorted using the key
( Value1, Value2, … )

Slide 63

Slide 63 text

( Value9, Value5, … )                       Each element in the original list
( [Calc9, Value9], [Calc5, Value5], … )     becomes a 2-element array
( [Calc1, Value1], [Calc2, Value2], … )     that gets sorted using the key
( Value1, Value2, … )                       and changed back to a single value

Slide 64

Slide 64 text

( Value9, Value5, … )                       Each element in the original list
    Map
( [Calc9, Value9], [Calc5, Value5], … )     becomes a 2-element array
( [Calc1, Value1], [Calc2, Value2], … )     that gets sorted using the key
( Value1, Value2, … )                       and changed back to a single value

Slide 65

Slide 65 text

( Value9, Value5, … )                       Each element in the original list
    Map
( [Calc9, Value9], [Calc5, Value5], … )     becomes a 2-element array
    Sort
( [Calc1, Value1], [Calc2, Value2], … )     that gets sorted using the key
( Value1, Value2, … )                       and changed back to a single value

Slide 66

Slide 66 text

( Value9, Value5, … )                       Each element in the original list
    Map
( [Calc9, Value9], [Calc5, Value5], … )     becomes a 2-element array
    Sort
( [Calc1, Value1], [Calc2, Value2], … )     that gets sorted using the key
    Map
( Value1, Value2, … )                       and changed back to a single value

Slide 67

Slide 67 text

time perl -wE '
    my @L = `head -1000000 find_20230614.txt`;
    chomp @L;
    @L = map  { $_->[1] }
         sort { $b->[0] <=> $a->[0] }
         map  {[ -s($_) // 0, $_ ]} @L;
    say $L[0];'
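A self-contained sketch of the same Map / Sort / Map pipeline that needs no file list. It assumes string length as a stand-in for the expensive key, and `@words` is made-up data. Read it bottom to top: decorate, sort on the decoration, undecorate.

```perl
use strict;
use warnings;

# Hypothetical data; length() stands in for an expensive key such as -s.
my @words = qw( pear fig banana kiwi apple );

my @sorted =
    map  { $_->[1] }                      # 3. undecorate: keep only the value
    sort { $b->[0] <=> $a->[0]            # 2. sort on the precomputed key, largest first
               or $a->[1] cmp $b->[1] }   #    tie-break on the value itself
    map  { [ length($_), $_ ] }           # 1. decorate: [key, value], key computed once
    @words;

print "@sorted\n";   # banana apple kiwi pear fig
```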

Slide 68

Slide 68 text

GRT

Slide 69

Slide 69 text

GRT Guttman-Rosler Transform

Slide 70

Slide 70 text

time perl -wE '
    my @L = `head -1000000 find_20230614.txt`;
    chomp @L;
    # "N" is big-endian, so the packed sizes compare correctly as strings
    @L = map { unpack "x[N] a*" }
         sort
         map { pack "N a*", -s($_) // 0, $_ } @L;
    say $L[0];'
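A self-contained sketch of the pack / sort / unpack idea, again assuming string length as a stand-in key and made-up `@words`. It packs the key with the big-endian `N` format so that the packed strings compare bytewise in numeric order. That lets the plain default `sort` do all the work with no comparator block.

```perl
use strict;
use warnings;

# Hypothetical data; length() stands in for the real key.
my @words = qw( pear fig banana kiwi apple );

my @sorted =
    map  { unpack 'x[N] a*', $_ }          # 3. skip the 4-byte key, keep the value
    sort                                   # 2. plain string sort on the packed records
    map  { pack 'N a*', length($_), $_ }   # 1. prefix each value with a fixed-width key
    @words;

print "@sorted\n";   # fig kiwi pear apple banana
```

Because the value follows the key inside each packed string, ties on the key are broken by the value for free. A 32-bit `N` key only suits keys that fit in 32 unsigned bits.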

Slide 71

Slide 71 text

• The more you know about your data, the more you can optimize. But then, the more you are hard-coding the data patterns into the design!
• If you know your data is always in order, except what you just pushed on the end, which is just before or after the prior tail, you can sort 2 elements. What happens when your pattern changes?
4 6 8 8 7
Technically in-spec! Fails due to the duplicate 8.
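A sketch of that two-element shortcut and how it breaks on the `4 6 8 8 7` case. The helper name `fix_tail` is hypothetical, invented for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: "sort" by putting only the last two elements in order.
sub fix_tail {
    my @list = @_;
    @list[-2, -1] = sort { $a <=> $b } @list[-2, -1] if @list >= 2;
    return @list;
}

my @ok     = fix_tail(4, 6, 8, 7);                 # pattern holds
my @broken = fix_tail(4, 6, 8, 8, 7);              # duplicate 8 breaks the pattern
my @full   = sort { $a <=> $b } (4, 6, 8, 8, 7);   # a real sort, for comparison

print "@ok\n";       # 4 6 7 8
print "@broken\n";   # 4 6 8 7 8   (not sorted)
print "@full\n";     # 4 6 7 8 8
```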

Slide 72

Slide 72 text

Raku

Slide 73

Slide 73 text

Raku Features Dovetail to Tiny Sorts

Slide 74

Slide 74 text

• WhateverCode*
• &[cmp] generic comparator
• $^a placeholder params
• +,~ context operators
• .arity introspection
• cmp on lists

Slide 75

Slide 75 text

Context Operators + - ~ ? ! //

Slide 76

Slide 76 text

$x = 0 + $y;
$x = '' . $y;
$x = !! $y;

Slide 77

Slide 77 text

$x = 0 + $y;
$x = '' . $y;
$x = !! $y;
Raku:
$x = +$y;
$x = ~$y;
$x = ?$y;

Slide 78

Slide 78 text

Code x 4

Slide 79

Slide 79 text

sub f1 ($x,$y,$z) { $x + $y * $z };        # Subroutine
my &f2 = -> $x,$y,$z { $x + $y * $z };     # Arrow block
my &f3 = { $^x + $^y * $^z };              # Placeholder params
my &f4 = * + * * * ;                       # Whatever Star
for &f1,&f2,&f3,&f4 -> Code $f {
    say ( $f.arity, $f.(8,9,10) );
}
# (3 98)
# (3 98)
# (3 98)
# (3 98)

Slide 80

Slide 80 text

Code Any Callable WhateverCode Block Routine Sub Method Macro Submethod

Slide 81

Slide 81 text

my &f3 = { $^x + $^y * $^z };   # Placeholder params
my &f4 = * + * * * ;            # Whatever Star

Slide 82

Slide 82 text

my &f4 = * + * * * ;            # Whatever Star
say @nums.grep({ $_ > 9 });

Slide 83

Slide 83 text

my &f4 = * + * * * ;            # Whatever Star
say @nums.grep({ $_ > 9 });
say @nums.grep( * > 9 );

Slide 84

Slide 84 text

my &f3 = { $^x + $^y * $^z };   # Placeholder params
my &f4 = * + * * * ;            # Whatever Star

Slide 85

Slide 85 text

my &f3 = { $^x + $^y * $^z }; Placeholder params

Slide 86

Slide 86 text

my &f3 = { $^x + $^y * $^z }; { substr $^string, $^endpoint }

Slide 87

Slide 87 text

my &f3 = { $^x + $^y * $^z };
{ substr $^string, $^endpoint }
sub ( $endpoint, $string ) {
    return substr $string, $endpoint;
}

Slide 88

Slide 88 text

        String   Generic   Numeric
Perl:   cmp                <=>
Raku:   leg      cmp       <=>

Slide 89

Slide 89 text

self mutating method call op

Slide 90

Slide 90 text

$n = $n + 42;

Slide 91

Slide 91 text

$n = $n + 42; $n += 42;

Slide 92

Slide 92 text

$n = $n + 42; $n += 42; $s = $s ~ 'xyz';

Slide 93

Slide 93 text

$n = $n + 42; $n += 42; $s = $s ~ 'xyz'; $s ~= 'xyz';

Slide 94

Slide 94 text

$n = $n + 42; $n += 42; $s = $s ~ 'xyz'; $s ~= 'xyz'; @L = @L.sort...

Slide 95

Slide 95 text

$n = $n + 42; $n += 42; $s = $s ~ 'xyz'; $s ~= 'xyz'; @L = @L.sort... @L .= sort...

Slide 96

Slide 96 text

insert placeholder explanation

Slide 97

Slide 97 text

@L .= sort({ $^b.IO.s <=> $^a.IO.s });

Slide 98

Slide 98 text

@L .= sort({ $^b.IO.s cmp $^a.IO.s });

Slide 99

Slide 99 text

@L .= sort({ $^b.IO.s cmp $^a.IO.s }); @L .= sort({ - .IO.s });

Slide 100

Slide 100 text

@L .= sort({ $^b.IO.s cmp $^a.IO.s }); @L .= sort({ - .IO.s }); @L .= sort: -*.IO.s;

Slide 101

Slide 101 text

@L .= sort: -*.IO.s;

Slide 102

Slide 102 text

@L .= sort: { -.IO.s, .IO.modified, $_ };

Slide 103

Slide 103 text

Further Reading
• https://perldoc.perl.org/functions/sort
  • Very detailed, but still mentions the defunct `use sort` pragma.
• https://perldoc.perl.org/sort
  • The old `use sort` pragma.
• https://rosettacode.org/wiki/Category:Sorting_Algorithms
  • All the sorts, in all the languages!!!!
• https://blogs.perl.org/users/bruce_gray/2023/02/twc-205-exclusive-third-or-first.html
  • Yves Orton's comment on Heaps - super-fast for Priority Queues!

Slide 104

Slide 104 text

Q & A

Slide 105

Slide 105 text

Sorting Whatever* in $LANG PLEASE DOWNLOAD THESE SLIDES! 
 http://speakerdeck.com/util
 <<< >>> >>> <<<

Slide 106

Slide 106 text

When and How to Not Sort

Slide 107

Slide 107 text

• Just Maximum, Minimum? - use an O(N) linear scan
  • Perl: use List::Util qw
  • Raku: .max / .min / .maxpairs / .minpairs
• Repeatedly need max/min after each array update?
  • Use Heap or Priority Queue
• Always needs to be in-order
  • Really depends on more details of your needs. Could be a module, or may need an external DB.
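A minimal sketch of the linear-scan alternative in Perl, using the core `List::Util` module. The import list (`max`, `min`, `reduce`) is this example's choice, and the data is made up.

```perl
use strict;
use warnings;
use List::Util qw( max min reduce );

# Hypothetical data.
my @sizes = (512, 4096, 17, 2048);

my $largest  = max @sizes;   # one pass, no sort
my $smallest = min @sizes;

# When "best" is not a plain number, reduce gives the same single pass.
my @names   = qw( pear fig banana kiwi );
my $longest = reduce { length($a) >= length($b) ? $a : $b } @names;

print "$largest $smallest $longest\n";   # 4096 17 banana
```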

Slide 108

Slide 108 text

Much Thanks
 to You All!

Slide 109

Slide 109 text

Copyrights

Slide 110

Slide 110 text

Copyright Information: Images and Video
• Camelia
  • © 2009 by Larry Wall
    http://github.com/perl6/mu/raw/master/misc/camelia.txt
• Sorts 2018 - Color Circle (The video before the first slide)
  • © 2018 by w0rthy
    https://www.youtube.com/watch?v=sVYtGyPiGik
    https://github.com/w0rthy

Slide 111

Slide 111 text

Copyright Information: This Talk
This work is licensed under a Creative Commons Attribution 4.0 International License.
CC BY https://creativecommons.org/licenses/by/4.0/
(email me for the original Apple Keynote .key file)

Slide 112

Slide 112 text

History • v 1.00 2023-07-11
 Presented final version to The Perl and Raku Conference in Toronto, ON, CA


Slide 113

Slide 113 text

To remove, or fit in

Slide 114

Slide 114 text

No content

Slide 115

Slide 115 text

le ge you would core-dump non-identical ties are needed before you care about tie-breakers? last-letter.