Slide 1
Slide 1 text
Advances in Multilingual Stemming on CPAN
Nick Patch
@nickpatch
Shutterstock
Slide 2
Slide 2 text
hacking → hack
hacker → hack
hacked → hack
hack → hack
Slide 10
Slide 10 text
gurgled → gurgl
Slide 12
Slide 12 text
gurgled → gurgl
gurgling → gurgl
Slide 14
Slide 14 text
gurgled → gurgl
gurgling → gurgl
gurgle → gurgl
Slide 16
Slide 16 text
stem("hacker")
Slide 17
Slide 17 text
stem("hacker") eq stem("hacking")
Slide 18
Slide 18 text
indexer(stem("hacker"))
Slide 19
Slide 19 text
indexer(stem("hacker"))
lookup(stem("hacking"))
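The index/lookup idea on the slides above can be sketched in plain Perl. This is a toy illustration, not code from the talk: `toy_stem`, `index_document`, and `lookup` are hypothetical names, and the suffix-stripping rule is a deliberately crude stand-in for a real stemmer.

```perl
use strict;
use warnings;

# Toy stemmer for illustration only: lowercases the word and strips a
# few English suffixes. It is NOT the algorithm of any CPAN module.
sub toy_stem {
    my ($word) = @_;
    $word = lc $word;
    $word =~ s/(?:ing|ed|er)\z// if length($word) > 4;
    return $word;
}

my %index;    # stem => list of document ids

# Index time: file each document under the stems of its words.
sub index_document {
    my ($doc_id, $text) = @_;
    my %seen;
    for my $word ($text =~ /\w+/g) {
        my $stem = toy_stem($word);
        push @{ $index{$stem} }, $doc_id unless $seen{$stem}++;
    }
    return;
}

# Query time: stem the query term the same way, then look it up.
sub lookup {
    my ($term) = @_;
    return @{ $index{ toy_stem($term) } // [] };
}

index_document(1, 'The hacker hacked all night');
index_document(2, 'Gardening tips');

my @hits = lookup('hacking');
print "hacking matches: @hits\n";
```

Because documents and queries pass through the same stemmer, a search for "hacking" finds a document that only contains "hacker" and "hacked".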
Slide 20
Slide 20 text
Lingua::Stem::Any
Slide 21
Slide 21 text
Lingua::Stem::Any
bg cs da de en eo es fa fi fr gl hu io it nl no pt ro ru sv tr
Slide 22
Slide 22 text
use Lingua::Stem::Any;

$stemmer = Lingua::Stem::Any->new(language => $language);
$stem    = $stemmer->stem($word);
Slide 23
Slide 23 text
Attributes
language
source
cache
exceptions
casefold
normalize
Slide 24
Slide 24 text
Methods
stem($word)
stem(@words)
stem_in_place(\@words)
Slide 25
Slide 25 text
Methods
languages
languages($source)
sources
sources($lang)
clear_cache
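A minimal sketch of the methods listed on the two slides above, assuming Lingua::Stem::Any is installed from CPAN. The language code 'en' and the word lists are illustrative choices, not taken from the talk, and the exact stems returned depend on the stemmer source the module selects.

```perl
use strict;
use warnings;
use Lingua::Stem::Any;

# 'en' is an illustrative choice; any supported language code works.
my $stemmer = Lingua::Stem::Any->new(language => 'en');

# One word in, one stem out.
my $stem = $stemmer->stem('hacking');

# A list of words in, a list of stems out.
my @stems = $stemmer->stem(qw( hacked hacking hacks ));

# Replace the words in an existing array with their stems.
my @words = qw( gurgled gurgling gurgle );
$stemmer->stem_in_place(\@words);

print "$stem\n";
print "@stems\n";
print "@words\n";

# Introspection: which languages and stemmer sources are available.
my @languages = $stemmer->languages;
my @sources   = $stemmer->sources;
print "languages: @languages\n";
print "sources: @sources\n";
```

`stem_in_place` avoids copying when a long word list only needs to exist in stemmed form.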
Slide 26
Slide 26 text
Lingua::Stem::UniNE::CS
Czech
Image by NuclearVacuum on Wikimedia Commons / CC BY-SA 3.0
Slide 27
Slide 27 text
Lingua::Stem::UniNE::CS
Czech
Lingua::Stem::UniNE::BG
Bulgarian
Image by NuclearVacuum on Wikimedia Commons / CC BY-SA 3.0
Slide 28
Slide 28 text
Lingua::Stem::UniNE::FA
Persian
Image by Mani1 on Wikimedia Commons / public domain
Slide 29
Slide 29 text
Lingua::Stem::Patch::EO
Esperanto
Image by Ionut Cojocaru on Wikimedia Commons / CC BY 3.0
Slide 30
Slide 30 text
Lingua::Stem::Patch::IO
Ido
Image by Ionut Cojocaru on Wikimedia Commons / CC BY 3.0
Slide 31
Slide 31 text
Lingua::Stem::TLH ?!
Klingon?!
Image by NASA and ESA / public domain
Slide 32
Slide 32 text
TODO
pl Polish
ar Arabic
bn Bengali
hi Hindi
mr Marathi
Slide 33
Slide 33 text
Nick Patch
@nickpatch
Shutterstock