Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
The Great Language Game
Search
Sponsored
·
SiteGround - Reliable hosting with speed, security, and support you can count on.
→
Lars Yencken
September 02, 2013
Programming
0
340
The Great Language Game
A brief introduction to the Great Language Game, given to the Melbourne Python User Group.
Lars Yencken
September 02, 2013
Tweet
Share
More Decks by Lars Yencken
See All by Lars Yencken
Linguistics, a whirlwind tour!
larsyencken
0
61
Pycon 2014 Recap
larsyencken
0
72
Nine months of food
larsyencken
0
290
Automation for web development
larsyencken
0
160
Scaling a web stack
larsyencken
4
200
Similarity metrics for Japanese kanji
larsyencken
0
87
Other Decks in Programming
See All in Programming
JPUG勉強会 OSSデータベースの内部構造を理解しよう
oga5
2
230
エラーログのマスキングの仕組みづくりに役立ったASTの話
kumoichi
0
110
Claude Code、ちょっとした工夫で開発体験が変わる
tigertora7571
0
200
CDIの誤解しがちな仕様とその対処TIPS
futokiyo
0
170
Go1.26 go fixをプロダクトに適用して困ったこと
kurakura0916
0
330
Ruby x Terminal
a_matsuda
7
580
API Platformを活用したPHPによる本格的なWeb API開発 / api-platform-book-intro
ttskch
1
120
朝日新聞のデジタル版を支えるGoバックエンド ー価値ある情報をいち早く確実にお届けするために
junkiishida
1
370
CSC307 Lecture 12
javiergs
PRO
0
460
LangChain4jとは一味違うLangChain4j-CDI
kazumura
1
150
Event Storming
hschwentner
3
1.3k
AWS×クラウドネイティブソフトウェア設計 / AWS x Cloud-Native Software Design
nrslib
4
380
Featured
See All Featured
[Rails World 2023 - Day 1 Closing Keynote] - The Magic of Rails
eileencodes
38
2.8k
SEO in 2025: How to Prepare for the Future of Search
ipullrank
3
3.3k
AI: The stuff that nobody shows you
jnunemaker
PRO
3
350
StorybookのUI Testing Handbookを読んだ
zakiyama
31
6.6k
Information Architects: The Missing Link in Design Systems
soysaucechin
0
810
For a Future-Friendly Web
brad_frost
183
10k
Balancing Empowerment & Direction
lara
5
930
Design in an AI World
tapps
0
160
Winning Ecommerce Organic Search in an AI Era - #searchnstuff2025
aleyda
1
1.9k
So, you think you're a good person
axbom
PRO
2
1.9k
世界の人気アプリ100個を分析して見えたペイウォール設計の心得
akihiro_kokubo
PRO
67
37k
Exploring the relationship between traditional SERPs and Gen AI search
raygrieselhuber
PRO
2
3.7k
Transcript
. . . .. . . . .. . .
. .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . . .. . . . .. . . . .. . The Great Language Game Lars Yencken Melbourne Python User Group September 2, 2013
. . . .. . . . .. . .
. .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . . .. . . . .. . . . .. . I’m a language geek
. . . .. . . . .. . .
. .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . . .. . . . .. . . . .. . I’m a human language geek
. . . .. . . . .. . .
. .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . . .. . . . .. . . . .. . The world has something like 7,000 languages
. . . .. . . . .. . .
. .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . . .. . . . .. . . . .. . The world has something like 7,000 languages So many!
. . . .. . . . .. . .
. .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . . .. . . . .. . . . .. . The world has something like 7,000 languages Too many to learn!
. . . .. . . . .. . .
. .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . . .. . . . .. . . . .. . But... with the help of a lil Python we can at least learn to tell the difference between languages
. . . .. . . . .. . .
. .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . . .. . . . .. . . . .. . Aside: langid.py Distinguish between languages in text form
. . . .. . . . .. . .
. .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . . .. . . . .. . . . .. . Aside: langid.py Distinguish between languages in text form ผู้สื่อข่าวไทยวิเคราะห์นโยบายผู้ขอลี้ภัยพรรคต่างๆ >>> import langid >>> langid.classify(l.encode(’utf8’)) (’th’, 1.0) >>> langid.classify(’¡Venga hombre!’) (’es’, 0.5726778160604622)
. . . .. . . . .. . .
. .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . . .. . . . .. . . . .. . First attempt: streaming radio ▶ There’s lots of internet radio out there! ▶ But it’s all in shitty old formats ▶ And Python support for decoding them all is not great ▶ Solution: sh module and mplayer ▶ Still too hard!
. . . .. . . . .. . .
. .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . . .. . . . .. . . . .. . Second attempt: scrape SBS ▶ Podcasts News podcasts in about 70 languages ▶ Good quality recordings! ▶ (Sometimes) daggy Australian accents ▶ Fetching: pyquery, requests and parse ▶ Processing audio: wave + sh wrapping avconv and mp3gain ▶ Success!
. . . .. . . . .. . .
. .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . . .. . . . .. . . . .. . Aside: sh Wraps shell calls like a boss! >>> from sh import ffmpeg >>> ffmpeg(’-i’, input_file, output_file) >>> from sh import mp3gain >>> mp3gain(’-r’, ’-k’, ’-t’, ’-s’, ’r’, sound_file)
. . . .. . . . .. . .
. .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . . .. . . . .. . . . .. . More about languages ▶ Wikipedia: manual data entry ▶ Freebase API: via requests
. . . .. . . . .. . .
. .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . . .. . . . .. . . . .. . End result: demo time!
. . . .. . . . .. . .
. .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . .. . . . .. . . . . .. . . . .. . . . . .. . . . .. . . . .. . Thanks http://greatlanguagegame.com/