The Great Language Game
Lars Yencken
September 02, 2013
A brief introduction to the Great Language Game, given to the Melbourne Python User Group.
Transcript

The Great Language Game
Lars Yencken
Melbourne Python User Group
September 2, 2013

I’m a language geek

I’m a human language geek

The world has something like 7,000 languages
So many!
Too many to learn!

But... with the help of a lil Python we can at least learn to tell the difference between languages

Aside: langid.py
Distinguish between languages in text form

>>> import langid
>>> # Thai headline, roughly: "Thai reporters analyse the parties' asylum seeker policies"
>>> l = u'ผู้สื่อข่าวไทยวิเคราะห์นโยบายผู้ขอลี้ภัยพรรคต่างๆ'
>>> langid.classify(l.encode('utf8'))
('th', 1.0)
>>> langid.classify('¡Venga hombre!')
('es', 0.5726778160604622)
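
To give a feel for what "distinguishing languages in text form" can involve, here is a toy character-trigram classifier. It is not langid.py's implementation, which ships a pretrained model; the function names, the tiny training sentences and the add-one smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngrams(text, n=3):
    """Return the overlapping character n-grams of text, padded with spaces."""
    padded = " " + text.lower() + " "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def train(samples):
    """Build one n-gram count profile per language.

    samples maps a language code to a list of example sentences
    (hypothetical toy data, far smaller than any real training set).
    """
    return {
        lang: Counter(g for sentence in sentences for g in ngrams(sentence))
        for lang, sentences in samples.items()
    }


def classify(text, profiles):
    """Pick the language whose profile gives text the highest log-probability."""
    vocab = set().union(*profiles.values())
    scores = {}
    for lang, counts in profiles.items():
        total = sum(counts.values())
        # Add-one smoothing so unseen n-grams do not zero out the score.
        scores[lang] = sum(
            math.log((counts[g] + 1) / (total + len(vocab)))
            for g in ngrams(text)
        )
    return max(scores, key=scores.get)


profiles = train({
    "en": ["the quick brown fox jumps over the lazy dog",
           "this is a simple sentence in english"],
    "es": ["el rápido zorro marrón salta sobre el perro perezoso",
           "esta es una frase sencilla en español"],
})
print(classify("the dog is lazy", profiles))
print(classify("el perro es perezoso", profiles))
```

With only two sentences per language this is fragile on anything but obvious inputs, which is a good reason to reach for a trained library in practice.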

First attempt: streaming radio
▶ There’s lots of internet radio out there!
▶ But it’s all in shitty old formats
▶ And Python support for decoding them all is not great
▶ Solution: sh module and mplayer
▶ Still too hard!

Second attempt: scrape SBS Podcasts
▶ News podcasts in about 70 languages
▶ Good quality recordings!
▶ (Sometimes) daggy Australian accents
▶ Fetching: pyquery, requests and parse
▶ Processing audio: wave + sh wrapping avconv and mp3gain
▶ Success!
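
The slides do not show how the standard-library wave module was used. One plausible use, sketched here as an assumption, is cutting a short clip out of a longer recording that has already been converted to WAV. The function name cut_clip and its parameters are hypothetical.

```python
import wave


def cut_clip(src_path, dst_path, start_s, duration_s):
    """Copy duration_s seconds of audio, starting at start_s, into a new WAV file."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        # Seek to the first frame of the clip, then read the frames we want.
        src.setpos(int(start_s * rate))
        frames = src.readframes(int(duration_s * rate))

    with wave.open(dst_path, "wb") as dst:
        # Reuse the source's channels, sample width and frame rate.
        # The frame count in the header is corrected when the file is closed.
        dst.setparams(params)
        dst.writeframes(frames)
```

For example, cut_clip('podcast.wav', 'sample.wav', 60, 20) would write a 20-second sample starting one minute in, assuming podcast.wav exists and is long enough.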

Aside: sh
Wraps shell calls like a boss!

>>> # input_file, output_file and sound_file are file paths defined elsewhere
>>> from sh import ffmpeg
>>> ffmpeg('-i', input_file, output_file)
>>> from sh import mp3gain
>>> mp3gain('-r', '-k', '-t', '-s', 'r', sound_file)

More about languages
▶ Wikipedia: manual data entry
▶ Freebase API: via requests

End result: demo time!

Thanks
http://greatlanguagegame.com/