Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Wiki[mp]edia data sources & the MediaWiki API
Search
Brianna Laugher
November 09, 2009
Technology
0
200
Wiki[mp]edia data sources & the MediaWiki API
Presented at Melbourne Hack Weekend, 2009.
Brianna Laugher
November 09, 2009
Tweet
Share
More Decks by Brianna Laugher
See All by Brianna Laugher
Realities of open source testing: Lessons learned from Adopt Pytest Month
pfctdayelise
0
210
Crowd funded free software
pfctdayelise
0
180
Dynamic visualisation in the IPython Notebook
pfctdayelise
0
240
Funcargs and other fun with pytest
pfctdayelise
0
260
Zookeepr: home grown conference management software
pfctdayelise
0
170
Why "gender" should be a text field
pfctdayelise
0
230
Distributed wikis
pfctdayelise
0
170
Neurosexism
pfctdayelise
0
290
Clash of the encyclopedias: is competition good for sharing?
pfctdayelise
0
170
Other Decks in Technology
See All in Technology
AWS IoT 超入門 2025
hattori
0
340
プレーリーカードを活用しよう❗❗デジタル名刺交換からはじまるイベント会場交流のススメ
tsukaman
0
150
アイテムレビュー機能導入からの学びと改善
zozotech
PRO
0
150
【Kaigi on Rails 事後勉強会LT】MeはどうしてGirlsに? 私とRubyを繋いだRail(s)
joyfrommasara
0
260
LLMアプリの地上戦開発計画と運用実践 / 2025.10.15 GPU UNITE 2025
smiyawaki0820
1
550
小学4年生夏休みの自由研究「ぼくと Copilot エージェント」
taichinakamura
0
700
incident_commander_demaecan__1_.pdf
demaecan
0
120
名刺メーカーDevグループ 紹介資料
sansan33
PRO
0
930
[Codex Meetup Japan #1] Codex-Powered Mobile Apps Development
korodroid
2
750
大規模サーバーレスAPIの堅牢性・信頼性設計 〜AWSのベストプラクティスから始まる現実的制約との向き合い方〜
maimyyym
9
4.6k
Contract One Engineering Unit 紹介資料
sansan33
PRO
0
8.8k
Vibe Coding Year in Review. From Karpathy to Real-World Agents by Niels Rolland, CEO Paatch
vcoisne
0
140
Featured
See All Featured
Intergalactic Javascript Robots from Outer Space
tanoku
273
27k
Gamification - CAS2011
davidbonilla
81
5.5k
Performance Is Good for Brains [We Love Speed 2024]
tammyeverts
12
1.2k
[RailsConf 2023] Rails as a piece of cake
palkan
57
5.9k
Distributed Sagas: A Protocol for Coordinating Microservices
caitiem20
333
22k
jQuery: Nuts, Bolts and Bling
dougneiner
65
7.9k
KATA
mclloyd
32
15k
The Art of Delivering Value - GDevCon NA Keynote
reverentgeek
16
1.7k
Become a Pro
speakerdeck
PRO
29
5.5k
Learning to Love Humans: Emotional Interface Design
aarron
274
41k
"I'm Feeling Lucky" - Building Great Search Experiences for Today's Users (#IAC19)
danielanewman
230
22k
Rails Girls Zürich Keynote
gr2m
95
14k
Transcript
Wiki[mp]edia data sources & the MediaWiki API Brianna Laugher for
#melhack November 2009
...
Wikipedia 13M articles total 3M+ articles in English 240+ languages
Simple English!
{{coord|37|48|49|S|144|57| 47|E|type:city_region:AU-VIC| display=inline,title}} stable.toolserver.org/geohack/ wiki.toolserver.org/view/GeoHack
{{Infobox Company |name = Lonely Planet |logo = |type =
[[United Kingdom|British]] [[Government-owned company|government-owned]] (subsidiary of [[BBC Worldwide]]) |genre = [[Guide book|Travel guides]] |foundation = 1972 |founder = Tony Wheeler<br /> Maureen Wheeler |location_city = [[Footscray, Victoria]] |location_country = [[Australia]] |location = |origins = |key_people = Matt Goldberg <small>(Global [[CEO]])</small> |area_served = Worldwide |industry = [[Multi media]] |products = Travel [[guidebook, digital applications, online travel community]] |services =
Wikimedia Commons commons.wikimedia.org Multilingual 5M+ files “Self-created”, PD, Flickr Predominantly
photographs, but also diagrams, maps, flags
None
Wiktionary 5M+ entries 170+ languages 13 languages > 100K entries
French biggest at 1.5M (English second at 1.4M)
JavaScript Wiktionary lookup plugin for third parties: http://bawolff.blogspot.com/2009/10/introducing- wiktionary-lookup-now-for.html http://en.wiktionary.org/wiki/Wiktionary:Parsing
Users Logs Pages, subpages, talk pages
Links, backlinks Templates Categories MediaWiki structure
MediaWiki markup The only thing that completely understands it is
MediaWiki :(
XML download.wikimedia.org OR Amazon Public Data Sets meta.wikimedia.org/wiki/ Data_dumps Database
dumps
DBpedia Community project extracting structured data from Wikipedia and making
it available Can download data sets or query them online Ontology++ e.g. dbpedia.org/page/Lonely_Planet
MediaWiki API mediawiki.org/wiki/API en.wikipedia.org/w/api.php Client libraries!
mwclient Python library for accessing MediaWiki APIs
None
toolserver.org Server for community-developed plugins, addons, extensions, stats and hacks
– tools Tools often explicitly implements implicit editing community standards (“community API”) Toolserver
TemplateTiger toolserver.org/~kolossos/templatetiger/ For a few dozen Wikipedia languages, & Wikimedia
Commons Lets you query templates very much like SQL
identi.ca/pfctdayelise
[email protected]
Thanks! Logos and screenshots may be copyright their
respective owners Slides are otherwise © Brianna Laugher