$30 off During Our Annual Pro Sale. View Details »
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Measuring Quality Content
Search
Adam Hyland
August 04, 2012
Research
2
79
Measuring Quality Content
Presentation to Wikimania 2012 on Article Feedback Tool statistics.
Adam Hyland
August 04, 2012
Tweet
Share
More Decks by Adam Hyland
See All by Adam Hyland
Here Comes (a significant fraction of) Everybody
protonk
0
78
Boston Data Swap: Data Vis Under Uncertainty
protonk
0
56
Why Nate Silver is Famous
protonk
1
120
Data Visualization under Uncertainty
protonk
0
770
Phillips Academy Wikipedia Introduction
protonk
0
90
Other Decks in Research
See All in Research
"主観で終わらせない"定性データ活用 ― プロダクトディスカバリーを加速させるインサイトマネジメント / Utilizing qualitative data that "doesn't end with subjectivity" - Insight management that accelerates product discovery
kaminashi
15
17k
さまざまなAgent FrameworkとAIエージェントの評価
ymd65536
1
360
Self-Hosted WebAssembly Runtime for Runtime-Neutral Checkpoint/Restore in Edge–Cloud Continuum
chikuwait
0
200
令和最新技術で伝統掲示板を再構築: HonoX で作る型安全なスレッドフロート型掲示板 / かろっく@calloc134 - Hono Conference 2025
calloc134
0
450
Multi-Agent Large Language Models for Code Intelligence: Opportunities, Challenges, and Research Directions
fatemeh_fard
0
120
機械学習と数理最適化の融合 (MOAI) による革新
mickey_kubo
1
440
Stealing LUKS Keys via TPM and UUID Spoofing in 10 Minutes - BSides 2025
anykeyshik
0
170
LLM-jp-3 and beyond: Training Large Language Models
odashi
1
720
AIスパコン「さくらONE」の オブザーバビリティ / Observability for AI Supercomputer SAKURAONE
yuukit
2
1k
論文紹介: ReGenesis: LLMs can Grow into Reasoning Generalists via Self-Improvement
hisaokatsumi
0
150
Neural Spatial Audio Processing for Sound Field Analysis and Control
skoyamalab
0
130
その推薦システムの評価指標、ユーザーの感覚とズレてるかも
kuri8ive
1
290
Featured
See All Featured
Stop Working from a Prison Cell
hatefulcrawdad
273
21k
Ruling the World: When Life Gets Gamed
codingconduct
0
94
The Mindset for Success: Future Career Progression
greggifford
PRO
0
190
Chrome DevTools: State of the Union 2024 - Debugging React & Beyond
addyosmani
9
1k
Jess Joyce - The Pitfalls of Following Frameworks
techseoconnect
PRO
1
25
Understanding Cognitive Biases in Performance Measurement
bluesmoon
32
2.8k
Redefining SEO in the New Era of Traffic Generation
szymonslowik
1
160
Effective software design: The role of men in debugging patriarchy in IT @ Voxxed Days AMS
baasie
0
170
Stewardship and Sustainability of Urban and Community Forests
pwiseman
0
70
GraphQLとの向き合い方2022年版
quramy
50
14k
Bridging the Design Gap: How Collaborative Modelling removes blockers to flow between stakeholders and teams @FastFlow conf
baasie
0
400
Evolving SEO for Evolving Search Engines
ryanjones
0
73
Transcript
Measuring Article Quality Peer Review and the Article Feedback Tool
Adam Hyland protonk @ en-wp
Look Familiar?
Maybe This Version?
None
Article Feedback Tool • Deployed in 2010 • Version 4
(the current version) ramped up in 2011 • Designed to offer an avenue for reader feedback • High volume of reader feedback
• 6 months of public data • 795,353 articles --
2,487,522 responses
Featured Articles (FA) • 3,599 articles (0.09% of all articles)
• 2,267 Featured Lists (FL) • Most rigorous peer review process on the English Wikipedia • Very sensitive to editor preferences • Some idiosyncrasies
Good Articles (GA) • 15,357 articles • Relatively rigorous peer
review (yes I know reasonable minds may disagree) • Less idiosyncratic than FA in some ways • Perhaps less dependent on editor preference
Data • Article name • Length (in bytes) • GA/FA
status (including former/not- promoted) • Some user data
None
Beyond Summaries • Reader ratings follow pageviews • Predominantly non-editors
• Popular articles: • Call of Duty • Justin Bieber • Jimmy Wales (avg. rating: 1.10585)
Power Laws Everywhere!
Classical(ish) Models • Logistic regression model supports a relationship between
rating and likelihood of FA/GA • Linear model does, but with a twist • Can’t escape Cambridge Endogeneity Police!
None
Data Mining • Predicting featured status from reader ratings and
minimal meta-data. • Bayesian classifier able to roughly predict featured status (with a high false positive rate)
But the system’s changing! • AFT v4 is a multi-category
quantitative measure • AFT v5 is, roughly, YES/NO • Is this a problem? • Frank Harrell and the perils of dichotomization.
Actual Reader Ratings
Another Look
For the skeptics
Information • We can imagine we might not lose information
in shifting to v5 • This is born out by the classifier, to some degree. • We don’t lose a lot of power when dichotomizing individual ratings
A Look Ahead • Really exciting! • Great compliment to
current research methods • Long exposures can help discover reader/editor divergence • Predictive analytics • Need more open data
Questions? • Of course you have questions! • All work
is or soon will be available on github under a free license • Full writeup on en-wp forthcoming