
Altmetrics at Mendeley

William Gunn
November 05, 2013

This presentation was given at the 2013 ASIST meeting in Montreal and addresses recent updates in altmetrics from an information science perspective, as well as what Mendeley's doing. Co-panelists were Stefanie Haustein, Jennifer Lin, Judit Bar-Ilan, and Stacy Konkiel.


  1. Two audiences
      • The information science community
        – What we know & what we’re still trying to understand
        – What we think the important questions are
      • The altmetrics community
        – Where Mendeley is going
        – What we think are the important things to address
  2. What we think we know
      • Where we are: discovery, but not assessment – we can describe, but not predict
      • What it means to correlate with citations
      • What we’re really measuring – attention: who listens to you and who you listen to
        – minus the people who are listened to, but don’t listen well
      • Literature-derived metrics are not enough!
  3. There is no gold standard
      • Amgen: 47 of 53 “landmark” oncology publications could not be reproduced
      • Bayer: 43 of 67 oncology & cardiovascular projects were based on contradictory results
      • Dr. John Ioannidis: of 432 publications purporting sex differences in hypertension, multiple sclerosis, or lung cancer, only one data set was reproducible
  4. “We didn’t see that a target is more likely to be validated if it was reported in ten publications or in two publications.” – Nature Reviews Drug Discovery 10, 712 (September 2011)
  5. “Either the results were reproducible and showed transferability in other models, or even a 1:1 reproduction of published experimental procedures revealed inconsistencies between published and in-house data.” – Nature Reviews Drug Discovery 10, 712 (September 2011)
  6. Building a reproducibility dataset
      • Mendeley and Science Exchange have started the Reproducibility Initiative
      • $1.3M grant from LJAF to the Initiative via the Center for Open Science
      • The 50 most highly cited & read papers from 2010, 2011, and 2012 will be replicated
      • Figshare & PLOS to host data & replication reports
  7. What we don’t know
      • What we can predict – we need to understand intent and imported or derived reputation
      • How to capture all mentions, even without direct identifiers – what skew is there, and what does it mean?
      • How to adjust for regional or cultural differences
  8. How to understand sources of variability
      • Collect the same set of metrics at different times, by different people, using different methods
      • This will inform the standards process & help information science practitioners capture provenance, do preservation, and give advice
  9. What are the important questions we aren’t asking yet?
      • Let’s get past the “ranking people by their Twitter followers” stuff
      • Tell us what we should be looking at and how you would like to be involved
  10. What people want to know about Mendeley
      • We realize what we do makes a big difference
        – ResearchGate and Academia.edu began to do more once we showed the potential
        – Researchers value our coverage and source neutrality
        – Many consume our data, even when it’s crappy
  11. Focusing on recommendations
      • Mendeley Suggest – personalized recommendations based on reading history
      • Related articles – relatedness based on document similarity
      • Recommender frameworks – implement recommendations as a service
      • Third-party recommender services – serve niche audiences
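The “related articles” bullet above rests on document similarity. As a generic illustration of that idea – not Mendeley’s actual recommender – here is a minimal TF-IDF / cosine-similarity sketch over a toy corpus (the documents and function names are invented for the example):

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors for a list of tokenized documents."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))  # document frequency
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = [
    "altmetrics measure attention to research outputs".split(),
    "citation counts measure attention to research papers".split(),
    "reproducibility of oncology experiments".split(),
]
vecs = tfidf_vectors(docs)
# The first two documents share vocabulary, so they score higher
# against each other than against the unrelated third one.
print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
```

Real systems add stemming, stop-word handling, and approximate nearest-neighbour indexes so that this kind of scoring scales to millions of documents.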
  12. Improving data quality
      • Research Catalog v2
        – better duplicate detection
        – stable readership numbers (they only increase)
        – canonical documents
      • API v2 – exposing more information
        – annotations
        – other events (what do you want to see?)
  13. Stability and Security
      • We are serious – adapting to and promoting changes in practice
      • Investing in building relationships with developers
      • A platform, not a silo
  14. TEAM Project: academic knowledge management solutions
      • Algorithms to determine the content similarity of academic papers
      • Performing text disambiguation and entity recognition to differentiate between and relate similar in-text entities and authors of research papers
      • Developing semantic technologies and semantic web languages with a focus on metadata integration/validation
      • Investigating profiling and user-analysis technologies, e.g. based on search logs and document interaction
      • We will also improve folksonomies and, through that, ontologies of text
      • Finally, tagging behaviour will be analysed to improve tag recommendations and strategies
      • http://team-project.tugraz.at/blog/
  15. Code Project: use case = mining research papers for facts to add to LOD repositories and lightweight ontologies
      • Crowd-sourcing-enabled semantic enrichment & integration techniques for integrating facts contained in unstructured information into the LOD cloud
      • Federated, provenance-enabled querying methods for fact discovery in LOD repositories
      • Web-based visual analysis interfaces to support human-based analysis, integration, and organisation of facts
      • http://code-research.eu/
  16. Semantics vs. Syntax
      • Language expresses semantics via syntax
      • Syntax is all a computer sees in a research article
      • How do we get to semantics? Topic modeling!
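To make the topic-modeling step concrete, here is a toy collapsed Gibbs sampler for latent Dirichlet allocation over an invented four-document corpus. It is a teaching sketch under simplifying assumptions (tiny vocabulary, fixed seed, symmetric priors) – not production code and not what any particular tool does internally:

```python
import random
from collections import defaultdict

def lda_gibbs(docs, n_topics=2, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for latent Dirichlet allocation."""
    rng = random.Random(seed)
    vocab_size = len({w for d in docs for w in d})
    # z[d][i] = current topic assignment of token i in document d
    z = [[rng.randrange(n_topics) for _ in d] for d in docs]
    ndk = [[0] * n_topics for _ in docs]               # doc-topic counts
    nkw = [defaultdict(int) for _ in range(n_topics)]  # topic-word counts
    nk = [0] * n_topics                                # tokens per topic
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            ndk[d][k] += 1; nkw[k][w] += 1; nk[k] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]  # remove the token from its current topic
                ndk[d][k] -= 1; nkw[k][w] -= 1; nk[k] -= 1
                # resample proportional to p(topic | all other assignments)
                weights = [(ndk[d][t] + alpha) * (nkw[t][w] + beta)
                           / (nk[t] + vocab_size * beta)
                           for t in range(n_topics)]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                ndk[d][k] += 1; nkw[k][w] += 1; nk[k] += 1
    return ndk, nkw

docs = [
    "citation citation altmetrics readership".split(),
    "readership altmetrics citation mention".split(),
    "tumor oncology assay tumor replication".split(),
    "assay oncology replication tumor".split(),
]
ndk, nkw = lda_gibbs(docs)
for t, counts in enumerate(nkw):
    top = sorted(counts, key=counts.get, reverse=True)[:3]
    print(f"topic {t}: {top}")
```

In practice one would reach for an off-the-shelf implementation (e.g. gensim or scikit-learn) with many more topics and either longer Gibbs sweeps or variational inference.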
  17. Subcategories of Comp. Sci.
      [Bar chart comparing subcategories of computer science – AI, HCI, Info Sci, Software Eng, Networks – on a 0–20% scale]