
Altmetrics at Mendeley

William Gunn
November 05, 2013


This presentation was given at the 2013 ASIS&T Annual Meeting in Montreal and addresses recent updates in altmetrics from an information science perspective, as well as what Mendeley is doing. Co-panelists were Stefanie Haustein, Jennifer Lin, Judit Bar-Ilan, and Stacy Konkiel.


Transcript

  1. Altmetrics at Mendeley
    William Gunn, Ph.D.
    Head of Academic Outreach
    Mendeley
    @mrgunn


  2. Two audiences
    • The information science community
    – What we know & what we’re still trying to understand
    – What we think the important questions are
    • The altmetrics community
    – Where Mendeley is going
    – What we think are the important things to address


  3. What we think we know
    • Where we are: discovery, but not assessment – we can describe, but not predict
    • What it means to correlate with citations (see the sketch below)
    • What we’re really measuring – attention
    – who listens to you and who do you listen to
    • minus the people who are listened to, but don’t listen well
    • Literature-derived metrics are not enough!
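
    The “correlate with citations” bullet above usually means a rank correlation between an altmetric (e.g. Mendeley reader counts) and citation counts for the same set of papers. A minimal sketch of that calculation, using invented numbers and SciPy (this is not a Mendeley analysis):

      # Toy example: rank correlation between reader counts and citation counts.
      # The numbers are made up purely to illustrate the calculation.
      from scipy.stats import spearmanr

      readers   = [120, 45, 300, 10, 75, 220]   # hypothetical Mendeley reader counts
      citations = [ 30, 12,  90,  2, 25,  40]   # hypothetical citation counts, same papers

      rho, p = spearmanr(readers, citations)
      print(f"Spearman rho = {rho:.2f} (p = {p:.3f})")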


  4. • Amgen: 47 of 53 “landmark” oncology publications could not be reproduced
    • Bayer: 43 of 67 oncology & cardiovascular projects were based on contradictory results
    • Dr. John Ioannidis: 432 publications purporting sex differences in hypertension, multiple sclerosis, or lung cancer; only one data set was reproducible
    There is no gold standard


  5. We didn’t see that a target is more likely to be validated if it was reported in ten publications or in two publications
    NATURE REVIEWS DRUG DISCOVERY 10, 712 (SEPTEMBER 2011)


  6. Either the results were reproducible and showed transferability in other models, or even a 1:1 reproduction of published experimental procedures revealed inconsistencies between published and in-house data
    NATURE REVIEWS DRUG DISCOVERY 10, 712 (SEPTEMBER 2011)


  7. Building a reproducibility dataset
    • Mendeley and Science Exchange have started the Reproducibility Initiative
    • $1.3M grant from LJAF to the Initiative via the Center for Open Science
    • 50 most highly cited & read papers from 2010, 2011, and 2012 will be replicated
    • Figshare & PLOS to host data & replication reports


  8. What we don’t know
    • What we can predict
    – need to understand intent, imported or derived reputation
    • How to capture all mentions, even without direct identifiers
    – what skew is there, and what does it mean
    • How to adjust for regional or cultural differences


  9. Cultural skew is important
    South America shows weak uptake of North American social media but strong uptake of Mendeley


  10. How to understand sources of variability
    • Collect the same set of metrics at different times, by different people, using different methods
    • This will inform the standards process & assist IS people with capturing provenance, doing preservation, and giving advice


  11. What are the important questions we aren’t asking yet?
    • Let’s get past the “ranking people by their Twitter followers” stuff
    • Tell us what we should be looking at and how you would like to be involved


  12. What people want to know about Mendeley
    • We realize what we do makes a big difference
    – RG/Academia began to do more once we showed the potential
    – Researchers value our coverage and source neutrality
    – Many consume our data, even when it’s crappy


  13. Focusing on recommendations
    • Mendeley Suggest
    – personalized recommendations based on reading history
    • related articles
    – relatedness based on document similarity (see the sketch below)
    • recommender frameworks
    – implement recommendations as a service
    • third-party recommender services
    – serve niche audiences
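
    A minimal sketch of “relatedness based on document similarity”, as referenced above. This is not Mendeley’s production recommender; it just shows the general idea of ranking papers by text similarity, here TF-IDF vectors and cosine similarity via scikit-learn:

      # Rank "related articles" for a query paper by TF-IDF cosine similarity.
      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.metrics.pairwise import cosine_similarity

      abstracts = [
          "Altmetrics measure online attention to scholarly articles.",
          "Citation counts remain the dominant metric for research assessment.",
          "Reader counts on reference managers relate to later citation counts.",
      ]

      tfidf = TfidfVectorizer(stop_words="english").fit_transform(abstracts)
      sims = cosine_similarity(tfidf)

      query = 0  # index of the paper we want recommendations for
      ranked = [i for i in sims[query].argsort()[::-1] if i != query]
      print(f"Most related to doc {query}: doc {ranked[0]} (score {sims[query, ranked[0]]:.2f})")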


  14. Improving data quality
    • Research Catalog v2
    – better duplicate detection
    – readership numbers stable (only increase)
    – canonical docs
    • API v2
    – exposing more information (see the sketch below)
    • annotations
    • other events (what do you want to see?)
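
    A hedged sketch of pulling readership data from the API mentioned above. The endpoint, parameters, and response fields (api.mendeley.com/catalog with view=stats and reader_count) are assumptions about the public REST API and should be checked against the current developer documentation; the token and DOI are placeholders:

      # Look up catalog documents by DOI and print their reader counts.
      import requests

      ACCESS_TOKEN = "YOUR_OAUTH_TOKEN"          # obtained via the Mendeley OAuth flow
      DOI = "10.1371/journal.pone.0000000"       # hypothetical DOI

      resp = requests.get(
          "https://api.mendeley.com/catalog",
          params={"doi": DOI, "view": "stats"},
          headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
      )
      resp.raise_for_status()
      for doc in resp.json():
          print(doc.get("title"), doc.get("reader_count"))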


  15. Stability and Security
    • We are serious
    – adapting to and promoting changes in practice
    • investing in building relationships with developers
    • platform, not a silo


  16. TEAM Project: academic knowledge management solutions
    • Algorithms to determine the content similarity of academic papers
    • Text disambiguation and entity recognition to differentiate between and relate similar in-text entities and authors of research papers (see the sketch below)
    • Semantic technologies and semantic web languages, with a focus on metadata integration/validation
    • Profiling and user analysis technologies, e.g. based on search logs and document interaction
    • Improving folksonomies and, through that, ontologies of text
    • Analysing tagging behaviour to improve tag recommendations and strategies
    • http://team-project.tugraz.at/blog/
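
    A minimal entity-recognition sketch to illustrate the disambiguation bullet above. It is not the TEAM project’s own pipeline; it uses spaCy’s pretrained NER model, the kind of building block used to find and relate in-text entities and author names (requires "pip install spacy" and the en_core_web_sm model):

      # Tag named entities in a sentence with spaCy's pretrained English model.
      import spacy

      nlp = spacy.load("en_core_web_sm")
      text = ("Gunn et al. report that readership on Mendeley relates to "
              "later citations in PLOS ONE articles.")

      for ent in nlp(text).ents:
          print(ent.text, ent.label_)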


  17. CODE Project
    Use case: mining research papers for facts to add to LOD repositories and light-weight ontologies
    • Crowd-sourcing-enabled semantic enrichment & integration techniques for integrating facts contained in unstructured information into the LOD cloud
    • Federated, provenance-enabled querying methods for fact discovery in LOD repositories (see the sketch below)
    • Web-based visual analysis interfaces to support human-based analysis, integration and organisation of facts
    • http://code-research.eu/
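
    To make “fact discovery in LOD repositories” concrete, here is a small illustrative query against a public Linked Open Data endpoint (DBpedia) using SPARQLWrapper. This is not the CODE project’s provenance-enabled machinery, just the basic shape of a fact lookup; the resource and property names are assumptions about DBpedia’s data:

      # Fetch the English abstract for the DBpedia resource "Altmetrics".
      from SPARQLWrapper import SPARQLWrapper, JSON

      sparql = SPARQLWrapper("https://dbpedia.org/sparql")
      sparql.setQuery("""
          PREFIX dbo: <http://dbpedia.org/ontology/>
          PREFIX dbr: <http://dbpedia.org/resource/>
          SELECT ?abstract WHERE {
            dbr:Altmetrics dbo:abstract ?abstract .
            FILTER (lang(?abstract) = "en")
          }
      """)
      sparql.setReturnFormat(JSON)

      for row in sparql.query().convert()["results"]["bindings"]:
          print(row["abstract"]["value"][:200], "...")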


  18. (image-only slide)

  19. (image-only slide)

  20. (image-only slide)

  21. Semantics vs. Syntax
    • Language expresses semantics via syntax
    • Syntax is all a computer sees in a research article
    • How do we get to semantics?
    • Topic Modeling!
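
    A minimal topic-modelling sketch (LDA via scikit-learn), not Mendeley’s production pipeline: it shows how latent topics can be inferred from raw text, i.e. one route from syntax toward semantics:

      # Fit a 2-topic LDA model on a few toy documents and print top terms per topic.
      from sklearn.feature_extraction.text import CountVectorizer
      from sklearn.decomposition import LatentDirichletAllocation

      docs = [
          "neural networks deep learning image classification",
          "protein folding gene expression cell biology",
          "reinforcement learning agents reward policy",
          "genome sequencing mutation cancer biology",
      ]

      vec = CountVectorizer()
      X = vec.fit_transform(docs)

      lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

      terms = vec.get_feature_names_out()
      for k, topic in enumerate(lda.components_):
          top = [terms[i] for i in topic.argsort()[::-1][:4]]
          print(f"Topic {k}: {', '.join(top)}")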


  22. Distribution of Topics
    (bar chart of topic distribution; y-axis 0%–35%)


  23. Subcategories of Comp. Sci.
    (bar chart: AI, HCI, Info Sci, Software Eng, Networks; y-axis 0%–20%)


  24. www.mendeley.com
    [email protected]
    @mrgunn
