How consistent are altmetrics providers? Study of 1000 PLOS ONE publications using the PLOS ALM, Mendeley and APIs
Zohreh Zahedi1, Martin Fenner2 & Rodrigo Costas3
1,3 Centre for Science & Technology Studies (CWTS), Leiden, The Netherlands
2 Public Library of Science (PLOS), San Francisco, USA

Metrics
Any metric we use should have good reliability (consistency) and validity.

Article-level metrics
[Figure: consistency between 7 metrics (citeulike, scopus, ploscounter, pmc, facebook, mendeley, twitter) for 20 DOIs from Altmetric, ImpactStory, PLOS ALM and Plum Analytics; axes: value, doi.]
Chamberlain, S. (2013). Consuming Article-Level Metrics: Observations and Lessons. Information Standards Quarterly, 25(2), 4.

Methodology
• 1,000 random DOIs out of all 31,408 PLOS ONE articles published in 2013
• Data collected February 11, 2014, 11 AM CET
• Sources: PLOS ALM API, Mendeley API and API
• Metrics compared: Mendeley, Twitter, Facebook
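The sampling step can be sketched as follows. This is a minimal illustration, not the script used in the study: the function name `sample_dois`, the fixed seed and the placeholder identifiers are all assumptions made for the example.

```python
import random

def sample_dois(all_dois, k=1000, seed=42):
    """Draw k distinct DOIs uniformly at random, without replacement."""
    rng = random.Random(seed)  # fixed seed so the same sample can be redrawn
    return rng.sample(list(all_dois), k)

# Placeholder strings standing in for the 31,408 PLOS ONE DOIs from 2013.
population = [f"placeholder-doi-{i}" for i in range(31408)]
sample = sample_dois(population)
```

Fixing the seed means that every provider can be queried later with exactly the same set of DOIs.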

Results: Coverage
[Figure: bar chart by metric (Mendeley, Twitter, Facebook); legend: PLOS ALM, Mendeley; axis 0 to 900; bar labels: 900, 210, 490, 460, 261, 325, 588.]
Articles with at least one event out of 1,000 random 2013 PLOS ONE DOIs.

Results: Total events
[Figure: bar chart by metric (Mendeley, Twitter, Facebook); legend: PLOS ALM, Mendeley; axis 0 to 12,000; bar labels: 5,330, 615, 4,129, 2,734, 10,789, 2,484, 2,204.]
Total number of events for 1,000 random 2013 PLOS ONE DOIs.

Twitter & Facebook collection at
• collects and may merge tweets linking to:
  – PubMed abstracts or PubMed Central full text
  – Institutional repositories
• We only collect public Facebook wall posts. Why?
  – No easy way to show / audit Likes and private posts
  – It’s difficult to collect Likes at scale (3M papers+)
• Another issue: article mentions on Facebook can be in many forms, e.g. link in image caption, shared link, link in text of wall post…
Slide provided by Euan Adie from

Issues with Mendeley data consistency
• last date of data collection
• re-clustering of crowdsourced data
• identifier used for API call (DOI, PMID or Mendeley UUID)
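Because the identifier used for the call affects whether a record is found, it helps to record which identifier actually matched. The sketch below assumes a hypothetical `lookup(id_type, value)` callable standing in for a provider API call; it is not the Mendeley API, and the order in which identifiers are tried is also an assumption.

```python
from typing import Callable, Optional

# Order in which identifiers are tried; an assumption for this sketch.
IDENTIFIER_ORDER = ("doi", "pmid", "uuid")

def find_reader_count(article: dict,
                      lookup: Callable[[str, str], Optional[int]]) -> dict:
    """Try each available identifier in turn and report which one matched.

    `lookup` is a hypothetical stand-in for a provider API call. It returns
    a reader count, or None when no record is found for that identifier.
    """
    for id_type in IDENTIFIER_ORDER:
        value = article.get(id_type)
        if value is None:
            continue  # the article has no identifier of this type
        count = lookup(id_type, value)
        if count is not None:
            return {"count": count, "matched_by": id_type}
    return {"count": None, "matched_by": None}

# Toy index: this made-up record can only be found by its PMID.
fake_index = {("pmid", "12345"): 7}
result = find_reader_count(
    {"doi": "10.1234/example", "pmid": "12345"},
    lambda id_type, value: fake_index.get((id_type, value)),
)
```

Keeping `matched_by` next to the count makes it possible to see later whether two providers disagreed only because they queried with different identifiers.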

Articles found in Mendeley by identifier
[Figure: bar chart, axis 0% to 100%; DOI: 88%, PMID: 100%.]
Data from 1,000 random 2013 PLOS ONE DOIs, collected June 23, 2014.

Facebook APIs
Public wall posts
• text for post
• username for post
• no likes, shares
• no private activity
link_stat
• comments, likes, shares, total
• private and public activity
• no content or username
• understands DOIs
PLOS ALM

Facebook can’t resolve all DOIs to a canonical URL
• yes: 87%
• no: 13%
Data from 9,969 random CrossRef DOIs from 2011 and 2012, collected June 22, 2014.
Facebook Debugger:
DOIs: (2011) and 10.6084/M9.FIGSHARE.821213 (2012)
Problems: cookies, circular redirects, permissions, canonical URL mismatch
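One of the listed problems, circular redirects, can be illustrated with a small sketch. It is an assumption-laden toy: the `redirects` dictionary stands in for real HTTP responses, the URLs are made up, and nothing here reflects how Facebook resolves links internally.

```python
def follow_redirects(start, redirects, max_hops=10):
    """Follow `redirects` (a dict mapping URL -> next URL) from `start`.

    Returns (status, url), where status is "resolved", "circular" or
    "too_many_hops". The dict is a stand-in for real HTTP redirects.
    """
    seen = {start}
    url = start
    for _ in range(max_hops):
        nxt = redirects.get(url)
        if nxt is None:
            return ("resolved", url)   # no further redirect: final URL
        if nxt in seen:
            return ("circular", nxt)   # we have been here before: a loop
        seen.add(nxt)
        url = nxt
    return ("too_many_hops", url)

# A DOI that resolves cleanly, and one that bounces through a login page.
ok = follow_redirects("doi:a", {"doi:a": "https://example.org/a"})
loop = follow_redirects(
    "doi:b",
    {"doi:b": "https://example.org/login", "https://example.org/login": "doi:b"},
)
```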

Twitter APIs
• Search API
• Streaming API
  – Public streams
  – Site streams
• Third party commercial services such as Datasift

Tweet Counts PLOS ALM vs.
[Figure: scatter plot of tweet counts per article on logarithmic axes.]
Number of tweets for 1,000 random 2013 PLOS ONE articles reported by PLOS ALM and
Data collected February 11, 2014.
R² = 0.997
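An agreement figure of this kind can be computed as the squared Pearson correlation between the counts two providers report for the same articles. The sketch below is an assumption about the method: the slide does not say whether R² was computed on raw or log-transformed counts, and the counts used here are made up.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return (sxy * sxy) / (sxx * syy)

# Made-up tweet counts for five articles from two hypothetical providers.
provider_a = [0, 3, 10, 25, 120]
provider_b = [0, 3, 9, 27, 118]
agreement = r_squared(provider_a, provider_b)
```

A high R² only shows that the two series move together; it does not by itself show that the counts are equal, so comparing the raw differences per article is a useful complement.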

Conclusions
• We have a problem with data consistency that needs to be solved before we can use the data properly
• We need to solve this problem as a community, and there are clear actions we can take

Standards and best practices
Example: The Mendeley count is the number of Mendeley readers returned by the Mendeley API. The identifier used to query the Mendeley API should be provided, and should be a DOI where available. The date and time of data collection should be provided, and the data should not be older than a month.
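The example definition can be expressed as a small record type that carries its own provenance. This is a sketch under stated assumptions: the class and field names are invented for illustration, "a month" is read as 31 days, and the identifier is made up.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(days=31)  # "not older than a month", read here as 31 days

@dataclass(frozen=True)
class MendeleyCount:
    readers: int
    identifier_type: str    # "doi" where available, else "pmid" or "uuid"
    identifier: str
    collected_at: datetime  # timezone-aware timestamp of data collection

    def is_current(self, now: datetime) -> bool:
        """True if the count was collected within MAX_AGE before `now`."""
        return timedelta(0) <= now - self.collected_at <= MAX_AGE

record = MendeleyCount(
    readers=12,
    identifier_type="doi",
    identifier="10.1234/example",  # made-up identifier
    collected_at=datetime(2014, 2, 11, 10, 0, tzinfo=timezone.utc),  # 11 AM CET
)
```

For instance, `record.is_current(datetime(2014, 3, 1, tzinfo=timezone.utc))` holds, while the same check against a date in June 2014 fails and signals that the count should be collected again.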

Audits, ringversuche (round-robin tests) and open data
• by an independent organization
• regular (at least yearly)
• data, report and tools openly available
• specific recommendations
10,000 random CrossRef DOIs from 2011 and 2012 at 10.6084/M9.FIGSHARE.821209 (2011) and M9.FIGSHARE.821213 (2012) and at