Programmatic access for Altmetrics

Talk at #plosalm13 in San Francisco, CA (html version: http://bit.ly/roalm)

Scott Chamberlain

October 16, 2013

Transcript

  1. Programmatic access for Altmetrics
    Scott Chamberlain (@recology_)
    rOpenSci / Simon Fraser University

  2. Find this talk here: http://bit.ly/roalm
    Made with Slidify; the code here
    Press "o" to bring up all slides, "w" to change aspect, "g" to go to page

  3. Programmatic access to altmetrics
    Open altmetrics data

  4. Programmatic access

  5. Programmatic access to altmetrics
    Computers are simply better at repetitive tasks
    · Makes repetitive tasks take far less time
    · Facilitates tool creation by developers
    · Allows research questions to be addressed more quickly
    · Facilitates reproducibility

  6. What is needed for easy programmatic access?

  7. Modern API technology
    REST API
    The modern way to serve data to consumers
    Makes data consumption easy from any programming language
    · Base URI, e.g., http://foo.com
    · Media type, e.g., JSON, XML
    · HTTP verbs, like GET, POST, PUT, PATCH, HEAD, etc.

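To make the pieces on this slide concrete, here is a minimal sketch in base R of how a client might assemble a request URL from a base URI, a resource path, and query parameters. The helper `build_url()`, the `/api/articles` path, and the `format` parameter are hypothetical and only for illustration; no request is actually sent.

```r
# Hypothetical helper: compose a request URL from a base URI, a resource path,
# and named query parameters. Base R only; nothing is sent over the network.
build_url <- function(base_uri, path, query = list()) {
  # Join base URI and path with exactly one slash between them
  url <- paste0(sub("/+$", "", base_uri), "/", sub("^/+", "", path))
  if (length(query) > 0) {
    # Percent-encode each value (reserved characters such as "/" included)
    encoded <- vapply(
      query,
      function(v) utils::URLencode(as.character(v), reserved = TRUE),
      character(1)
    )
    pairs <- paste0(names(query), "=", encoded)
    url <- paste0(url, "?", paste(pairs, collapse = "&"))
  }
  url
}

# Assumed endpoint and parameters, for illustration only
example_url <- build_url(
  "http://foo.com",
  "/api/articles",
  list(doi = "10.1371/journal.pone.0029797", format = "json")
)
print(example_url)
```

The HTTP verb is chosen separately when the request is made; the URL only identifies the resource.
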
  8. Proper HTTP status codes
    · 1xx - informational
    · 2xx - success
    · 3xx - redirection
    · 4xx - client error
    · 5xx - server error

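Proper status codes matter because a script can branch on them without parsing the response body. A minimal base R sketch, assuming two hypothetical helpers (`status_class()` and `should_retry()`); the retry policy shown is one possible choice, not something the talk prescribes.

```r
# Map an HTTP status code to one of the five classes listed on the slide.
status_class <- function(code) {
  stopifnot(is.numeric(code), all(code >= 100 & code <= 599))
  classes <- c("informational", "success", "redirection",
               "client error", "server error")
  classes[code %/% 100]
}

# Assumed policy: retry only when the server reports an error (5xx).
should_retry <- function(code) status_class(code) == "server error"

print(status_class(c(200, 301, 404, 503)))
print(should_retry(c(200, 404, 503)))
```
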
  9. [no text on this slide]

  10. Good docs (for developers)

  11. Authentication
    OAuth makes sense for web workflows, but not so much for programmatic workflows. Having both options is nice.

  12. A spec for REST?
    RAML - http://raml.org/
    Programmatically create new clients
    Good place to include altmetrics standards...

        #%RAML 0.8
        ---
        title: World Music API
        baseUri: http://example.api.com/{version}
        version: v1
        /songs:
          get:
          post:
        ...

  13. Deploying APIs is probably hard

  14. Consuming altmetrics programmatically

  15. We need altmetrics research

  16. Programmatic access to altmetrics data key for reproducibility

  17. Having a look at the literature...
    · Do Altmetrics Work? Twitter and Ten Other... - via Altmetric.com
    · Tweeting biomedicine: an analysis of tweets... - via Altmetric.com
    · The Spread of Scientific Information... - via PLOS ALM
    · Can Tweets Predict Citations? ... - via Twitter Search API
    · Altmetrics in the Wild... - via PLOS ALM, various APIs, WebofSci citations
    · Social Media Release Increases Dissemination... - via manual collection
    · Identifying Audiences of E-Infrastructures... - via Google Analytics
    · How the Scientific Community Reacts to... - via Twitter Search API, Google Scholar citations

  18. Most popular programming language?

  19. Obviously

  20. [no text on this slide]

  21. Many libraries available, but more needed

    DATA SOURCE    LIBRARIES               ROPENSCI CONTRIBUTIONS IN R
    PLOS ALM       R                       alm ** Copernicus, etc.
    ImpactStory    R, JavaScript           rImpactStory
    Altmetric      R, Python, Ruby, iOS    rAltmetric

  22. Interacting with REST APIs in R

        library(httr)  # provides GET(), stop_for_status(), content()

        # Request article-level metrics for one DOI from the PLOS ALM API
        out <- GET("http://alm.plos.org/api/v3/articles?doi=10.1371/journal.pmed.1001361&key=<key>")
        stop_for_status(out)  # raise an R error on a 4xx or 5xx response
        content(out)          # parse the response body

        {
          doi: "10.1371/journal.pmed.1001361",
          title: "Personalized Prediction of Lifetime Benefits with Statin Therapy for Asymptomatic Individuals: A Modeling Study",
          url: "http://www.plosmedicine.org/article/info%3Adoi%2F10.1371%2Fjournal.pmed.1001361",
          mendeley: "437b07d9-bc40-4c57-b60e-1f60fefe2300",
          pmid: "23300388",
          pmcid: "3531501",
          publication_date: "2012-12-27T08:00:00Z",
          update_date: "2013-10-07T11:06:58Z",
          views: 9329,
          shares: 62,
          bookmarks: 5,
          citations: 1
        }

  23. Data via alm interface to PLOS ALM

        library(alm)

        # Fetch article-level metrics for a single DOI
        alm(doi = "10.1371/journal.pone.0029797")

        An object of class "almtot"
        Slot "meta":
        $doi
        [1] "10.1371/journal.pone.0029797"
        ...<more metadata>

        Slot "summary":
          views shares bookmarks citations
        1 29229    237        51         7

        Slot "data":
                .id pdf html shares groups comments likes citations total
        1 bloglines  NA   NA     NA     NA       NA    NA         0     0
        2 citeulike  NA   NA      1     NA       NA    NA        NA     1
        3  connotea  NA   NA     NA     NA       NA    NA         0     0
        4  crossref  NA   NA     NA     NA       NA    NA         7     7
        5    nature  NA   NA     NA     NA       NA    NA         4     4
        ...

  24. Combining metrics across aggregators

    DATA SOURCE        PLOS              IMPACTSTORY                       ALTMETRIC
    WebOfScience       webofscience      --                                --
    Dryad              --                dryad:total_downloads             --
    Figshare           figshare          figshare:views shares downloads   --
    GitHub             --                github:forks stars                --
    Google+            --                --                                cited by gplus count
    Mendeley readers   mendeley shares   mendeley readers                  mendeley readers
    Twitter            twitter           topsy:tweets                      cited by tweeters count

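One way to work with a mapping like the one on this slide is to hold it as a lookup table, so that code can ask which providers report a given data source and under what field name. This is a rough sketch in base R: the field names are copied as they appear in the slide's table (only three rows are included), and `providers_for()` is a hypothetical helper, not part of any existing package.

```r
# Field names as they appear in the slide's table; NA stands for "--"
# (the provider does not report that data source).
field_map <- data.frame(
  source      = c("Twitter", "Dryad", "Google+"),
  plos        = c("twitter", NA, NA),
  impactstory = c("topsy:tweets", "dryad:total_downloads", NA),
  altmetric   = c("cited by tweeters count", NA, "cited by gplus count"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the providers that report a data source,
# as a named vector of provider -> field name.
providers_for <- function(source, map = field_map) {
  row <- map[map$source == source, c("plos", "impactstory", "altmetric")]
  vals <- unlist(row)
  vals[!is.na(vals)]
}

print(providers_for("Twitter"))
print(providers_for("Google+"))
```
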
  25. Proposed R library: metaAlm
    Combine altmetrics data across providers (ImpactStory, Altmetric, etc.) and across data sources (Twitter, Facebook, etc.)

  26. Combining metrics
    Get data from three different providers
    Easily combine data with a single function, and highlight inconsistencies

        plos_data <- alm(<doi>)
        impactstory_data <- metrics(<doi>)
        altmetric_data <- altmetric_data(altmetrics(<doi>))
        alt_combine(plos_data, impactstory_data, altmetric_data)

        Warning: Inconsistency in facebookLikes, check metadata

               dataSource fromProvider values
        1         twitter     PLOS ALM    100
        2   facebookLikes  ImpactStory     50
        3   facebookLikes    Altmetric     40
        4 scopusCitations    Altmetric    150

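The combining step on this slide is a proposal rather than released code, so the following is only a sketch of how such a function might behave, written in base R. `combine_metrics()` is a hypothetical stand-in for the proposed `alt_combine()`, and the input values are the illustrative numbers from the slide's example output, not real data.

```r
# Illustrative values copied from the slide's example output; not real data.
plos_data <- data.frame(dataSource = "twitter", fromProvider = "PLOS ALM",
                        values = 100, stringsAsFactors = FALSE)
impactstory_data <- data.frame(dataSource = "facebookLikes",
                               fromProvider = "ImpactStory",
                               values = 50, stringsAsFactors = FALSE)
altmetric_data <- data.frame(dataSource = c("facebookLikes", "scopusCitations"),
                             fromProvider = "Altmetric",
                             values = c(40, 150), stringsAsFactors = FALSE)

# Hypothetical stand-in for the proposed alt_combine(): stack the per-provider
# tables and warn about any data source whose providers report different values.
combine_metrics <- function(...) {
  combined <- do.call(rbind, list(...))
  n_distinct <- tapply(combined$values, combined$dataSource,
                       function(v) length(unique(v)))
  inconsistent <- names(n_distinct)[n_distinct > 1]
  for (src in inconsistent) {
    warning("Inconsistency in ", src, ", check metadata", call. = FALSE)
  }
  # Record the flagged data sources so callers can act on them
  attr(combined, "inconsistent") <- inconsistent
  combined
}

combined <- combine_metrics(plos_data, impactstory_data, altmetric_data)
print(combined)
print(attr(combined, "inconsistent"))
```

Flagging by exact inequality is the simplest rule; a real implementation would probably need a tolerance, since providers collect counts at different times.
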
  27. Example in R
    Load libraries, get 200 DOIs, get ALM data, plot

        library(rplos); library(alm); library(plyr)
        dois <- searchplos(terms = '*:*', fields = "id", limit = 200)
        alm <- ldply(alm(doi = do.call(c, dois$id), total_details = TRUE))
        plot_density(alm, c("counter_pdf", "mendeley_shares", "pmc_pdf", "pmc_total"),
                     c("#83DFB4", "#EFA5A5", "#CFD470", "#B2C9E4"), plot_type = "h")

  28. [Figure: bar chart with Grouped/Stacked toggle showing views, shares, bookmarks, and citations for the DOIs ending in 0040117, 0039395, 0029797, and 0001543]

        library(rplos); library(alm); library(rCharts)
        dois <- c('10.1371/journal.pone.0001543', '10.1371/journal.pone.0040117',
                  '10.1371/journal.pone.0029797', '10.1371/journal.pone.0039395')
        dat <- signposts(doi = dois)
        plot_signposts(input = dat, type = "multiBarChart", height = 400)

  29. [no text on this slide]

  30. “I’d argue that #opendata today is exactly where open source was some 2 decades ago” -@BenBalter http://t.co/VJ6QiLybUU #oss
    Alex Howard (@digiphile), October 9, 2013

  31. Why is openness a good thing?
    Altmetrics needs checks on
    · Consistency (tweet counts from source A and source B should be equal)
    · Correlation (is metric A strongly correlated with metric B?)
    · Interpretation (open source the interpretation)
    · Gaming (security through obscurity doesn't work)

  32. Why is openness a good thing?
    Altmetrics needs checks on
    · Consistency (tweet counts from source A and source B should be equal)
    · Correlation (is metric A strongly correlated with metric B?)
    · Interpretation (open source the interpretation)
    · Gaming (security through obscurity doesn't work)
    Open data makes all this easier

  33. Additional value from openness
    · Knowledge from research findings
        - Doesn't require open data I suppose :(, but helps facilitate research
        - e.g., think how hard text-mining is - we don't want that in altmetrics
    · Open products
        - ReaderMeter
        - ScienceCard
    · For-profit products
    · Who knows? Making data open allows many experiments, some of which will stick

  34. An open use case: 10.3789/isqv25no2.2013.02

  35. Programmatic access
    Open altmetrics data

  36. Programmatic access open data

  37. Programmatic access to Open altmetrics data