Detecting Communities in Science Blogs

cpikas
December 10, 2008

Presentation given to 2008 IEEE E-Science Conference

Transcript

  1. Problem Area

    • eScience includes using electronic tools both for conducting science and for communicating about science
    • There is an abundance of tools to help scientists communicate both online and offline
    • Lots of scientists and members of the interested public maintain blogs (~2500?)
    • Ultimate Questions:
      – Why?
      – With whom are scientists communicating?
      – What are scientists communicating about?
      – What is the value to the scientists and to science?
  2. Specific Problem Addressed

    • What is the nature of the science blogosphere?
      – What is its shape?
      – Who are the central participants?
      – What is the connectivity?
      – Where are the potential information flows?
  3. Outline

    • Background
    • Methods
      – Data gathering
      – Analysis
    • Results
    • Discussion
  4. Background: Blogs

    • Defined by format
      – Individual posts, with permanent URLs
      – Comments
    • Links
      – In content
      – In blogroll
      – In comments and trackbacks
    • Community develops around single blogs and among blogs through commenting
  5. Annotated screenshot of http://dorigo.wordpress.com/

    • Links to Static Pages
    • Posts
    • Links and automatically generated content
  6. Access to posts by search and older posts using the calendar

    A list of most recent posts is automatically generated.
  7. A list of categories the blogger used to describe his posts

    Clicking will list all of the posts in that category. The blogroll is a list of blogs the author reads or endorses to some extent. Access to the older posts by month.
  8. Posts with Comments

    Comments may be signed with the commenter's URL. And a form to leave your own comment. Typically your e-mail will not appear on the site.
  9. Background: Social Network Analysis

    • Uses connections between actors to understand potential flows of information and influence
    • Uses graph theoretic methods to find
      – Central or prestigious actors
      – Cohesive subgroups including communities
  10. Methods: Sample Selection

    Operational Definition of Science Blog
    • Blogs maintained by scientists that deal with any aspect of being a scientist
    • Blogs about scientific topics by non-scientists

    Omitted
    • Primarily political speech
    • Ones maintained by corporations
    • Non-English language
  11. Methods: Data Gathering

    • Two Networks: Links and Commenters
    • Link Data (Blogroll)
      – Used seed list developed in previous study using directories and searches
      – Snowball sampled using links from blogrolls
      – Visited and copied links
    • Commenter Data
      – Selected most central blogs from blogroll data
      – Used Perl scripts to pull the commenter URLs from each of the last 10 posts
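The snowball sampling step on the data gathering slide can be sketched roughly as follows. This is an illustration only, not the scripts used in the study (which were written in Perl and are not shown in the deck). The function name `snowball_sample`, the `get_blogroll` callback, the `max_waves` cutoff, and the toy blogroll data are all assumptions made for the example.

```python
from collections import deque

def snowball_sample(seeds, get_blogroll, max_waves=2):
    """Follow blogroll links outward from a seed list.

    Blogs reached within max_waves link steps of a seed have their
    blogrolls read. Returns a set of directed arcs (blog, linked_blog).
    get_blogroll is a hypothetical callback returning a blog's blogroll.
    """
    arcs = set()
    visited = set(seeds)
    frontier = deque((blog, 0) for blog in seeds)
    while frontier:
        blog, wave = frontier.popleft()
        for target in get_blogroll(blog):
            arcs.add((blog, target))
            # Only queue newly seen blogs that are still inside the cutoff.
            if target not in visited and wave + 1 <= max_waves:
                visited.add(target)
                frontier.append((target, wave + 1))
    return arcs

# Toy blogroll data standing in for pages visited on the web (made up).
TOY_BLOGROLLS = {
    "a": ["b", "c"],
    "b": ["c", "d"],
    "c": [],
    "d": ["e"],
    "e": ["a"],
}

sample = snowball_sample(["a"], lambda blog: TOY_BLOGROLLS.get(blog, []), max_waves=1)
print(sorted(sample))
```

In a real crawl the callback would fetch and parse each blog's sidebar; keeping it as a parameter lets the sampling logic be tested without network access.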
  12. Methods: Analysis

    • Used social network analysis and graphing software
    • Examined graph and calculated basic descriptive statistics
    • Found centrality and prestige measures
      – Degree: the links in and out
      – Betweenness: the number of shortest paths that flow through that node
      – Closeness: short paths to other nodes
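The three measures named on the analysis slide can be illustrated with a small standard-library sketch. The deck does not say which software or which exact variants were used, so the choices here are assumptions: betweenness is computed with Brandes' algorithm on an unweighted directed graph without normalization, and closeness is the reciprocal of the mean distance to the nodes reachable from a given node. All function names are hypothetical.

```python
from collections import deque

def degrees(arcs):
    """Return (in_degree, out_degree) dicts for a collection of directed arcs."""
    indeg, outdeg = {}, {}
    for u, v in set(arcs):
        outdeg[u] = outdeg.get(u, 0) + 1
        indeg[v] = indeg.get(v, 0) + 1
    return indeg, outdeg

def adjacency(arcs):
    """Return (nodes, adj) where adj maps a node to the nodes it links to."""
    nodes, adj = set(), {}
    for u, v in set(arcs):
        nodes.update((u, v))
        adj.setdefault(u, []).append(v)
    return nodes, adj

def closeness(adj, source):
    """Reachable-node count divided by total distance (0.0 if nothing is reachable)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

def betweenness(nodes, adj):
    """Unnormalized shortest-path betweenness (Brandes' algorithm, directed)."""
    score = {n: 0.0 for n in nodes}
    for s in nodes:
        order, queue = [], deque([s])
        dist = {s: 0}
        sigma = {n: 0 for n in nodes}   # number of shortest paths from s
        sigma[s] = 1
        preds = {n: [] for n in nodes}  # predecessors on shortest paths
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj.get(u, ()):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = {n: 0.0 for n in nodes}
        for w in reversed(order):       # accumulate from the farthest nodes back
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score

# Toy chain a -> b -> c (made-up data): b lies on the only path from a to c.
toy_arcs = {("a", "b"), ("b", "c")}
toy_nodes, toy_adj = adjacency(toy_arcs)
toy_betweenness = betweenness(toy_nodes, toy_adj)
```

On the toy chain, `b` is the only node with nonzero betweenness, which matches the slide's description of shortest paths flowing through a node.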
  13. Methods: Located cohesive subgroups

    • Link methods
      – Components
      – LS Sets
    • Clustering methods
    • Community detection techniques
      – Newman-Girvan
      – Spin Glass Analysis
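Two of the techniques listed on the subgroups slide, components and Newman-Girvan, can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the analysis performed for the talk: links are treated as undirected, only a single split is performed (edges with the highest betweenness are removed until the number of components increases), and the function names and toy graph are invented for the example.

```python
from collections import deque

def components(nodes, adj):
    """Connected components of an undirected graph given as {node: set(neighbors)}."""
    seen, comps = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    comp.add(v)
                    queue.append(v)
        comps.append(comp)
    return comps

def edge_betweenness(nodes, adj):
    """Shortest-path betweenness per undirected edge (each pair counted in both directions)."""
    score = {}
    for s in nodes:
        dist, order, queue = {s: 0}, [], deque([s])
        sigma = {n: 0 for n in nodes}
        sigma[s] = 1
        preds = {n: [] for n in nodes}
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = {n: 0.0 for n in nodes}
        for w in reversed(order):
            for u in preds[w]:
                share = sigma[u] / sigma[w] * (1 + delta[w])
                edge = frozenset((u, w))
                score[edge] = score.get(edge, 0.0) + share
                delta[u] += share
    return score

def girvan_newman_split(edges):
    """Remove highest-betweenness edges until the graph gains a component."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    nodes = set(adj)
    start = len(components(nodes, adj))
    while len(components(nodes, adj)) == start:
        score = edge_betweenness(nodes, adj)
        if not score:          # no edges left to remove
            break
        u, v = max(score, key=score.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(nodes, adj)

# Toy graph (made up): two triangles joined by the single bridge c-d.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("c", "d"),
             ("d", "e"), ("e", "f"), ("d", "f")]
communities = girvan_newman_split(toy_edges)
print([sorted(c) for c in communities])
```

The bridge carries every shortest path between the two triangles, so it is removed first and the two triangles come out as separate groups. Repeating the split on each group would give the full hierarchical procedure.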
  14. Results: Link Analysis (Blogroll)

    • One large component
    • There were 1091 nodes, 6621 arcs
    • Diameter is 9
    • In-degree ranges from 1 to 292, with the median in-degree of 3, and mean 6
      – 10 of the top 20 blogs by in-degree are authored or co-authored by women
      – 4 of the top 5 blogs by closeness are authored or co-authored by women
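As a quick consistency check on the figures above: in a directed network every arc contributes exactly one unit of in-degree, so the mean in-degree equals arcs divided by nodes. The short sketch below assumes all 1091 nodes are included in the average; the variable names are illustrative.

```python
# Figures reported on the slide for the blogroll network.
nodes = 1091
arcs = 6621

# Each arc adds one to exactly one node's in-degree, so the mean is arcs / nodes.
mean_in_degree = arcs / nodes
print(round(mean_in_degree, 2))  # 6.07
```

The result agrees with the reported mean of 6. A mean well above the median of 3 is what a few very high in-degree blogs (up to 292) would produce.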
  15. Results: Commenter

    • 5 components, the largest with 911, others with 11 or fewer nodes
    • 938 nodes (starting with the 46), 1152 arcs
    • The largest component has a diameter of 5
  16. Discussion: Links (Blogroll)

    • Most of the blogs were connected in one dense component
      – A result of the diffusion of blogs?
    • There were a few very central blogs, and then many less central
      – Typical skewed distribution
    • The community of women scientists merits further study
  17. Discussion: Commenters

    • Analysis easily located a notorious commenter who leaves incendiary comments on physics and chemistry blogs
      – High out-degree, no links in
    • Traffic on the women scientist blogs is more uniform, with frequent comments that are widely distributed among the blogs
      – Indicates a different use
  18. Take Home Messages

    • The science blogosphere is densely connected with many opportunities for influence and information diffusion
    • Communities tend to form within disciplinary boundaries
    • An exception is the community of women scientist bloggers who are from many different disciplines
  19. Acknowledgements

    • Thanks to Dr. Jen Golbeck for supervising this work as part of an independent study
    • Thanks also to
      – Dr. Alan Neustadtl for SNA advice
      – Dr. Dagobert Soergel for research advice
  20. Christina K. Pikas

    Doctoral Student
    University of Maryland
    College of Information Studies
    [email protected]
    http://terpconnect.umd.edu/~cpikas/ScienceBlogging