
Zooniverse: Web-Scale Citizen Science

Arfon Smith
October 04, 2012

Web-scale citizen science projects such as Zooniverse (www.zooniverse.org) have provided a temporary answer to the flood of data confronting 21st-century researchers; however, that answer is only a short-term one. In this presentation I will outline a potential strategy for combining a large web community and significant compute resources to create a scalable, intelligent classification engine.

Transcript

  1. Zooniverse:
    Web-Scale Citizen Science
    Arfon Smith
    @arfon

  3. 1,000,000,000,000
    1 trillion hours

  4. Spectrum of cognitive surplus

  6. http://www.novacelestia.com

  12. [Chart: axis from 0 to 100; categories: Professor]

  13. [Chart: axis from 0 to 10,000; categories: Professor, Paper]

  14. [Chart: axis from 0 to 50,000; categories: Professor, Paper, PhD]

  15. [Chart: axis from 0 to 1,000,000; categories: Professor, Paper, PhD, SDSS]

  17. 680,000
    volunteers

  18. 250,000,000
    analyses

  24. (it works)

  39. [Chart: axis from 0 to 100,000,000; categories: Professor, Paper, PhD, SDSS, ?]

  41. Galaxy Zoo Supernovae

  42. Supernova (3)
    Image panels: Reference, New, Difference

  43. Possible Supernova (1)
    Image panels: Reference, New, Difference

  44. Likely Junk (-1)
    Image panels: Reference, New, Difference

  45. Smith et al. - arXiv:1011.2199
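
The scores on slides 42 to 44 (3, 1 and -1) lend themselves to a simple per-candidate aggregate. The following is a minimal sketch, assuming each volunteer classification yields one of those three scores and that candidates are ranked by their mean score. The function name `mean_scores`, the label strings, the candidate IDs and the ranking rule are illustrative assumptions, not details taken from the talk.

```python
from collections import defaultdict

# Scores as shown on the slides:
# Supernova (3), Possible Supernova (1), Likely Junk (-1).
# The label strings themselves are hypothetical.
SCORES = {"supernova": 3, "possible_supernova": 1, "likely_junk": -1}


def mean_scores(classifications):
    """Return {candidate_id: mean score} from (candidate_id, label) pairs."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for candidate_id, label in classifications:
        totals[candidate_id] += SCORES[label]
        counts[candidate_id] += 1
    return {cid: totals[cid] / counts[cid] for cid in totals}


# Hypothetical classifications from several volunteers.
example = [
    ("cand-1", "supernova"),
    ("cand-1", "possible_supernova"),
    ("cand-2", "likely_junk"),
    ("cand-2", "likely_junk"),
    ("cand-2", "possible_supernova"),
]

# Highest mean score first, so the most promising candidates lead the list.
ranked = sorted(mean_scores(example).items(), key=lambda kv: kv[1], reverse=True)
```

A plain mean treats every volunteer equally; the optimisations listed later in the deck suggest ways that assumption could be relaxed.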

  46. Optimisations
    (potential)

  47. 1. Accuracy/experience of the classifier
    2. Difficulty of the object being classified
    3. Uncertainty of the object’s classification
    4. Difficulty of the task
    5. Understand classifier motivation
    6. Understand current ‘state’ of the classifier

  48. Dynamic Bayesian Combination of
    Multiple Imperfect Classifiers
    Simpson et al. - arXiv:1206.1831
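
To make item 1 on slide 47 (classifier accuracy) concrete, here is a minimal sketch of an accuracy-weighted vote. It is far simpler than the dynamic Bayesian combination cited on this slide and is not that method: it assumes binary labels, a fixed and already known accuracy per classifier, independent classifiers, symmetric error rates and a uniform prior. The function name `weighted_vote` and the example volunteers are hypothetical.

```python
import math


def weighted_vote(votes, accuracy):
    """Combine binary votes, weighting each classifier by the log-odds of its accuracy.

    votes: {classifier_id: True/False}
    accuracy: {classifier_id: estimated probability in (0, 1) of being correct}
    Returns the probability that the true label is True under the
    assumptions stated above (uniform prior, independence, symmetric errors).
    """
    log_odds = 0.0
    for classifier_id, vote in votes.items():
        p = accuracy[classifier_id]
        weight = math.log(p / (1.0 - p))
        # A True vote pushes the log-odds up, a False vote pushes it down.
        log_odds += weight if vote else -weight
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical volunteers: one experienced, two close to chance.
accuracy = {"expert": 0.95, "novice_a": 0.55, "novice_b": 0.55}
votes = {"expert": True, "novice_a": False, "novice_b": False}
probability = weighted_vote(votes, accuracy)
```

In this example the single experienced volunteer outweighs two near-chance volunteers who disagree, which a simple majority vote would not allow.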

  51. Classification strategies

  54. Combining Human and Machine
    Intelligence in Large-scale
    Crowdsourcing
    Kamar, Hacker & Horvitz - http://is.gd/horvitz

  56. Motivations

  62. Image panels: Reference, New, Difference

  63. What are humans good at?

  65. Image labels: SDSS, HST; Star-forming pea, Narrow-line Seyfert pea

  69. Social Computing Systems

  70. Expensive
    Noisy
    Classifier agents

  72. Thanks
    [email protected]
    @arfon