Zooniverse: Web-Scale Citizen Science

Arfon Smith
October 04, 2012

Web-scale citizen science such as Zooniverse (www.zooniverse.org) has provided a solution to the flood of data that confronts researchers in the 21st century, but only a short-term one. In this presentation I will outline a potential strategy for combining a large web community and significant compute resources to create a scalable, intelligent classification engine.

Transcript

  1. Zooniverse: Web-Scale Citizen Science Arfon Smith @arfon

  2. None
  3. 1,000,000,000,000 1 trillion hours

  4. Spectrum of cognitive surplus

  5. None
  6. http://www.novacelestia.com

  7. None
  8. None
  9. None
  10. None
  11. None
  12. Bar chart (axis 0 to 100): Professor

  13. Bar chart (axis 0 to 10,000): Professor, Paper

  14. Bar chart (axis 0 to 50,000): Professor, Paper, PhD

  15. Bar chart (axis 0 to 1,000,000): Professor, Paper, PhD, SDSS

  16. None
  17. 680,000 volunteers

  18. 250,000,000 analyses

  19. None
  20. None
  21. None
  22. None
  23. None
  24. (it works)

  25. None
  26. None
  27. None
  28. None
  29. None
  30. None
  31. None
  32. None
  33. None
  34. None
  35. None
  36. None
  37. None
  38. None
  39. Bar chart (axis 0 to 100,000,000): Professor, Paper, PhD, SDSS, ?

  40. None
  41. Galaxy Zoo Supernovae

  42. Supernova (3) Reference New Difference

  43. Possible Supernova (1) Reference New Difference

  44. Likely Junk (-1) Reference New Difference

  45. Smith et al. - arXiv:1011.2199

  46. Optimisations (potential)

  47. 1. Accuracy/experience of the classifier
      2. Difficulty of the object being classified
      3. Uncertainty of the object’s classification
      4. Difficulty of the task
      5. Understand classifier motivation
      6. Understand current ‘state’ of the classifier
  48. Dynamic Bayesian Combination of Multiple Imperfect Classifiers, Simpson et al. - arXiv:1206.1831
  49. None
  50. None
  51. Classification strategies

  52. None
  53. None
  54. Combining Human and Machine Intelligence in Large-scale Crowdsourcing, Kamar, Hacker & Horvitz - http://is.gd/horvitz
  55. None
  56. Motivations

  57. None
  58. None
  59. None
  60. None
  61. None
  62. Reference New Difference

  63. What are humans good at?

  64. None
  65. SDSS HST Starforming pea Narrow-line Seyfert pea

  66. None
  67. None
  68. None
  69. Social Computing Systems

  70. Expensive Noisy Classifier agents

  71. None
  72. Thanks arfon@zooniverse.org @arfon
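
Slides 42 to 44 give each Galaxy Zoo Supernovae answer a score (3, 1, or -1), and slide 47 lists classifier accuracy as a potential optimisation. The sketch below shows one simple way such scores might be combined per candidate: a mean, optionally weighted per volunteer. The function name, label strings, candidate and volunteer IDs, and the weights are all hypothetical and chosen for illustration. This is not the method of the papers cited in the slides.

```python
from collections import defaultdict

# Scores as shown on slides 42-44; the label strings are made up for this sketch.
SCORES = {"supernova": 3, "possible_supernova": 1, "likely_junk": -1}


def mean_scores(classifications, weights=None):
    """Return the (optionally weighted) mean score for each candidate.

    classifications: iterable of (candidate_id, volunteer_id, label) tuples
    weights: optional dict mapping volunteer_id to a positive weight
             (a hypothetical stand-in for classifier accuracy/experience)
    """
    totals = defaultdict(float)
    weight_sums = defaultdict(float)
    for candidate_id, volunteer_id, label in classifications:
        # Volunteers without an explicit weight count once.
        w = 1.0 if weights is None else weights.get(volunteer_id, 1.0)
        totals[candidate_id] += w * SCORES[label]
        weight_sums[candidate_id] += w
    return {c: totals[c] / weight_sums[c] for c in totals}


# Illustrative data only; not real Zooniverse classifications.
example = [
    ("cand-1", "alice", "supernova"),
    ("cand-1", "bob", "possible_supernova"),
    ("cand-1", "carol", "likely_junk"),
    ("cand-2", "alice", "likely_junk"),
    ("cand-2", "bob", "likely_junk"),
]

unweighted = mean_scores(example)
weighted = mean_scores(example, weights={"alice": 2.0})
```

With equal weights every volunteer counts the same. Passing a `weights` dict lets more reliable volunteers pull the mean further, which is the simplest form of the first optimisation on slide 47.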