



Scott Sievert

August 15, 2017


  1. NEXT: Crowdsourcing, machine learning and cartoons
    Scott Sievert, LUCID trainee, UW–Madison ECE
  2. Problem: Data collection with crowdsourcing can be expensive.

  3. Crowdsourcing problems
    Cardinal Bandits (example: comic by P. C. Vey)
    Dueling Bandits: Select the street that looks safer
    Pool based triplets: Select face on the bottom most similar to the face on top
  4. One solution
    Existing crowdsourcing systems are passive.
    Goal: adapt to previously collected responses.
    Adapting to previous responses requires less data.
  5. Benefits: Adaptive sampling can have large benefits (figure labels: NEXT, Mechanical Turk)

  6. Example of adaptive algorithm (figure label: unlabeled data)

  7. nextml.org
    Kevin Jamieson, Rob Nowak, Lalit Jain, Daniel Ross
    Homepage: http://nextml.org
    Source: https://github.com/nextml/NEXT
    Documentation: https://github.com/nextml/NEXT/wiki
  8. NEXT users
    UW Psychology uses NEXT to find the best algorithms for adaptive data collection in cognitive science.
    The New Yorker uses NEXT to crowdsource the weekly cartoon caption contest.
    Air Force Research Lab uses NEXT for active image classification.
    Figure labels: ML Researchers, Experimentalists, Practitioners, Theory, Practice
  9. Example problem
    The New Yorker has to find the funniest caption from ~5,000 captions.
    Figure labels: Bob Mankoff; comic by P. C. Vey
  10. Interface
    http://www.newyorker.com/cartoons/vote
    http://nextml.org/captioncontest
    Comic by P. C. Vey

  11. Dashboard
    Histogram of responses; histogram of time responses received; experiment info
    Data from contests: https://github.com/nextml/caption-contest-data
  12. Software enhancements
    Data: 0 not funny, 1 somewhat funny, 2 funny
    Experimentalist benefits: 4x fewer ratings needed (plot legend: BR-lilUCB, Random)
  13. NEXT
    Feedback loop between crowdsourcing and adaptive sampling algorithms:
    Adaptive sampling algorithms give crowdsourcing fewer responses and more accurate models.
    Crowdsourcing gives adaptive sampling algorithms real-world data: participant fatigue, label quality, algorithm delays.
    Goal: enable this feedback loop.
    Enabling this feedback loop requires software that is easy to use by both parties.
  14. Software uses
    By default, NEXT has adaptive algorithms for the 3 default question types:
    Cardinal Bandits (example: comic by P. C. Vey)
    Dueling Bandits: Select the street that looks safer
    Pool based triplets: Select face on the bottom most similar to the face on top
    NEXT can also be used with a REST API.
  15. Requirements
    0. Web browser
    1. Amazon AWS account
    2. ZIP of targets (e.g., images)
    3. Experiment description (which has good documentation!)
    More detail in the documentation: https://github.com/nextml/NEXT/wiki
    Result: after the NEXT link is sent to a crowdsourcing service, results can be generated!
  16. Key messages
    1. Adaptive sampling reduces data collection cost.
    2. NEXT is a crowdsourcing data collection tool that can use adaptive sampling techniques.
    3. NEXT is easy* to use by experimentalists, algorithm developers and practitioners, and a mathematical background is not required.
    4. NEXT developers seek experimentalist engagement to aid research and to gain feedback to improve the software.
    * NEXT has been created by an academic research group in collaboration with psychologists.
  17. https://xkcd.com/1543/
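
The slides on adaptive sampling (4 to 6 and 12) contrast passive collection with collection that adapts to earlier responses. A minimal sketch of that contrast follows, under loud assumptions: it is a simulation, not NEXT code and not the BR-lilUCB algorithm named in the deck. A plain UCB rule is swapped in for illustration, and `draw_rating`, `collect`, the caption means and the budget are all hypothetical. Ratings use the 0/1/2 scale from slide 12.

```python
import math
import random


def draw_rating(mean, rng):
    """Draw a rating in {0, 1, 2} whose expected value is `mean` (0 <= mean <= 2)."""
    # Two coin flips, each succeeding with probability mean / 2.
    p = mean / 2.0
    return int(rng.random() < p) + int(rng.random() < p)


def collect(true_means, budget, adaptive, seed=0):
    """Collect `budget` simulated ratings; return (counts, sums) per caption."""
    rng = random.Random(seed)
    n = len(true_means)
    counts = [0] * n
    sums = [0.0] * n
    for t in range(budget):
        if t < n:
            arm = t  # rate every caption once before adapting
        elif adaptive:
            # UCB rule (illustrative stand-in): empirical mean plus a bonus
            # that shrinks as a caption accumulates ratings.
            arm = max(
                range(n),
                key=lambda i: sums[i] / counts[i]
                + 2.0 * math.sqrt(2.0 * math.log(t + 1) / counts[i]),
            )
        else:
            arm = rng.randrange(n)  # passive: pick a caption uniformly at random
        counts[arm] += 1
        sums[arm] += draw_rating(true_means[arm], rng)
    return counts, sums


if __name__ == "__main__":
    # Hypothetical contest: 19 unfunny captions and one funny caption (the last).
    means = [0.3] * 19 + [1.6]
    for adaptive in (False, True):
        counts, _ = collect(means, budget=2000, adaptive=adaptive, seed=1)
        label = "adaptive" if adaptive else "random"
        print(label, "ratings spent on the funniest caption:", counts[-1])
```

In this toy setting the adaptive collector concentrates its ratings on the promising caption, while the random collector spreads them evenly. That is the intuition behind needing fewer ratings, though the size of the saving here says nothing about the 4x figure reported in the deck.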