Human Powered Rails: Automated Crowdsourcing In Your Rails App

RailsConf 2018 talk on integrating Amazon Mechanical Turk into a Ruby on Rails application

Andy Glass

April 18, 2018

Transcript

  1. Human Powered Rails: Automated Crowdsourcing In Your Rails App. Andy Glass, RailsConf 2018, Unusual Rails Apps
  2. Andy Glass | RailsConf 2018 | Human Powered Rails: Why am I talking about mTurk?
  3. But first: Thanks, Rails
  4. So, I owe you unusual…
  5. This is not a talk about Mechanical Turk
  6. This is not a talk about Mechanical Turk (yet)
  7. This is not a talk about Crowdsourcing
  8. This is a talk about being an imposter
  9. Imposter Syndrome: A concept describing individuals who are marked by an inability to internalize their accomplishments and have a persistent fear of being exposed as a "fraud" (Pauline R. Clance and Suzanne A. Imes, 1978)
  10. We’re in good company… Maya Angelou, Tom Hanks, Neil Gaiman, Sheryl Sandberg, Sonia Sotomayor
  11. Is Imposter Syndrome bad?
  12. Is Imposter Syndrome good?
  13. Is Imposter Syndrome good? Yeah.
  14. "Why Imposter Syndrome is Good for you", Shane Ferro, Huffington Post, January 15, 2016
  15. "If you are interested in personal growth and development, by definition you are always going to be pushing yourself into something which is new. And when things are new, of course we don’t feel as comfortable in our skin as when we are doing something which is deeply familiar to us, and which we’ve been doing for five or 10 years." (Caroline Webb)
  16. Imposter Syndrome Experience
  17. This is not a talk about Mechanical Turk
  18. This is a talk about learning to fake shit
  19. Amazon Mechanical Turk is about faking shit
  20. Amazon Mechanical Turk is about faking shit (ok, now let’s talk about Turk)
  21. How did Mechanical Turk start?
    • Internal Amazon tool for categorizing products & identifying duplicates
    • Publicly launched as an AWS product in 2005
    • Marketplace for online micro-jobs
  22. The Turk
  23. I told you it was about faking shit…
  24. What can we use mTurk for?
    • Image/Video Processing
    • Data verification & Clean up
    • Information Gathering
    • Data processing
  25. Let’s make some (as a Turk worker)
  26. ~1,500 Groups of HITs (Questions)
    ~300,000 HITs (Assignments)
    ~15,000 Most # of HITs in a single HIT group
    $0.01 Lowest HIT reward
    $150 Highest HIT reward (2+ hours audio transcription)
  31. What should we use mTurk for today?
  32. My requirements for this talk
    ✓ Not simple for Artificial Intelligence*
    ✓ Social Media Content
    ✓ Pittsburgh-centric
    ✓ Make attendees excited for dinner/happy hour
    *Yes, AI can do a lot
  33. Does this sandwich have french fries in it?
  34. Manual mTurk process
    ✓ Assemble sample data (IG Posts)
    ✓ Create new mTurk project
    ✓ Load mTurk batch
    ✓ Review Results
  43. **plays Cardi B album for 23 minutes**
  46. 4 seconds Min time
    702 seconds Max time
    42 seconds Median time
    239 seconds Avg time
  47. 1 Min assignments/worker
    10 Max assignments/worker
    3 Avg assignments/worker
  48. 100% Consensus (both workers agreed)
    95.2% Accuracy (20/21 sandwiches)
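    The two percentages on slide 48 can be reproduced with a few lines of Ruby. This is only a sketch: the `Row` struct, the method names, and the sample rows are made up for illustration (the slides do not publish the raw answers), and "accuracy" is assumed to mean that the agreed answer matches a known correct answer.

```ruby
# Hypothetical data shape: two worker answers per sandwich plus the known truth.
Row = Struct.new(:answers, :truth)

# Share of items where every worker gave the same answer, as a percentage.
def consensus_rate(rows)
  agreed = rows.count { |row| row.answers.uniq.size == 1 }
  (100.0 * agreed / rows.size).round(1)
end

# Share of items where the workers agreed AND the answer matches the truth.
def accuracy_rate(rows)
  correct = rows.count { |row| row.answers.uniq == [row.truth] }
  (100.0 * correct / rows.size).round(1)
end

# Made-up rows: 20 sandwiches judged correctly, 1 where both workers were wrong.
rows = Array.new(20) { Row.new(%w[yes yes], "yes") }
rows << Row.new(%w[no no], "yes")

puts consensus_rate(rows)
puts accuracy_rate(rows)
```

    With 21 rows and one shared mistake, consensus stays at 100% while accuracy drops to 20/21, which rounds to 95.2%. Agreement between workers is therefore not the same thing as correctness.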
  49. Correctly identified: fries
  50. Correctly identified: no fries
  51. The edge-case
  54. There are tips + tricks to getting accurate results
  55. We aren’t here to use .csv
  56. Let’s Automate It
  57. Automation
    ✓ Create mTurk RoR service
    ✓ Create processes for loading tasks
    ✓ Create processes for approving results/re-inputting tasks
    ✓ Serve results via API
  58. Gems I used
    Turkee by Jim Jones (aantix): Built on top of RTurk; DB Models; Easily converts forms; Launch/Import tasks. https://github.com/aantix/turkee
    on top of:
    RTurk by Ryan Tate (ryantate): A simple wrapper and library for Amazon's Mechanical Turk. https://github.com/ryantate/rturk
  59. Batch: name, title, description, instructions, output_field_name, output_field_opts (hash)
    BatchItem: batch_id, post_id, reward_in_cents, status, result, confidence
    BatchItemResult: batch_item_id, result (hash)
  60. batch = Batch.create(
        name: "Sandwich Wizard",
        title: "Look at a picture of a delicious sandwich and determine…",
        description: "You will be provided with an image of a sandwich. Please analyze…",
        instructions: "<p>Please look at the image of a sandwich provided and…",
        output_field_name: "category",
        output_field_opts: {
          yes: "Yes- it has french fries inside",
          no: "No- it does not have french fries inside"
        }
      )
      post_ids = ['BJ8ohZEAjWQ', 'BOn4nJvFAf7', 'BQirlsNFAd5', …]
      post_ids.each do |post_id|
        batch.batch_items.create(post_id: post_id, reward_in_cents: 15)
      end
  61. rails g turkee
    1. Creates config file (fill with AWS credentials)
    2. Creates 2 DB models
  62. Batch: name, title, description, instructions, output_field_name, output_field_opts (hash)
    BatchItem: batch_id, post_id, reward_in_cents, status, result, confidence
    BatchItemResult: batch_item_id, result (hash)
    Turkee::TurkeeTask: hit_url (from mturk), task_type, hit_id, complete (boolean), number_completed_assignments (integer)
    Turkee::TurkeeImportedAssignment: turkee_task_id, assignment_id (from mturk), worker_id (from mturk), result_id
  63. batch.batch_items.each(&:launch)

  64. host = "https://sandwichspotter.herokuapp.com"
      hit_title = batch.title
      hit_description = batch.description
      model_name = :BatchItemResult
      num_assignments = 2
      reward = batch_item.reward_in_cents / 100.0 # dollars
      hit_lifetime = 5 # days
      duration = 1 # hours
      qualifications = { approval_rate: { gt: 95 } }
      params = {}
      opts = { form_url: "https://sandwichspotter.herokuapp.com/batches/#{batch.id}/batch_items/#{id}/batch_item_results/new" }
  69. Turkee::TurkeeTask.create_hit(host, hit_title, hit_description, model_name, num_assignments, reward, hit_lifetime, duration, qualifications, params, opts)
  70. batch_item_results/new.html.erb
      <h1><%= @batch.title %></h1>
      <h3><%= @batch.description %></h3>
      <h3><%= @batch.instructions.html_safe %></h3>
      <%= turkee_form_for @batch_item_result, params do |f| %>
        <% @batch.output_field_opts.each do |label, instruction| %>
          <p><%= instruction %></p>
          <input name="result[<%= @batch.output_field_name %>]" type="radio" value="<%= label %>" />
        <% end %>
        <%= f.submit "submit" %>
      <% end %>
  71. batch_item_results/new.html.erb (turkee_form_for)
    - Sets form url to https://www.mturk.com/mturk/externalSubmit
    - Adds hidden fields for capture (worker id, assignment ids, etc)
    Lets your form mimic a regular Rails form
  72. Importing result data
  73. Importing result data: Turkee::TurkeeTask.process_hits
    ✓ Creates Turkee::TurkeeImportedAssignment records
    ✓ Creates BatchItemResult records
  74. Importing result data
      #<Turkee::TurkeeImportedAssignment id: 111, assignment_id: "36U2A8VAG27ZEFQZ3NAC8G6ZFE9YKF", turkee_task_id: 6288, worker_id: "A2ARFO21LWE88K", result_id: 32>
      #<BatchItemResult id: 32, batch_item: 2, result: {'category'=>'No'}>
  77. We still need to process and confirm the data
  78. Batch Item (with results completed) → Adjudicator → Batch Item Complete, or Reprocess In Turk
  79. def process
        adjudicator = Adjudicator.new(self)
        if adjudicator.approved?
          update_attributes(status: "complete", result: adjudicator.result, confidence: adjudicator.confidence)
        else
          queue_for_reprocessing
        end
      end
  80. queue_for_reprocessing: Send back to Turk for two more opinions
  81. class Adjudicator
        def initialize(batch_item)
          # Flat array of results accumulated from the BatchItem's
          # BatchItemResult records, e.g. ['yes', 'no']
          @results_array = batch_item.bundle_results
          @best_response_with_frequency = best_response_with_frequency
        end

        # "high" only when every worker gave the winning response
        def confidence
          @best_response_with_frequency[1] == @results_array.count ? "high" : "low"
        end

        # Approved when the winning response has a strict majority
        def approved?
          @best_response_with_frequency[1].to_f / @results_array.count > 0.5
        end

        def result
          @best_response_with_frequency[0] if approved?
        end

        private

        # Most common response and its count, e.g. ['yes', 2]
        def best_response_with_frequency
          counts = @results_array.each_with_object(Hash.new(0)) { |resp, tally| tally[resp] += 1 }
          counts.max_by { |_resp, freq| freq }
        end
      end
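    To see how the adjudication step behaves, here is a minimal stand-alone sketch. It is not the talk's class: `MajorityAdjudicator` and `summarize` are hypothetical names, the input is a plain non-empty array of answers rather than a `BatchItem`, and "high" confidence is assumed to mean a unanimous vote.

```ruby
# Hypothetical stand-in for the slide's Adjudicator, working on a plain array.
class MajorityAdjudicator
  def initialize(answers)
    @answers = answers
    counts = answers.each_with_object(Hash.new(0)) { |answer, tally| tally[answer] += 1 }
    # Hash#max_by yields [key, value] pairs, so this picks the most common answer.
    @best, @freq = counts.max_by { |_answer, count| count }
  end

  # Strict majority: a 1-1 split is not approved.
  def approved?
    @freq.to_f / @answers.size > 0.5
  end

  # Assumption: "high" means every worker gave the same answer.
  def confidence
    @freq == @answers.size ? "high" : "low"
  end

  def result
    @best if approved?
  end
end

def summarize(answers)
  adjudicator = MajorityAdjudicator.new(answers)
  [adjudicator.approved?, adjudicator.result, adjudicator.confidence]
end

p summarize(%w[yes yes])        # both workers agree
p summarize(%w[yes no])         # split: would be queued for reprocessing
p summarize(%w[yes no yes yes]) # the same item after two more opinions
```

    The second call shows why a split item goes back to Turk: with two answers, one disagreement leaves no majority. After two more opinions the item can be approved, but only with "low" confidence because the vote was not unanimous.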
  83. task import_and_process: :environment do
        Turkee::TurkeeTask.process_hits
        batch_items_to_process = BatchItem.incomplete.all_results_in
        batch_items_to_process.each(&:process)
      end
  84. rake import_and_process
  85. rake import_and_process (repeat)
  86. Serve completed batch_items via API
  89. It’s lit AF
  91. Basic Use Case (sandwich spotter)
    Extensible (any use case)
  92. How could we extend the app?
    ✓ Multiple batch items in a TurkeeTask
    ✓ Complex reprocessing flow
    ✓ Multiple inputs/outputs for a batch item
    ✓ Different output types
  93. How should we extend the app?
    ✓ Multiple batch items in a TurkeeTask
    ✓ Complex reprocessing flow
    ✓ Multiple inputs/outputs for a batch item
    ✓ Different output types
  94. Multiple Inputs/Outputs
    Batch: name, title, description, instructions, output_field_name, output_field_opts (json)
    BatchItem: batch_id, post_id, reward_in_cents, status, result, confidence
  96. Batch: name, title, description, instructions, output_field_name, output_field_opts (hash)
    BatchItem: batch_id, post_id, reward_in_cents, status, result, confidence, input_data (json), results (json), confidence_levels (json)
    BatchInput: batch_id, key, format, settings (json)
    BatchOutput: batch_id, key, format, display_settings (json), adjudicator, adjudicator_settings (json)
  97. batch = Batch.create(
        name: "Sandwich Wizard",
      )
      batch.batch_inputs.create(
        key: "sandwich_ig",
        format: "instagram_post"
      )
  98. batch.batch_items.create(
        input_data: { sandwich_ig: "BJ8ohZEAjWQ" }
      )
  99. batch.batch_outputs.create(
        key: "has_fries",
        format: "categories",
        display_settings: {
          category_opts: {
            yes: "Yes- it has french fries inside",
            no: "No- it does not have french fries inside"
          }
        },
        adjudicator: 'mode',
        adjudicator_settings: { acceptance: 0.5 }
      )
  100. display_settings: How we display the output
    adjudicator, adjudicator_settings: How we confirm the output data
  101. batch.batch_outputs.create(
        key: "fry_count",
        format: "counter",
        display_settings: {
          instruct: "if yes, how many fries",
          category_opts: { min: 0, max: 50 }
        },
        adjudicator: 'number',
        adjudicator_settings: { variance: 0.20 }
      )
  102. display_settings: How we display the output
    adjudicator, adjudicator_settings: How we confirm the output data
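    The slides name the 'mode' and 'number' adjudicators and their settings but do not show how they are implemented. The sketch below is one possible reading, not the talk's code. It assumes `acceptance` is the share of votes the most common answer must exceed, and that `variance` means every numeric answer must fall within that fraction of the median. The method names are hypothetical.

```ruby
# Assumed rule for 'mode': approve the most common answer when its share of
# the votes is strictly greater than settings[:acceptance].
def adjudicate_mode(answers, settings)
  counts = answers.each_with_object(Hash.new(0)) { |answer, tally| tally[answer] += 1 }
  best, freq = counts.max_by { |_answer, count| count }
  approved = freq.to_f / answers.size > settings.fetch(:acceptance)
  [approved, approved ? best : nil]
end

# Assumed rule for 'number': approve the median when every answer lies within
# settings[:variance] (as a fraction of the median) of that median.
def adjudicate_number(answers, settings)
  sorted = answers.sort
  mid = sorted.size / 2
  median = sorted.size.odd? ? sorted[mid].to_f : (sorted[mid - 1] + sorted[mid]) / 2.0
  tolerance = median * settings.fetch(:variance)
  approved = answers.all? { |answer| (answer - median).abs <= tolerance }
  [approved, approved ? median : nil]
end

# Dispatch on the adjudicator name stored alongside the output definition.
def adjudicate(name, answers, settings)
  case name
  when "mode"   then adjudicate_mode(answers, settings)
  when "number" then adjudicate_number(answers, settings)
  else raise ArgumentError, "unknown adjudicator: #{name}"
  end
end

p adjudicate("mode", %w[yes yes no], { acceptance: 0.5 })
p adjudicate("mode", %w[yes no], { acceptance: 0.5 })
p adjudicate("number", [10, 11], { variance: 0.20 })
p adjudicate("number", [10, 30], { variance: 0.20 })
```

    Keeping the rule name and its settings as data on each output means a new output type only needs a new branch in `adjudicate`; the reprocessing flow around it can stay the same.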
  103. Possible input formats
    ✓ Business listing information
    ✓ Image
    ✓ Social posts (tweets, grams)
    ✓ Twilio embed (phone calls)
    ✓ Video (with JS helpers for splicing, etc)
  104. Possible output formats
    ✓ Text (phone, email, website, etc)
    ✓ Radio buttons
    ✓ Multi-select categories
    ✓ Number
    ✓ Video data
  105. Adjudicator Types
    ✓ Single text output
    ✓ Multiple text output
    ✓ Email
    ✓ URL
    ✓ Number
  106. Tips for Turk Accuracy
    ✓ Explicit instructions
    ✓ UX
    ✓ Simple and straightforward tasks
    ✓ “Gold” Data
    ✓ Approved worker pools/worker qualifications
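    One common way to use "gold" data is to mix a few items with known answers into a batch and score each worker on just those items. The slides do not describe an implementation, so the sketch below is hypothetical throughout: the `GOLD` hash, the submission hashes, and the IDs are made up for illustration.

```ruby
# Hypothetical gold set: item id => known correct answer.
GOLD = { "gold-1" => "yes", "gold-2" => "no" }.freeze

# submissions: array of hashes with :worker_id, :post_id and :answer keys.
# Returns worker_id => share of gold items that worker answered correctly.
# Workers who saw no gold items are left out of the result.
def gold_accuracy_by_worker(submissions, gold)
  graded = submissions.select { |s| gold.key?(s[:post_id]) }
  graded.group_by { |s| s[:worker_id] }.transform_values do |rows|
    correct = rows.count { |s| s[:answer] == gold[s[:post_id]] }
    correct.to_f / rows.size
  end
end

submissions = [
  { worker_id: "W1", post_id: "gold-1", answer: "yes" },
  { worker_id: "W1", post_id: "gold-2", answer: "no" },
  { worker_id: "W1", post_id: "item-9", answer: "yes" }, # not gold, ignored
  { worker_id: "W2", post_id: "gold-1", answer: "no" },
  { worker_id: "W2", post_id: "gold-2", answer: "no" }
]

p gold_accuracy_by_worker(submissions, GOLD)
```

    Scores like these could feed an approved worker pool, for example by only accepting results from workers above some accuracy threshold. Where to set that threshold is a judgment call the talk does not specify.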
  107. Task Speed: Rate = e * r * n, where e = ease of task, r = reward, n = number of tasks
  108. Task Speed: Rate = e * r * n (Andy’s Theorem™)
  109. Ethics of mTurk
    "Amazon Mechanical Turk: Gold Mine or Coal Mine?", University of Colorado, 2011
    "Amazon's Mechanical Turk workers protest: 'I am a human being, not an algorithm'", The Guardian, December 3, 2014
  110. Culture
    http://turkernation.com/
    https://turkopticon.ucsd.edu/
    /r/mturk
    “Turking isn't beer money for me, it's rent money and food money”
    “12 hours a day on here to make an average of $100 a day”
    “A higher-end worker on MTurk can expect to make $6-$12 per hour”
  111. Cambridge Analytica + mTurk
    "How Amazon Helped Cambridge Analytica Harvest Americans’ Facebook Data", Fast Company, March 27, 2018
  114. So, why did you listen to me talk about mTurk?
  115. Thanks for listening to me talk about mTurk.