

CodeFest
February 01, 2018

Paul Verbeek-Mast (Booking.com), Bad Evidence in Testing Product Hypothesis, CodeFest 2017

https://2017.codefest.ru/lecture/1181

Criminal investigators have something called ‘bad evidence’, a form of confirmation bias. Once they have a theory about a case, they sometimes tend to avoid evidence that goes against that theory. They do this unconsciously, but also consciously.

This is a big problem in testing product hypotheses as well. We tend to ignore data that goes against our theory. And if there is “bad data” we can’t get around, we run the test a bit longer until it disappears.

Why do we do this, and how do we deal with it? What other common pitfalls are there? And is hypothesis testing really worth the time?



Transcript

  1. All A/B tests and data shown in this presentation are not based on real experiments. They are made up just for this presentation.
  2. How much do you want to create “Bad Evidence”?
  3. You don’t want to do something if it is going to go against your theory of the case.
  4. Rather than trying to get to the truth, what you’re trying to do is build your case, and make it the strongest case possible.
  5. What does verification bias cause you to do? Ignore it and push it to the side.
  6. Making the search box hotpink will result in more searches (Base vs. Variant)
  7. Making the search box hotpink will result in more searches: 6252 searches (+19.45%), 242 bookings (-4.7%)
  8. Why: objective and based on data. Bad examples:
    • Based on a gut feeling, I believe (…)
    • Because I like it better, I believe (…)
    • Because I saw it on another website, I believe (…)
  9. Why: objective and based on data. Good examples:
    • Because of research described in article (…), we believe (…)
    • After doing user research, we believe (…)
    • Based on a previous experiment doing (…), we believe (…)
  10. What: an accurate, short description of your change. Bad examples:
    • we make it pink
    • we move it to a different place
    • we change the title
  11. What: an accurate, short description of your change. Good examples:
    • we make the search box on the homepage pink
    • we open pictures on the search page in a lightbox when they are clicked
  12. Who: a realistic, accurate description of your target group. Bad examples:
    • everyone
    • some people
    • users booking a hotel in Novosibirsk, named Paul, from Amsterdam, with a big beard
  13. Who: a realistic, accurate description of your target group. Good examples:
    • users visiting the home page
    • users searching for a property in Novosibirsk
    • users who are logged in
  14. Outcome: measurable, expected changes. Bad examples:
    • users feeling better
    • the site looking prettier
    • an increase in loyalty
  15. Outcome: measurable, expected metrics. Good examples:
    • an increase in earnings
    • a decrease in returned products
    • an increase in sign-ups
  16. Because of user research we believe that changing the background of the search box to pink for (who) will result in (outcome)
  17. Because of user research we believe that changing the background of the search box on the homepage to pink for users that visit the homepage will result in (outcome)
  18-20. Because of user research we believe that changing the background of the search box on the homepage to pink for users that visit the homepage will result in an increase in bookings
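
The Why / What / Who / Outcome template built up on the slides above can be sketched as a small data structure. This is only an illustration: the class name `Hypothesis`, its field names, and the `render` method are hypothetical and do not come from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """Hypothetical container for the four parts named on the slides."""

    why: str      # objective reason, based on data or research
    what: str     # accurate, short description of the change
    who: str      # realistic, accurate description of the target group
    outcome: str  # measurable, expected change

    def render(self) -> str:
        # Fill the sentence template used on the slides.
        return (
            f"Because of {self.why}, we believe that {self.what} "
            f"for {self.who} will result in {self.outcome}."
        )


hypothesis = Hypothesis(
    why="user research",
    what="changing the background of the search box on the homepage to pink",
    who="users who visit the homepage",
    outcome="an increase in bookings",
)
print(hypothesis.render())
```

Writing the four parts down separately makes it harder to skip one of them, and the outcome recorded before the test starts is the metric the test should later be judged on.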
  21-23. Because of studies done by x and y that show the positive effect of green, we believe that changing our booking buttons to green for users who visit our product page will result in more bookings (Base vs. Variant)
  24. How long should you run your A/B test?
    • Number of visitors
    • How big a change you want to measure
    • How confident you want to be that your test is correct
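
The three inputs listed above can be combined into a rough duration estimate. The sketch below uses the standard normal-approximation formula for comparing two conversion rates; the talk does not say which formula or tool was used, so the formula choice, the function names, the 80% power default, and all example numbers are assumptions.

```python
import math
from statistics import NormalDist


def sample_size_per_group(base_rate, relative_lift, alpha=0.05, power=0.8):
    """Visitors needed per group to detect a relative lift in a conversion rate.

    Normal approximation for a two-sided test of two proportions.
    alpha and power are assumed defaults, not values from the talk.
    """
    if relative_lift == 0:
        raise ValueError("relative_lift must be non-zero")
    p1 = base_rate
    p2 = base_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # how confident you want to be
    z_power = NormalDist().inv_cdf(power)          # chance of detecting a real effect
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_power) ** 2 * variance / (p2 - p1) ** 2)


def days_to_run(n_per_group, daily_visitors, groups=2):
    """Days needed if daily_visitors are split evenly over the groups."""
    return math.ceil(groups * n_per_group / daily_visitors)


# Hypothetical numbers: 4% base conversion, 5% relative lift, 20,000 visitors a day.
n = sample_size_per_group(base_rate=0.04, relative_lift=0.05)
print(n, "visitors per group,", days_to_run(n, daily_visitors=20_000), "days")
```

The smaller the change you want to measure, the more visitors you need, so the duration should be fixed before the test starts rather than adjusted once results come in.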
  25-30. clicks on button, hover over button, bookings, visits on page, scrolled to button, bookings from IE8, bookings from Malaysia, users going to search results, logins, sign ups, clicks on logo, time on page, returning visitors, price of booking, number of rooms booked, language changes, calls to customer service, buys with credit card
  31-33. clicks on button, hover over button, bookings, visits on page, scrolled to button, bookings from IE8, bookings from Malaysia, users going to search results, logins, sign ups, clicks on logo, time on page, returning visitors, price of booking, number of rooms booked, language changes, calls to customer service, buys with credit card; +0.1%, -0.2%, +2.3%, +0.3%, +4.7%, -3.1%, +0.0%, +3.5%, -1.1%, -2.1%, +0.3%, +2.1%, -1.8%, -0.3%, +0.0%, +0.5%, +4.3%, -0.2%
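
With 18 metrics on one slide, some will look clearly positive or negative by chance alone. A rough way to see this, assuming the metrics are independent (real metrics such as bookings and sign-ups usually are not) and each is checked at a 5% significance level; the function name is hypothetical:

```python
def chance_of_false_positive(num_metrics, alpha=0.05):
    """Probability that at least one of num_metrics shows a 'significant'
    change when the variant actually changes nothing.

    Assumes independent metrics, which is a simplification.
    """
    return 1 - (1 - alpha) ** num_metrics


for count in (1, 5, 18):
    print(count, "metrics:", round(chance_of_false_positive(count), 2))
```

Under these assumptions the chance of at least one spurious "win" among 18 metrics is about 60%, which is why the metric named in the hypothesis should be the one that decides the test.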
  34. Metrics that are not in the hypothesis: “price is going up, so it must be doing well” vs. “price is going down, so it must be a false negative”
  35. Newly implemented metrics: “this new metric is positive, it’s working great!” vs. “this new metric is negative, it must have a bug”
  36. Sample size: “it’s positive after 5 days, let’s put it in production” vs. “it’s negative after 5 days, let’s run it for another few days”
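
Stopping as soon as the result looks good, and running longer when it does not, inflates the number of false wins. The simulation below is a sketch of that effect on A/A tests, where both groups get the same page, so every "significant" result is a false positive. The pooled two-proportion z-test, the function names, and all parameters (10 days, 200 visitors per group per day, 10% conversion) are assumptions for illustration only.

```python
import math
import random
from statistics import NormalDist


def z_test_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (conv_b / n_b - conv_a / n_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulate_aa_tests(runs=400, days=10, daily_visitors=200,
                      rate=0.1, alpha=0.05, seed=1):
    """Return (false positive rate when peeking daily, rate at fixed end)."""
    rng = random.Random(seed)
    peeking_hits = 0
    fixed_hits = 0
    for _ in range(runs):
        conv_a = conv_b = visitors = 0
        significant_on_some_day = False
        p_value = 1.0
        for _ in range(days):
            # Both groups convert at the same rate: there is no real effect.
            conv_a += sum(rng.random() < rate for _ in range(daily_visitors))
            conv_b += sum(rng.random() < rate for _ in range(daily_visitors))
            visitors += daily_visitors
            p_value = z_test_p_value(conv_a, visitors, conv_b, visitors)
            if p_value < alpha:
                significant_on_some_day = True
        peeking_hits += significant_on_some_day  # would have stopped early
        fixed_hits += p_value < alpha            # judged only on the last day
    return peeking_hits / runs, fixed_hits / runs


peeking_rate, fixed_rate = simulate_aa_tests()
print("false positives when peeking daily:", peeking_rate)
print("false positives at the planned end:", fixed_rate)
```

Judging only at the planned end keeps the false positive rate near the chosen significance level, while checking every day and stopping at the first "win" pushes it well above that level.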
  37. How long should you run your A/B test?
    • Number of visitors
    • How big a change you want to measure
    • How confident you want to be that your test is correct