
The Data Lorax: Planting The Seeds of Fairness in Data Products

OmaymaS
November 07, 2018


Invited talk at #DatafestTbilisi2018.
https://datafest.ge/

Transcript

  1. THE DATA LORAX: PLANTING THE SEEDS OF FAIRNESS IN DATA PRODUCTS. OMAYMA SAID, DATA SCIENTIST
  2. "Data is the new oil, in the way that oil is a ubiquitous commodity that requires incredible resource allocation to extract value from, deep expertise to manage – and even when all that goes well – can have universally consequential negative externalities."* Drew Conway, Founder & CEO
  3. SHIRLEY CARDS. Mixed-color photos by Walt Jabsco, 1960s & 1970s. A PRODUCT THAT FAILED DUE TO SOMETHING INDIVIDUALS CAN'T CHANGE ABOUT THEMSELVES!
  4. No Classification without Representation: Assessing Geodiversity Issues in Open Data Sets for the Developing World* by Shreya Shankar, Yoni Halpern, Eric Breck, James Atwood, Jimbo Wilson, D. Sculley (Google Brain Team). [Chart: geographic distribution of images in the Open Images and ImageNet datasets; the US dominates both]
  5. WEDDING PHOTOS. Photos of bridegrooms from different countries, aligned by the log-likelihood that the classifier trained on Open Images assigns to the bridegroom class (Source). BETTER AND MORE CONSISTENT CLASSIFICATION
  6. The WEIRDest people in the world? Joseph Henrich, Steven J. Heine, Ara Norenzayan (University of British Columbia)*
  7. INCLUSIVE IMAGES COMPETITION. Wedding photographs (donated by Googlers), labeled by a classifier trained on the Open Images dataset. Source: Introducing the Inclusive Images Competition
  8. "Amazon’s system TAUGHT ITSELF that male candidates were preferable. It penalized resumes that included the word 'women’s,' as in 'women’s chess club captain.' And it downgraded graduates of two all-women’s colleges, according to people familiar with the matter. They did not specify the names of the schools."
  9. "Amazon’s system TAUGHT ITSELF that male candidates were preferable. It penalized resumes that included the word 'women’s,' as in 'women’s chess club captain.' And it downgraded graduates of two all-women’s colleges, according to people familiar with the matter. They did not specify the names of the schools." LEARNED FROM HUMANS
  10. "I worry all the time about building things and not having the foresight, because I'm just as flawed and imperfect as everybody else, to know the consequences of what I am doing, and hurting people who can't bear the cost nearly as well as I can." Josh Wills, Software Engineer (former Director of Data Engineering), in "I Build The Black Box: Grappling with Product and Policy"
  11. REMEMBER THAT IT IS PEOPLE WHO COLLECT/LABEL DATA, BUILD DATA PRODUCTS, DEFINE METRICS. COLLECT/LABEL DATA: BIAS IN REPRESENTATION, DISTRIBUTION, LABELS, AND MORE… (a minimal check of these is sketched in the first code example after the transcript)
  12. REMEMBER THAT IT IS PEOPLE WHO COLLECT/LABEL DATA, BUILD DATA PRODUCTS, DEFINE METRICS. BUILD DATA PRODUCTS: TRAIN/TEST SPLIT, FEATURES/PROXIES, COMPLEX MODELS AND INTERPRETABILITY, AND MORE… (see the second code sketch after the transcript)
  13. REMEMBER THAT IT IS PEOPLE WHO COLLECT/LABEL DATA, BUILD DATA PRODUCTS, DEFINE METRICS. DEFINE METRICS: WHAT IS THE IMPACT OF DIFFERENT ERROR TYPES ON DIFFERENT GROUPS? WHAT DO YOU OPTIMIZE FOR? (see the third code sketch after the transcript)
  14. "What we're still missing is an understanding of how to put ethics into practice in data as well as the overall product development process." DJ Patil; Hilary Mason, GM of Machine Learning; Mike Loukides, Vice President, Content Strategy
  15. "UNLESS SOMEONE LIKE YOU CARES A WHOLE AWFUL LOT, NOTHING IS GOING TO GET BETTER. IT’S NOT!"
  16. THE DATA LORAX: PLANTING THE SEEDS OF FAIRNESS IN DATA PRODUCTS. OMAYMA SAID, DATA SCIENTIST
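
Code sketches referenced on slides 11-13. All three are minimal, hypothetical Python; the column and variable names are illustrative, not from the talk.

First, the collection/labeling stage (slide 11): a sketch, assuming a pandas DataFrame with hypothetical "group" and "label" columns, that surfaces representation and label-distribution skew before any model is trained.

```python
import pandas as pd

# Toy data; in practice this would be the actual training set.
# "group" and "label" are hypothetical column names.
df = pd.DataFrame({
    "group": ["A", "A", "A", "A", "B", "B"],
    "label": [1, 0, 1, 1, 0, 0],
})

# Representation: how much of the data does each group contribute?
representation = df["group"].value_counts(normalize=True)

# Label distribution: what base rate does the labeling give each group?
base_rates = df.groupby("group")["label"].mean()

print(representation)
print(base_rates)
```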
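
Second, the build stage (slide 12): a sketch of two quick checks, assuming hypothetical "group", "zip_code", and "label" columns and scikit-learn's train_test_split: does the split preserve group composition, and does a seemingly neutral feature act as a proxy for the sensitive attribute?

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy data with hypothetical columns: "group" (sensitive attribute),
# "zip_code" (candidate proxy feature), and "label" (target).
df = pd.DataFrame({
    "group":    ["A", "A", "A", "B", "B", "B", "A", "B"],
    "zip_code": [101, 101, 102, 201, 201, 202, 102, 202],
    "label":    [1, 0, 1, 0, 0, 1, 1, 0],
})

# 1) Keep the split from silently under-sampling a small group:
#    stratify on the sensitive attribute.
train, test = train_test_split(
    df, test_size=0.25, stratify=df["group"], random_state=0
)
print(train["group"].value_counts(normalize=True))
print(test["group"].value_counts(normalize=True))

# 2) Proxy check: how strongly is a "neutral" feature associated with
#    the sensitive attribute? A strong association means the model can
#    reconstruct the attribute even if the column itself is dropped.
proxy_strength = pd.crosstab(df["zip_code"], df["group"], normalize="index")
print(proxy_strength)
```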
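
Third, the metrics stage (slide 13): overall accuracy hides which groups absorb which mistakes. A sketch, on hypothetical arrays, that breaks false positive and false negative rates down by group using scikit-learn's confusion_matrix.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import confusion_matrix

# Toy predictions; array names are hypothetical.
y_true = np.array([1, 0, 1, 0, 1, 0, 1, 0])
y_pred = np.array([1, 0, 0, 0, 1, 1, 1, 1])
groups = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])

rows = []
for g in np.unique(groups):
    mask = groups == g
    # Per-group confusion matrix: tn, fp, fn, tp.
    tn, fp, fn, tp = confusion_matrix(
        y_true[mask], y_pred[mask], labels=[0, 1]
    ).ravel()
    rows.append({
        "group": g,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else float("nan"),
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else float("nan"),
    })

print(pd.DataFrame(rows))
```

If the two rates differ sharply between groups, the question of what to optimize for, and which error type to tolerate, becomes an explicit product decision rather than an accident of the loss function.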