Now is better than Never: What the Zen of Python can teach us about Data Ethics

We Pythonistas welcome newcomers with the wisdom of Tim Peters's "import this". Okay, well, maybe. The Zen of Python gives us as a community general aphorisms on how to write Python and how to be a good Pythonista, offering loose guidelines that promote discussion. What lessons, then, can the Zen of Python teach us about Data Ethics?

Data ethics is a nebulous concept, yet a necessity in the era of algorithms and the data economy. Together we'll review some stories from the headlines about the data economy where there were ethical concerns and apply the Zen of Python. From the impact of social media likes on political campaigns to censorship on social media in the #MeToo movement, we'll use big challenges to highlight obvious and not so obvious lessons.

Ultimately, the Zen of Python teaches us that 'Now is better than never', and as data practitioners we must ask: what principles will we develop and champion to respond to ethical dilemmas?

Lorena Mesa

October 07, 2018

Transcript

  1. Now Is Better Than Never: What the Zen of Python can teach us about Data Ethics

    Lorena Mesa @loooorenanicole SLIDES @ http://bit.ly/2De7N8u
  2. HELLO! I’m Lorena Mesa.

    I am here because, well, I think this is a topic that is very important for us to consider as both technology users and technologists. You can find me at @loooorenanicole on most platforms (yes, that is FOUR letter Os).
  3. 8

  4. Some notes on how to approach today’s chat

    1. Advice and musings offered here do not presume to be the answer to the data ethics challenge.
    2. We need to observe what’s happening in the world around us. And we’ll be discussing some heavy topics. Please check in with yourself and step outside if you need to. I fully understand.
    3. Ultimately, how can we move the needle in the right direction?
  5. The Zen of Python, by Tim Peters

    Beautiful is better than ugly.
    Explicit is better than implicit.
    Simple is better than complex.
    Complex is better than complicated.
    Flat is better than nested.
    Sparse is better than dense.
    Readability counts.
    Special cases aren't special enough to break the rules.
    Although practicality beats purity.
    Errors should never pass silently.
    Unless explicitly silenced.
    In the face of ambiguity, refuse the temptation to guess.
    There should be one-- and preferably only one --obvious way to do it.
    Although that way may not be obvious at first unless you're Dutch.
    Now is better than never.
    Although never is often better than *right* now.
    If the implementation is hard to explain, it's a bad idea.
    If the implementation is easy to explain, it may be a good idea.
    Namespaces are one honking great idea -- let's do more of those!
  6. What is data ethics?

    - The language of right or wrong?
    - Our rights and responsibilities?
    - Something else?
  7. What is data ethics?

    - The ethics of data (how data is generated, recorded and shared)
    - The ethics of algorithms (how artificial intelligence, machine learning and robots interpret data)
    - The ethics of practices (devising responsible innovation and professional codes to guide this emerging science)

    Source: What is Data Ethics?, Philosophical Transactions, Luciano Floridi and Mariarosaria Taddeo
  8. “Big Data processes codify the past. They do not invent the future. Doing that requires moral imagination, and that’s something only humans can provide. We have to explicitly embed better values into our algorithms, creating Big Data models that follow our ethical lead. Sometimes that will mean putting fairness ahead of profit.”

    Cathy O'Neil, Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy
  9. “[Algorithms have] the power to enable and assign meaningfulness, managing how information is perceived by users, the ‘distribution of the sensible.’”

    Langlois, Ganaele. "Participatory Culture and the New Governance of Communication: The Paradox of Participatory Media." Television & New Media 14.2 (2013): 91-105.
  10. “Were it not for the Internet, Barack Obama would not be president. Were it not for the Internet, Barack Obama would not have been the nominee,” said Arianna Huffington, editor in chief of The Huffington Post.

    https://bits.blogs.nytimes.com/2008/11/07/how-obamas-internet-campaign-changed-politics/

    How?
    ▪ Use of social media (e.g. YouTube)
    ▪ GOTV drives informed by data science
    ▪ Customized “_______ for Obama” interest groups
  11. SPECIAL CASES AREN’T SPECIAL ENOUGH TO BREAK THE RULES.

    Case Study #1: Censorship and Cyberbullying in Social Media
  12. "We have been in touch with Ms. McGowan's team," Twitter said in a tweet on Thursday. "We want to explain that her account was temporarily locked because one of her Tweets included a private phone number, which violates our Terms of Service."

    Source: CNN
  13. “The privileged, we’ll see time and again, are processed more by people, the masses by machines.”

    Cathy O'Neil, Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy
  14. “[Cyberbullying is] . . . the use of information and communication technologies to support deliberate, repeated, and hostile behaviour by an individual or group, that is intended to harm others”

    Belsey, B. Cyberbullying.ca. http://www.cyberbullying.ca
  15. How has cyberbullying been deterred in the past?

    Historically done via:
    - Content moderation via the product owner
    - Content moderation via user feedback (e.g. user reports)

    Typically these approaches require having moderators manually review comments. This is gravely inefficient and doesn’t scale well due to the need for human input.
  16. No one human or computer can sift through all social media and online communication.

    When we can no longer find enough humans to intervene, what do we risk?
  17. Predictive Policing in Chicago

    “In early 2017, Chicago Mayor Rahm Emanuel announced a new initiative in the city’s ongoing battle with violent crime. The most common solutions to this sort of problem involve hiring more police officers or working more closely with community members. But Emanuel declared that CPD would expand its use of software, enabling what is called ‘predictive policing,’ particularly in neighborhoods on the city’s south side.”

    https://theconversation.com/why-big-data-analysis-of-police-activity-is-inherently-biased-72640
  18. Predictive Policing in Chicago (continued)

    - Identify people who are expected to become victims or criminals
    - “Officers may even be assigned to visit the people to warn against committing a violent crime”
    - In Chicago, Illinois, an algorithm rates every person arrested with a numerical threat score from 1 to 500-plus.
  19. Other Predictive Policing Tools: APIs

    Amazon responded to Rekognition’s match rate by saying it recommends that everyone set the product to 95% confidence when matching (even though the default is set to 80%).

    More at: https://www.theguardian.com/technology/2018/jul/26/amazon-facial-rekognition-congress-mugshots-aclu
  20. “About 126,000 rumors were spread by ∼3 million people. False news reached more people than the truth; the top 1% of false news cascades diffused to between 1000 and 100,000 people, whereas the truth rarely diffused to more than 1000 people. Falsehood also diffused faster than the truth. The degree of novelty and the emotional reactions of recipients may be responsible for the differences observed.”

    Source: The spread of true and false news online, Science, March 2018, by Soroush Vosoughi, Deb Roy, Sinan Aral
  21. Prototype on ProPublica’s Dollars for Docs (2013-2015) dataset: data on payments made from pharmaceutical companies to doctors.

    Read white paper at: datanutrition.media.mit.edu
  22. Contains two visualizations for understanding and analyzing machine learning datasets: Facets Overview and Facets Dive.

    https://pair-code.github.io/facets/
  23. Practices: How can we improve machine learning?

    1. Explore and understand your data
    2. Explore your errors
    3. Make your results interpretable (e.g. build things that people can understand!)
    4. If you do not know something, find someone who does.

    Deborah Hanus, Founder, Sparrow, PyCon Colombia 2018
  24. Evolving Algorithms: Using Fairness Criteria in lieu of Profit Maximization in Threshold Classifiers
  25. The Ethics of Practice: How you can make your practices more intentional and deliberate.
  26. Evolving Practices: Institute Best Practices Within Reason for your Organization

    Mozilla’s Data Collection Policy

    1. How does the company use Big Data, and to what extent is it integrated into strategic planning?
    2. Does the organisation send a privacy notice when personal data are collected?
    3. Does my organisation assess the risks linked to the specific type of data my organisation uses?
    4. Does my organisation have safeguards in place to mitigate these risks?
    5. Do we make sure that the tools to manage these risks are effective and measure outcomes?
    6. Do we conduct appropriate due diligence when sharing or acquiring data from third parties?

    Source: 6 Ethical Questions about Big Data https://www.fm-magazine.com/news/2016/jun/ethical-questions-about-big-data.html
  27. Evolving Practices: What might a Hippocratic Oath for a Data Scientist / technologist / coder / etc. look like?
  28. Now, it’s your turn.

    What is the impact you want to make as a technologist?
  29. THANK YOU!

    If you want to continue the conversation, let’s talk! Also, some of my images come from #WOCinTech! @loooorenanicole [email protected]
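
The "Other Predictive Policing Tools: APIs" slide above notes that Amazon recommends a 95% confidence setting for Rekognition matches while the default is 80%. As a minimal sketch of why that setting matters, the Python below filters a list of candidate matches at both thresholds. The candidate labels, the similarity scores, and the `matches_at` helper are all hypothetical, invented for illustration; nothing here calls Rekognition or reproduces any real result.

```python
# Hypothetical candidate matches as (label, similarity score in percent).
# These values are made up for illustration; they are not real output.
CANDIDATES = [
    ("candidate_a", 99.1),
    ("candidate_b", 96.4),
    ("candidate_c", 88.0),
    ("candidate_d", 81.5),
    ("candidate_e", 72.3),
]


def matches_at(threshold, candidates=CANDIDATES):
    """Return the labels whose similarity score meets or exceeds the threshold."""
    return [label for label, score in candidates if score >= threshold]


# 80% is the default cited on the slide; 95% is the recommended setting.
default_matches = matches_at(80.0)
recommended_matches = matches_at(95.0)

print(len(default_matches), "reported at 80%:", default_matches)
print(len(recommended_matches), "reported at 95%:", recommended_matches)
```

With these made-up scores, the default threshold reports four candidates and the recommended one reports two, so the choice of threshold alone decides how many people are flagged for follow-up.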