#22 - Transparency and privacy protection in the age of big data

Description of the evening of April 25, 2017 - Toulouse Data Science
-----------------------------------------------------------------------------------------
Ethics and algorithms: where is the limit?

Do you remember Microsoft's Tay bot? Tay is a chatbot with an artificial intelligence that it can grow by conversing with internet users through tweets. In March 2016, its creators had to shut it down after a few hours because it was spreading racist, Holocaust-denying and sexist remarks. Tay had fallen under the influence of ill-intentioned human interlocutors who mostly taught it verbal abuse! Who is to blame? The internet users who provoked these excesses, the bot itself, or Tay's developers, who had not anticipated the risk?

Have you heard of ProPublica, which documented the case of an algorithm used to predict the probability that an individual will commit a crime in the future? In this case, after being fed extremely biased data, the algorithm indicated that Black people were more prone to crime than white people.

In short, as you will have gathered, developers carry a heavy responsibility on their shoulders.

During the evening, François Royer presented mechanisms for building transparent and auditable machine learning applications (see the slides above).

Ulrich Aïvodji: a PhD student in computer science, Ulrich works at LAAS-CNRS in the ROC team (Recherche Opérationnelle, Optimisation Combinatoire et Contraintes: operations research, combinatorial optimization and constraints) and the TSF team (Tolérance aux fautes et Sûreté de Fonctionnement: fault tolerance and dependability). His research focuses on privacy protection in location-based services. You can find his slides here: https://goo.gl/ROqIif

Toulouse Data Science

April 26, 2017

Transcript

  1. For a transparent, auditable and interpretable approach. Machine Learning and Artificial Intelligence. Toulouse Data Science Meetup, 25.04.17. www.pwc.com
  2. PwC’s Digital Services. Thanks to the organizers. Regular meetups in Toulouse with data scientists to discuss the processing of large volumes of data, its use in business and in everyday life, the competitive advantages it can bring and, of course, the technologies for analyzing massive data. Confidential information for the sole benefit and use of PwC’s client. www.meetup.com/fr-FR/Tlse-Data-Science www.levillagebyca.com
  3. Hello. François Royer, Director, Data & Analytics. 06 43 41 20 85, francois.royer@fr.pwc.com, @francoisroyer
  4. PwC, global leader in Audit and Technology Consulting. 157 countries; 776 offices; 195,000 people; turnover in June 2016: $34bn; 7,000 Data & Analytics experts, incl. 200 in France; 5000 people
  5. “Help us build the digital future you want to live in.”
  6. Key points. Society and citizens/consumers are demanding more transparency from AI and its applications. Deep learning and ensemble methods: sexy but hard to interpret? Where should we look for methods suited to a risk-controlled approach? (Hint: not in online advertising.) We (data scientists, developers, experts, decision-makers…) are responsible! PwC Consulting | Data & Analytics
  7. 01 The context: our data-driven society. Data about service-based business models. Collect first, ask questions later. What can possibly go wrong?
  8. Perfect service. What your customers want.
  9. So it’s not about the data, but the insights and value-added services you provide. Behavioral data is your competitive advantage!
  10. Internet startup says: AND ALSO EVERYONE ELSE.
  11. 02 Serious case studies. Decisions with impact. Simple models are used everywhere. Towards more transparency.
  12. Beware of scoring algorithms! Are you advertising or advising? SCORING CAN BECOME DISCRIMINATORY QUICKLY. OR WRONG.
  14. “Score Cards” of the US Sentencing Commission. http://www.ussc.gov/guidelines/2015-guidelines-manual/2015-chapter-4
  15. LAPD recruitment model, or how do you hire 1000 cops? “This simplicity gets at the important issue: A decent transparent model that is actually used will outperform a sophisticated system that predicts better but sits on a shelf. If the researchers had created a model that predicted well but was more complicated, the LAPD likely would have ignored it, thus defeating the whole purpose” -- Greg Ridgeway, Director of National Institute of Justice. www.rand.org/pubs/research_briefs/RB9447/index1.html https://www.rand.org/pubs/monographs/MG881.html
  16. SKYNET and the drone war. arstechnica.co.uk/security/2016/02/the-nsas-skynet-program-may-be-killing-thousands-of-innocent-people/
  17. The rows in your dataset are real people. MEET MR ZAIDAN, AL JAZEERA BUREAU CHIEF IN ISLAMABAD.
  18. DARPA is funding XAI (Explainable Artificial Intelligence). The Explainable AI (2016) program aims to create a suite of machine learning techniques that: • Produce more explainable models, while maintaining a high level of learning performance (prediction accuracy); • Enable human users to understand, appropriately trust, and effectively manage the emerging generation of artificially intelligent partners. http://www.darpa.mil/program/explainable-artificial-intelligence/
  19. Towards more transparency in algorithm-based decisions. OpenFisca, GDPR, APB source codes…
  20. 03 Some tools and techniques to facilitate interpretation and auditing. Explaining with simple proxies. Graphical models. Probabilistic programming.
  21. LIME - Local Interpretable Model-Agnostic Explanations. https://github.com/marcotcr/lime "Why Should I Trust You?": Explaining the Predictions of Any Classifier. Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin. Despite widespread adoption, machine learning models remain mostly black boxes. Understanding the reasons behind predictions is, however, quite important in assessing trust, which is fundamental if one plans to take action based on a prediction, or when choosing whether to deploy a new model. Such understanding also provides insights into the model, which can be used to transform an untrustworthy model or prediction into a trustworthy one. In this work, we propose LIME, a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner, by learning an interpretable model locally around the prediction.
  22. Remember factor graphs and Bayesian networks? Add just the right amount of structure from the domain expert (you). Let the inference procedure find out the rest (unknown variables and parameters). Plug any required probability distribution. www.microsoft.com/en-us/research/project/trueskill-ranking-system/
  23. Probabilistic programming - or - Can we have high performance bottom-up models? PyMC, Edward, Anglican, Figaro, Church, Dimple.
  24. Thank you for your attention. © 2017 PwC. All rights reserved. Not for further distribution without the permission of PwC. “PwC” refers to the network of member firms of PricewaterhouseCoopers International Limited (PwCIL), or, as the context requires, individual member firms of the PwC network. Each member firm is a separate legal entity and does not act as agent of PwCIL or any other member firm. PwCIL does not provide any services to clients. PwCIL is not responsible or liable for the acts or omissions of any of its member firms nor can it control the exercise of their professional judgment or bind them in any way. No member firm is responsible or liable for the acts or omissions of any other member firm nor can it control the exercise of another member firm’s professional judgment or bind another member firm or PwCIL in any way. François Royer, Director, Data & Analytics, 06 43 41 20 85, francois.royer@fr.pwc.com, @francoisroyer
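
Slide 21 describes LIME as explaining a single prediction by learning an interpretable model locally around it. The sketch below illustrates only that core idea in plain Python and is not the `lime` package's API: it perturbs one input, weights the perturbed samples by proximity, and fits a weighted linear model whose slopes serve as the local explanation. The `black_box` function, the kernel width, the noise scale and all other names are assumptions made for illustration.

```python
import math
import random


def black_box(x):
    """Hypothetical opaque model: nonlinear in x[0], linear in x[1]."""
    return x[0] ** 2 + 3.0 * x[1]


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def explain_locally(predict, x0, n_samples=2000, noise=0.1,
                    kernel_width=0.25, seed=0):
    """Fit a proximity-weighted linear surrogate of `predict` around x0.

    Returns (intercept, slopes); slopes[i] is the local effect of feature i.
    """
    rng = random.Random(seed)
    d = len(x0)
    # Accumulate the weighted normal equations (X^T W X) beta = X^T W y.
    xtwx = [[0.0] * (d + 1) for _ in range(d + 1)]
    xtwy = [0.0] * (d + 1)
    for _ in range(n_samples):
        z = [xi + rng.gauss(0.0, noise) for xi in x0]   # perturbed sample
        dist2 = sum((zi - xi) ** 2 for zi, xi in zip(z, x0))
        w = math.exp(-dist2 / kernel_width ** 2)        # proximity weight
        row = [1.0] + [zi - xi for zi, xi in zip(z, x0)]
        y = predict(z)
        for i in range(d + 1):
            xtwy[i] += w * row[i] * y
            for j in range(d + 1):
                xtwx[i][j] += w * row[i] * row[j]
    beta = solve_linear_system(xtwx, xtwy)
    return beta[0], beta[1:]


intercept, slopes = explain_locally(black_box, [1.0, 0.0])
print("local slopes:", [round(s, 2) for s in slopes])
```

Around the point (1, 0) the slopes should land close to the gradient of the toy model, which is what makes the surrogate readable: each slope says how much the score moves when one feature moves, near this particular input only.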
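
Slide 22 suggests letting the domain expert supply the structure of a Bayesian network and leaving the rest to the inference procedure. As a minimal sketch of that division of labour, the example below hard-codes a three-variable network (the textbook rain / sprinkler / wet grass example, with illustrative probabilities that do not come from the talk) and answers a query by brute-force enumeration. Real tools use far more efficient inference; every name here is hypothetical.

```python
from itertools import product

# Structure and tables supplied by the "expert" (illustrative numbers only).
P_RAIN = 0.2
P_SPRINKLER_GIVEN_RAIN = {True: 0.01, False: 0.4}
# Keys are (sprinkler, rain).
P_WET_GIVEN = {
    (True, True): 0.99,
    (True, False): 0.9,
    (False, True): 0.8,
    (False, False): 0.0,
}


def bernoulli(p, value):
    """Probability that a binary variable with P(True) = p takes `value`."""
    return p if value else 1.0 - p


def joint(rain, sprinkler, wet):
    """Joint probability factorised along the network's edges."""
    return (
        bernoulli(P_RAIN, rain)
        * bernoulli(P_SPRINKLER_GIVEN_RAIN[rain], sprinkler)
        * bernoulli(P_WET_GIVEN[(sprinkler, rain)], wet)
    )


def posterior(query, evidence):
    """P(query = True | evidence), summing the joint over all worlds."""
    names = ("rain", "sprinkler", "wet")
    numerator = denominator = 0.0
    for values in product([True, False], repeat=len(names)):
        world = dict(zip(names, values))
        if any(world[name] != value for name, value in evidence.items()):
            continue  # world contradicts the observed evidence
        p = joint(**world)
        denominator += p
        if world[query]:
            numerator += p
    return numerator / denominator


print("P(rain | wet grass) =", round(posterior("rain", {"wet": True}), 4))
```

Because every factor is a small, named table, an auditor can trace any answer back to the assumptions that produced it.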
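
Slide 23 lists probabilistic programming systems such as PyMC and Edward. Broadly, these let the user write down a generative model and delegate inference to a generic engine. To avoid relying on any one library's version-specific API, the sketch below mimics that split by hand in plain Python: a hypothetical `log_joint` function states the model (a coin with unknown bias under a uniform prior, with made-up data of 7 heads in 10 tosses), and a generic random-walk Metropolis sampler, which knows nothing about coins, draws from the posterior.

```python
import math
import random

HEADS, TOSSES = 7, 10  # illustrative data, not from the talk


def log_joint(theta):
    """Model: theta ~ Uniform(0, 1); HEADS ~ Binomial(TOSSES, theta)."""
    if not 0.0 < theta < 1.0:
        return -math.inf  # outside the prior's support
    return HEADS * math.log(theta) + (TOSSES - HEADS) * math.log(1.0 - theta)


def metropolis(log_density, start, n_steps=20000, step=0.2,
               burn_in=2000, seed=0):
    """Generic random-walk Metropolis sampler for a 1-D log density."""
    rng = random.Random(seed)
    current, current_lp = start, log_density(start)
    samples = []
    for i in range(n_steps):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(current)).
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        if i >= burn_in:
            samples.append(current)
    return samples


samples = metropolis(log_joint, start=0.5)
posterior_mean = sum(samples) / len(samples)
print("posterior mean of theta:", round(posterior_mean, 3))
```

For this model the posterior is known in closed form, Beta(8, 4) with mean 8/12, which gives a simple check on the sampler. Changing the model only means editing `log_joint`; the sampler stays untouched.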