#22 - Transparency and privacy protection in the age of big data

For a transparent, auditable and interpretable approach. Machine Learning and Artificial Intelligence. Toulouse Data Science Meetup, 25.04.17. www.pwc.com

PwC’s Digital Services. Thanks to the organizers. Regular meetups in Toulouse with data scientists to discuss the processing of large volumes of data, its use in business and in everyday life, the competitive advantages it can bring and, of course, the technologies for analyzing massive data. Confidential information for the sole benefit and use of PwC’s client. www.meetup.com/fr-FR/Tlse-Data-Science www.levillagebyca.com

Hello. François Royer, Director, Data & Analytics. 06 43 41 20 85, [email protected], @francoisroyer

PwC, global leader in Audit and Technology Consulting: 157 countries; 776 offices; 195 000 people; turnover in June 2016: $34 bn; 7 000 Data & Analytics experts, incl. 200 in France; 5000 people.

“Help us build the digital future you want to live in.”

Key points
• Society and citizens/consumers are demanding more transparency from AI and its applications.
• Deep Learning and ensemble methods: sexy but hard to interpret?
• Where should we look for methods suited to a risk-controlled approach? (Hint: not in online advertising.)
• We (data scientists, developers, experts, decision-makers…) are responsible!
PwC Consulting | Data & Analytics

01 The context: our data-driven society. Data about service-based business models. Collect first, ask questions later. What can possibly go wrong?

Perfect service: what your customers want.

So it’s not about the data, but the insights and value-added services you provide. Behavioral data is your competitive advantage!

Internet startup says: AND ALSO EVERYONE ELSE.

02 Serious case studies. Decisions with impact. Simple models are used everywhere. Towards more transparency.

Beware of scoring algorithms! Are you advertising or advising? SCORING CAN BECOME DISCRIMINATORY QUICKLY. OR WRONG.

“Score Cards” of the US Sentencing Commission. http://www.ussc.gov/guidelines/2015-guidelines-manual/2015-chapter-4

LAPD recruitment model, or how do you hire 1000 cops?
“This simplicity gets at the important issue: A decent transparent model that is actually used will outperform a sophisticated system that predicts better but sits on a shelf. If the researchers had created a model that predicted well but was more complicated, the LAPD likely would have ignored it, thus defeating the whole purpose” -- Greg Ridgeway, Director of National Institute of Justice
www.rand.org/pubs/research_briefs/RB9447/index1.html https://www.rand.org/pubs/monographs/MG881.html

SKYNET and the drone war. arstechnica.co.uk/security/2016/02/the-nsas-skynet-program-may-be-killing-thousands-of-innocent-people/

The rows in your dataset are real people. MEET MR ZAIDAN, AL JAZEERA BUREAU CHIEF IN ISLAMABAD

DARPA is funding XAI (Explainable Artificial Intelligence). The Explainable AI (2016) program aims to create a suite of machine learning techniques that:
• Produce more explainable models, while maintaining a high level of learning performance (prediction accuracy);
• Enable human users to understand, appropriately trust, and effectively manage the emerging generation of artificially intelligent partners.
http://www.darpa.mil/program/explainable-artificial-intelligence/

Towards more transparency in algorithm-based decisions: OpenFisca, GDPR, APB source code…

03 Some tools and techniques to facilitate interpretation and auditing. Explaining with simple proxies. Graphical models. Probabilistic programming.

LIME - Local Interpretable Model-Agnostic Explanations. https://github.com/marcotcr/lime
"Why Should I Trust You?": Explaining the Predictions of Any Classifier. Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin.
Despite widespread adoption, machine learning models remain mostly black boxes. Understanding the reasons behind predictions is, however, quite important in assessing trust, which is fundamental if one plans to take action based on a prediction, or when choosing whether to deploy a new model. Such understanding also provides insights into the model, which can be used to transform an untrustworthy model or prediction into a trustworthy one. In this work, we propose LIME, a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner, by learning an interpretable model locally around the prediction.
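The idea of "learning an interpretable model locally around the prediction" can be sketched in a few lines. This is a simplified illustration of a local linear surrogate, not the API of the lime package and not code from the talk: the `black_box` function, the Gaussian perturbation scale and the kernel width are all assumptions made for the example.

```python
import math
import random

def black_box(x1, x2):
    """Hypothetical opaque classifier: returns P(class = 1)."""
    return 1.0 / (1.0 + math.exp(-(2.0 * x1 - x2 ** 2)))

def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def explain_locally(predict, x0, n_samples=500, scale=0.5,
                    kernel_width=0.75, seed=0):
    """Fit a proximity-weighted linear model to `predict` around x0."""
    rng = random.Random(seed)
    d = len(x0)
    xtwx = [[0.0] * (d + 1) for _ in range(d + 1)]
    xtwy = [0.0] * (d + 1)
    for _ in range(n_samples):
        # Perturb the instance and query the black box.
        z = [xi + rng.gauss(0.0, scale) for xi in x0]
        y = predict(*z)
        # Samples close to x0 get more weight (exponential kernel).
        dist2 = sum((zi - xi) ** 2 for zi, xi in zip(z, x0))
        w = math.exp(-dist2 / kernel_width ** 2)
        # Accumulate the weighted normal equations X^T W X and X^T W y.
        row = [1.0] + [zi - xi for zi, xi in zip(z, x0)]
        for i in range(d + 1):
            xtwy[i] += w * row[i] * y
            for j in range(d + 1):
                xtwx[i][j] += w * row[i] * row[j]
    intercept, *coefs = solve_linear_system(xtwx, xtwy)
    return intercept, coefs

intercept, coefs = explain_locally(black_box, [0.0, 0.0])
print("local weights:", coefs)
```

The coefficients are the explanation: around the chosen point, the first feature pushes the prediction up while the second has almost no linear effect, even though the black box itself is non-linear.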

Remember factor graphs and Bayesian networks? Add just the right amount of structure from the domain expert (you). Let the inference procedure find out the rest (unknown variables and parameters). Plug any required probability distribution. www.microsoft.com/en-us/research/project/trueskill-ranking-system/
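As a minimal sketch of that division of labor, the example below encodes a textbook-style three-node Bayesian network (rain and sprinkler both influence wet grass) and lets exact inference by enumeration recover the unobserved cause. The network and all probabilities are hypothetical and chosen only for illustration; they do not come from the talk.

```python
from itertools import product

# Structure supplied by the "domain expert": rain -> wet <- sprinkler.
# All probabilities are hypothetical illustration values.
P_RAIN = 0.2
P_SPRINKLER = 0.1
P_WET = {  # P(wet | rain, sprinkler)
    (True, True): 0.99,
    (True, False): 0.90,
    (False, True): 0.80,
    (False, False): 0.00,
}

def joint(rain, sprinkler, wet):
    """Joint probability, factorized along the graph structure."""
    p = P_RAIN if rain else 1 - P_RAIN
    p *= P_SPRINKLER if sprinkler else 1 - P_SPRINKLER
    p_wet = P_WET[(rain, sprinkler)]
    return p * (p_wet if wet else 1 - p_wet)

def posterior_rain_given_wet():
    """P(rain | wet) by summing the joint over the hidden sprinkler."""
    numerator = sum(joint(True, s, True) for s in (True, False))
    evidence = sum(joint(r, s, True)
                   for r, s in product((True, False), repeat=2))
    return numerator / evidence

posterior = posterior_rain_given_wet()
print(f"P(rain | wet grass) = {posterior:.3f}")
```

Every number that drives the result is visible in the conditional probability table, which is what makes this kind of model straightforward to audit.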

Probabilistic programming - or - Can we have high performance bottom-up models? PyMC, Edward, Anglican, Figaro, Church, Dimple.
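The core idea behind these tools is to write the generative story as ordinary code and hand it to a generic inference routine. The sketch below illustrates that split in plain Python with likelihood weighting; it deliberately does not use the API of PyMC or of any other library listed, and the coin model, function names and observed counts are hypothetical.

```python
import random

def coin_model(rng, heads, flips):
    """Generative story: draw a latent bias, then score the observed data."""
    bias = rng.random()  # latent variable with a Uniform(0, 1) prior
    # Binomial likelihood up to a constant (the binomial coefficient
    # cancels when the weights are normalized).
    likelihood = bias ** heads * (1.0 - bias) ** (flips - heads)
    return bias, likelihood

def infer_posterior_mean(model, n_samples=20000, seed=0, **observed):
    """Generic inference: likelihood-weighted average of prior samples."""
    rng = random.Random(seed)
    total_weight = 0.0
    weighted_sum = 0.0
    for _ in range(n_samples):
        value, weight = model(rng, **observed)
        total_weight += weight
        weighted_sum += weight * value
    return weighted_sum / total_weight

# Hypothetical observation: 7 heads in 10 flips.
posterior_mean = infer_posterior_mean(coin_model, heads=7, flips=10)
print(f"posterior mean of the bias: {posterior_mean:.3f}")
```

With a uniform prior the exact posterior is Beta(8, 4), whose mean is 2/3, so the estimate can be checked against a known answer. Real probabilistic programming systems replace this naive sampler with far more efficient inference, but the separation between model and inference is the same.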

Thank you for your attention.
© 2017 PwC. All rights reserved. Not for further distribution without the permission of PwC. “PwC” refers to the network of member firms of PricewaterhouseCoopers International Limited (PwCIL), or, as the context requires, individual member firms of the PwC network. Each member firm is a separate legal entity and does not act as agent of PwCIL or any other member firm. PwCIL does not provide any services to clients. PwCIL is not responsible or liable for the acts or omissions of any of its member firms nor can it control the exercise of their professional judgment or bind them in any way. No member firm is responsible or liable for the acts or omissions of any other member firm nor can it control the exercise of another member firm’s professional judgment or bind another member firm or PwCIL in any way.
François Royer, Director, Data & Analytics. 06 43 41 20 85, [email protected], francoisroyer
