#22 - Transparency and privacy protection in the big data era

Description of the evening of 25 April 2017 - Toulouse Data Science
-----------------------------------------------------------------------------------------
Ethics and algorithms: where is the limit?

Do you remember Microsoft's Tay bot? Tay was a chatbot with an artificial intelligence designed to improve by conversing with internet users through tweets. In March 2016, its creators had to shut it down after a few hours because it was spreading racist, Holocaust-denying and sexist remarks. Tay had fallen under the influence of ill-intentioned human interlocutors who mostly taught it verbal abuse! Who is to blame? The users who provoked these excesses, the bot itself, or Tay's developers, who had not anticipated the risk?

Have you heard of ProPublica's investigation into an algorithm used to predict the probability that an individual will commit a crime in the future? In that case, the algorithm, having been trained on extremely biased data, rated Black people as more likely to offend than white people.

In short, as you will have gathered, developers carry a heavy responsibility on their shoulders.

During this evening, François Royer presented mechanisms for building transparent and auditable machine learning applications (see the slides above).

Ulrich Aïvodji: a PhD student in computer science, Ulrich works at LAAS-CNRS in the ROC (Recherche Opérationnelle, Optimisation Combinatoire et Contraintes) and TSF (Tolérance aux fautes et Sûreté de Fonctionnement) teams. His research focuses on privacy protection in location-based services. You can find his slides here: https://goo.gl/ROqIif

Toulouse Data Science

April 26, 2017

Transcript

  1. Towards a transparent, auditable
    and interpretable approach.
    Machine Learning
    and Artificial Intelligence
    Toulouse Data Science Meetup 25.04.17
    www.pwc.com

  2. PwC’s Digital Services
    Thanks to the
    organizers.
    Regular meetups in Toulouse with data scientists to discuss
    the processing of large-scale data, its use in business and
    in everyday life, the competitive advantages it can bring
    and, of course, the technologies for analyzing massive data.
    Confidential information for the sole benefit and use of PwC’s client.
    www.meetup.com/fr-FR/Tlse-Data-Science
    www.levillagebyca.com

  3. PwC’s Digital Services
    Hello.
    François Royer
    DIRECTOR, DATA & ANALYTICS
    06 43 41 20 85
    [email protected]
    @francoisroyer

  4. PwC’s Digital Services
    PwC, global leader in Audit and Technology Consulting
    157 countries
    776 offices
    195 000 people
    Turnover in June 2016: $34 billion
    7 000 Data & Analytics experts incl. 200 in France
    5000 people

  5. PwC’s Digital Services
    “Help us build the digital future you want to live in.”

  6. PwC’s Digital Services
    Key points
    Society and citizens/consumers are demanding more transparency
    from AI and its applications.
    Deep Learning and ensemble methods: sexy but hard to interpret?
    Where should we look for methods suited to a risk-controlled approach?
    (Hint: not in online advertising)
    We (data scientists, developers, experts, decision-makers…) are
    responsible!
    PwC Consulting | Data & Analytics

  7. PwC’s Digital Services
    01
    The context: our data-driven society
    Data about service-based business models
    Collect first, ask questions later
    What can possibly go wrong?

  8. Perfect service
    What your customers want

  9. PwC’s Digital Services
    So it’s not about the data,
    but the insights and
    value-added services you provide.
    Behavioral data is your competitive advantage!

  10. PwC’s Digital Services
    Internet startup says:
    AND ALSO EVERYONE ELSE.

  11. PwC’s Digital Services
    02
    Serious case studies
    Decisions with impact
    Simple models are used everywhere
    Towards more transparency

  12. PwC’s Digital Services
    Beware of scoring
    algorithms!
    Are you advertising
    or advising?
    SCORING CAN BECOME DISCRIMINATORY QUICKLY.
    OR WRONG.

  13. PwC’s Digital Services
    Beware of scoring
    algorithms!
    Are you advertising
    or advising?
    SCORING CAN BECOME DISCRIMINATORY QUICKLY.
    OR WRONG.

  14. PwC’s Digital Services
    « Score Cards » of the US Sentencing Commission
    http://www.ussc.gov/guidelines/2015-guidelines-manual/2015-chapter-4

  15. PwC’s Digital Services
    LAPD recruitment model or how do you hire 1000 cops?
    « This simplicity gets at the important
    issue: A decent transparent model
    that is actually used will outperform
    a sophisticated system that predicts
    better but sits on a shelf. If the
    researchers had created a model that
    predicted well but was more complicated,
    the LAPD likely would have ignored it,
    thus defeating the whole purpose »
    -- Greg Ridgeway, Director of National Institute of Justice
    www.rand.org/pubs/research_briefs/RB9447/index1.html
    https://www.rand.org/pubs/monographs/MG881.html

  16. PwC’s Digital Services
    SKYNET and the drone war.
    arstechnica.co.uk/security/2016/02/the-nsas-skynet-program-may-be-killing-thousands-of-innocent-people/

  17. PwC’s Digital Services
    The rows in your
    dataset are real
    people.
    MEET MR ZAIDAN, AL JAZEERA BUREAU CHIEF
    IN ISLAMABAD

  18. PwC’s Digital Services
    DARPA is funding XAI (Explainable Artificial Intelligence)
    The Explainable AI (2016) program
    aims to create a suite of machine
    learning techniques that:
    • Produce more explainable models,
    while maintaining a high level of
    learning performance (prediction
    accuracy);
    • Enable human users to understand,
    appropriately trust, and effectively
    manage the emerging generation of
    artificially intelligent partners.
    http://www.darpa.mil/program/explainable-artificial-intelligence/

  19. PwC’s Digital Services
    Towards more
    transparency in
    algorithm-based
    decisions.
    OpenFisca, GDPR, APB source code…

  20. PwC’s Digital Services
    03
    Some tools and techniques
    to facilitate interpretation and auditing.
    Explaining with simple proxies
    Graphical models
    Probabilistic programming

  21. PwC’s Digital Services
    LIME - Local Interpretable Model-Agnostic Explanations
    https://github.com/marcotcr/lime
    "Why Should I Trust You?": Explaining the
    Predictions of Any Classifier
    Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin
    Despite widespread adoption, machine learning models remain
    mostly black boxes. Understanding the reasons behind
    predictions is, however, quite important in assessing trust,
    which is fundamental if one plans to take action based on a
    prediction, or when choosing whether to deploy a new model.
    Such understanding also provides insights into the model, which
    can be used to transform an untrustworthy model or prediction
    into a trustworthy one. In this work, we propose LIME, a novel
    explanation technique that explains the predictions of any
    classifier in an interpretable and faithful manner, by learning an
    interpretable model locally around the prediction.
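
The idea in the abstract above (learn an interpretable model locally around one prediction) can be illustrated with a minimal sketch. This is not the `lime` package's API: `black_box`, `explain_locally` and every parameter value below are hypothetical, and the sketch leaves out parts of the published method such as sparse feature selection. It perturbs one instance, queries the opaque model, weights the samples by proximity and fits a weighted linear surrogate.

```python
import numpy as np

def black_box(X):
    """Hypothetical opaque classifier: returns P(class = 1) for each row."""
    # Nonlinear in feature 0, linear in feature 1, ignores feature 2.
    z = 3.0 * np.sin(X[:, 0]) - 2.0 * X[:, 1]
    return 1.0 / (1.0 + np.exp(-z))

def explain_locally(predict_fn, x, n_samples=5000, scale=0.3,
                    kernel_width=0.75, seed=0):
    """Fit a proximity-weighted linear surrogate around the instance x."""
    rng = np.random.default_rng(seed)
    # 1. Perturb the instance with Gaussian noise.
    Z = x + rng.normal(0.0, scale, size=(n_samples, x.shape[0]))
    # 2. Ask the black box for its predictions on the perturbed points.
    y = predict_fn(Z)
    # 3. Weight each sample by its proximity to x (exponential kernel).
    dist = np.linalg.norm(Z - x, axis=1)
    w = np.exp(-(dist ** 2) / (kernel_width ** 2))
    # 4. Weighted least squares: intercept plus one coefficient per feature.
    A = np.hstack([np.ones((n_samples, 1)), Z - x])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef[0], coef[1:]

x0 = np.array([0.0, 0.0, 1.0])
intercept, weights = explain_locally(black_box, x0)
for name, wgt in zip(["feature_0", "feature_1", "feature_2"], weights):
    print(f"{name}: {wgt:+.3f}")
```

The signs and magnitudes of the surrogate's coefficients are the explanation: around `x0`, feature 0 pushes the prediction up, feature 1 pushes it down, and feature 2 has almost no effect.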

  22. PwC’s Digital Services
    Remember factor graphs and Bayesian networks?
    Add just the right amount of structure from
    the domain expert (you).
    Let the inference procedure find out the rest
    (unknown variables and parameters).
    Plug any required probability distribution.
    www.microsoft.com/en-us/research/project/trueskill-ranking-system/
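
As a reminder of how this works, here is a minimal sketch of a Bayesian network with inference by exhaustive enumeration. The structure (rain influences the sprinkler, and both influence wet grass) plays the role of the expert's domain knowledge; the network and its probability tables are a textbook-style toy example chosen for illustration, not something taken from the talk or from TrueSkill.

```python
from itertools import product

# Structure from the "domain expert": Rain -> Sprinkler, (Sprinkler, Rain) -> WetGrass.
# All probabilities below are illustrative assumptions.
P_RAIN = {True: 0.2, False: 0.8}
# P(sprinkler | rain), keyed by rain, then by sprinkler.
P_SPRINKLER = {True: {True: 0.01, False: 0.99},
               False: {True: 0.4, False: 0.6}}
# P(wet = True | sprinkler, rain), keyed by (sprinkler, rain).
P_WET = {(True, True): 0.99, (True, False): 0.9,
         (False, True): 0.8, (False, False): 0.0}

def joint(rain, sprinkler, wet):
    """Joint probability of one full assignment, as a product of the factors."""
    p_wet = P_WET[(sprinkler, rain)]
    return (P_RAIN[rain]
            * P_SPRINKLER[rain][sprinkler]
            * (p_wet if wet else 1.0 - p_wet))

def posterior_rain_given_wet():
    """P(rain | wet grass), summing out the unobserved sprinkler variable."""
    numerator = sum(joint(True, s, True) for s in (True, False))
    evidence = sum(joint(r, s, True)
                   for r, s in product((True, False), repeat=2))
    return numerator / evidence

p = posterior_rain_given_wet()
print(f"P(rain | wet grass) = {p:.4f}")
```

Every number in the result can be traced back to a named factor, which is what makes this kind of model auditable. Enumeration only scales to small networks; larger models rely on message passing or sampling.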

  23. PwC’s Digital Services
    Probabilistic programming - or - Can we have high performance bottom-up models?
    PyMC, Edward, Anglican, Figaro, Church, Dimple
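
The common principle behind these tools is to write the model as a program and leave inference to a generic engine. The sketch below illustrates that principle with the Python standard library only; it does not use the API of any of the libraries listed. The coin-bias model, the observed counts and the sampler settings are assumptions made for illustration.

```python
import math
import random

def log_joint(theta, heads=7, tails=3):
    """Model as code: uniform prior on the coin bias, Bernoulli likelihood."""
    if not 0.0 < theta < 1.0:
        return -math.inf  # outside the support of the prior
    return heads * math.log(theta) + tails * math.log(1.0 - theta)

def metropolis(log_density, n_steps=20000, step=0.2, start=0.5, seed=42):
    """Generic random-walk Metropolis sampler; it knows nothing about the model."""
    rng = random.Random(seed)
    current, current_lp = start, log_density(start)
    samples = []
    for _ in range(n_steps):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(current)).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples[n_steps // 4:]  # discard the first quarter as burn-in

samples = metropolis(log_joint)
posterior_mean = sum(samples) / len(samples)
print(f"posterior mean of theta ~ {posterior_mean:.3f}")
```

For this model the exact posterior is Beta(8, 4), with mean 8/12, so the sampler's estimate can be checked against a known answer. Changing the model only means changing `log_joint`; the inference code stays the same.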

  24. PwC’s Digital Services
    Thank you for your attention
    © 2017 PwC. All rights reserved. Not for further distribution without the permission of PwC. “PwC” refers to the network of member firms of
    PricewaterhouseCoopers International Limited (PwCIL), or, as the context requires, individual member firms of the PwC network. Each
    member firm is a separate legal entity and does not act as agent of PwCIL or any other member firm. PwCIL does not provide any services
    to clients. PwCIL is not responsible or liable for the acts or omissions of any of its member firms nor can it control the exercise of their
    professional judgment or bind them in any way. No member firm is responsible or liable for the acts or omissions of any other member firm
    nor can it control the exercise of another member firm’s professional judgment or bind another member firm or PwCIL in any way.
    François Royer
    DIRECTOR, DATA & ANALYTICS
    06 43 41 20 85
    [email protected]
    francoisroyer