
Explaining Machine Learning to Establish Validity in Automated Content Analysis

Due to growing amounts of data, accessible tutorials, and evolving computing capacities (van Atteveldt and Peng, 2018), content analysis of text in particular is increasingly supported by machine learning methods (Boyd and Crawford, 2012; Trilling and Jonkman, 2018). Merging communication studies’ manual content analysis with machine learning’s supervised text classification, a model is trained to reproduce the labeling of concepts that were developed for and coded within a manually created corpus (Scharkow, 2012; Boumans and Trilling, 2016). While multiple strategies to measure the reliability of automated content analysis (Scharkow, 2012; Krippendorff, 2019) and to increase its reproducibility (Pineau, 2020; Mitchell et al., 2019) have recently been suggested, testing the validity of machine learning models remains underdeveloped. Some scholars suggest applying a model to a second dataset in order to test the validity of its inferences (Pilny et al., 2019); others examine content validity by inspecting the weights of individual features (Stoll et al., 2020). Based on systematic literature reviews, we identify five model-agnostic methods, six methods specific to neural networks, and three interpretable models that aim at explaining models for supervised text classification. We examine each solution with respect to how it may be leveraged to establish content validity in automated content analysis. We find that interpretable models show the most promise, yet are underrepresented in explainability research. We therefore encourage communication researchers and machine learning experts to collaborate further on the development of such methods.
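To make this setup concrete, here is a minimal, hypothetical sketch of automated content analysis as supervised text classification: a classifier is trained to reproduce manual codes and then checked against held-out, manually coded documents. The toy corpus, the labels, and the choice of scikit-learn model are illustrative assumptions, not the authors' implementation.

    # Minimal sketch: train a classifier to reproduce manual coding, then check it
    # on held-out coded documents. Corpus, codes, and model are placeholders.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import classification_report
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline

    texts = [
        "you are an idiot",                 # manually coded as "uncivil"
        "what a stupid argument",           # "uncivil"
        "shut up, nobody asked you",        # "uncivil"
        "thanks for the thoughtful reply",  # "civil"
        "interesting point, well argued",   # "civil"
        "I appreciate the clarification",   # "civil"
    ]
    labels = ["uncivil", "uncivil", "uncivil", "civil", "civil", "civil"]

    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, test_size=1 / 3, random_state=42, stratify=labels
    )
    model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    model.fit(X_train, y_train)                                   # reproduce the manual coding
    print(classification_report(y_test, model.predict(X_test)))   # agreement on held-out texts

A check like this speaks to reliability on held-out material; whether the learned features actually capture the intended concept is the validity question that the explainability methods discussed in the deck address.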

Presented to the Computational Methods Division at the 71st Annual Conference of the International Communication Association

The image on the final slide is used under a Creative Commons License from https://www.wocintechchat.com/

Laura Laugwitz

May 27, 2021

Transcript

  1. Laura Laugwitz, M. Sc. Digital Media & Technology: Explaining Machine Learning Models to Establish Validity in Automated Content Analysis
  2. Standardized Content Analysis + Supervised Text Classification = Automated Content Analysis
  3. Epistemological Differences in Communication & ML
     Critical Rationalism (Popper 2014)
     • Testing a hypothesis’ acceptability a priori through logic
     • Testing hypotheses a posteriori through empirical measurement
     → quality = reliable, valid, and intersubjectively comprehensible knowledge
     Technocratic Paradigm (Eden 2007)
     • A priori knowledge about the behavior of a program
     • Gaining a posteriori knowledge through testing
     → quality = a satisfactory, reusable application
  4. Standardized Content Analysis + Supervised Text Classification = Automated Content Analysis
  5. Validity Defined: „A measuring instrument is considered valid if it measures what its user claims it measures.“ (Krippendorff, 2019: 361)
  6. Approaches to Explainability: three input-to-output diagrams contrasting model-agnostic, model-specific, and interpretable approaches
  7. Model-agnostic Explainability Methods
     • PALM (Krishnan and Wu, 2017)
     • LIME (Ribeiro et al., 2016b)
     • Anchors (Ribeiro et al., 2018)
     • MES (Turner, 2016)
     • Shapley values (Chen et al., 2019)
  8. Model-agnostic Explainability Methods
     • PALM (Krishnan and Wu, 2017)
     • LIME (Ribeiro et al., 2016b)
     • Anchors (Ribeiro et al., 2018)
     • MES (Turner, 2016)
     • Shapley values (Chen et al., 2019)
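     As an illustration of how such a model-agnostic method could support a validity check, the hypothetical sketch below uses LIME (Ribeiro et al., 2016b) to show which words a toy classifier relied on for one prediction; the classifier, corpus, and example document are assumptions for demonstration, and the third-party lime package is required.

         # Hypothetical sketch: a local LIME explanation for one prediction of a
         # black-box text classifier. All data and the model are placeholders.
         from lime.lime_text import LimeTextExplainer
         from sklearn.feature_extraction.text import TfidfVectorizer
         from sklearn.linear_model import LogisticRegression
         from sklearn.pipeline import make_pipeline

         texts = ["you are an idiot", "what a stupid argument",
                  "thanks for the thoughtful reply", "interesting point, well argued"]
         labels = ["uncivil", "uncivil", "civil", "civil"]
         model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000)).fit(texts, labels)

         explainer = LimeTextExplainer(class_names=list(model.classes_))
         explanation = explainer.explain_instance(
             "what a thoughtful argument",  # document whose coding we want explained
             model.predict_proba,           # black-box function returning class probabilities
             num_features=4,                # number of locally weighted words to report
         )
         print(explanation.as_list())       # words and local weights to compare against the codebook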
  9. Model-specific Explainability Methods
     • Rationales (Lei et al., 2016)
     • DeepLIFT (Shrikumar et al., 2017)
     • Integrated gradients (Sundararajan et al., 2017)
     • VisBERT (Aken et al., 2020)
     • Distillation (Hinton et al., 2015)
     • TCAV (Kim et al., 2018)
  10. Model-specific Explainability Methods
     • Rationales (Lei et al., 2016)
     • DeepLIFT (Shrikumar et al., 2017)
     • Integrated gradients (Sundararajan et al., 2017)
     • VisBERT (Aken et al., 2020)
     • Distillation (Hinton et al., 2015)
     • TCAV (Kim et al., 2018)
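     As a worked illustration of one of these methods, the sketch below approximates integrated gradients (Sundararajan et al., 2017) for a toy differentiable classifier using numpy; the weights, input, and baseline are made-up assumptions rather than a trained network.

         # Hypothetical sketch: integrated gradients for F(x) = sigmoid(w . x),
         # approximated by averaging gradients along the path from a baseline to x.
         import numpy as np

         w = np.array([2.0, -1.0, 0.5])   # toy model weights for three features
         x = np.array([1.0, 1.0, 0.0])    # input to be explained
         baseline = np.zeros_like(x)      # reference input (an "empty" document)

         def grad_F(z):
             """Gradient of sigmoid(w . z) with respect to z."""
             s = 1.0 / (1.0 + np.exp(-np.dot(w, z)))
             return s * (1.0 - s) * w

         steps = 50
         alphas = (np.arange(steps) + 0.5) / steps   # midpoints of the path in (0, 1)
         path_grads = np.array([grad_F(baseline + a * (x - baseline)) for a in alphas])
         attributions = (x - baseline) * path_grads.mean(axis=0)   # Riemann-sum approximation

         # Per-feature attributions; their sum approximates F(x) - F(baseline).
         print(attributions, attributions.sum())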
  11. Interpretable Methods
     • Naive Bayes (e.g. Stoll et al., 2020)
     • Prototypes (Bien and Tibshirani, 2011)
     • BCM (Kim et al., 2015)
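     Because such models expose their parameters directly, a researcher can read off which features carry a concept. The hypothetical sketch below fits a Naive Bayes text classifier, in the spirit of Stoll et al. (2020), and lists the highest-weighted words per class so they can be compared against the codebook; the toy corpus and labels are placeholders.

         # Hypothetical sketch: an interpretable Naive Bayes text classifier whose
         # per-class feature weights can be inspected for content validity.
         import numpy as np
         from sklearn.feature_extraction.text import CountVectorizer
         from sklearn.naive_bayes import MultinomialNB

         texts = ["you are an idiot", "thanks for the thoughtful reply",
                  "what a stupid take", "interesting point, well argued"]
         labels = ["uncivil", "civil", "uncivil", "civil"]

         vectorizer = CountVectorizer()
         X = vectorizer.fit_transform(texts)
         nb = MultinomialNB().fit(X, labels)

         terms = vectorizer.get_feature_names_out()
         for class_index, class_name in enumerate(nb.classes_):
             top = np.argsort(nb.feature_log_prob_[class_index])[::-1][:5]
             # Highest-probability words per class, to be judged against the codebook.
             print(class_name, [terms[i] for i in top])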
  12. Using Explainability Methods for Validity Checks? (the same three input-to-output diagrams: model agnostic, model specific, interpretable)
  13. Conclusion
     • Faithfulness to the model vs. sense-making
     • Collaboration on interpretable and text-focussed methods
  14. Bibliography
     Aken, Betty van, Benjamin Winter, Alexander Löser, and Felix A. Gers (2020). „VisBERT: Hidden-State Visualizations for Transformers“. In: Companion Proceedings of the Web Conference 2020. New York, NY, USA: ACM, pp. 207–211.
     Bien, Jacob and Robert Tibshirani (2011). „Prototype selection for interpretable classification“. In: Annals of Applied Statistics 5.4, pp. 2403–2424.
     Chen, Jianbo, Le Song, Martin J. Wainwright, and Michael I. Jordan (2019). „L-Shapley and C-Shapley: Efficient model interpretation for structured data“. In: 7th International Conference on Learning Representations, ICLR 2019. New Orleans, United States, pp. 1–17.
     Eden, Amnon H. (2007). „Three Paradigms of Computer Science“. In: Minds and Machines 17.2, pp. 135–167.
     Hinton, Geoffrey, Oriol Vinyals, and Jeff Dean (2015). „Distilling the Knowledge in a Neural Network“. In: NIPS 2014 Deep Learning Workshop, pp. 1–9. arXiv: 1503.02531.
     Kim, Been, Cynthia Rudin, and Julie Shah (2015). „The Bayesian Case Model: A Generative Approach for Case-Based Reasoning and Prototype Classification“. In: Advances in Neural Information Processing Systems 3 (January), pp. 1952–1960.
     Kim, Been, Martin Wattenberg, Justin Gilmer, et al. (2018). „Interpretability beyond feature attribution: Quantitative Testing with Concept Activation Vectors (TCAV)“. In: 35th International Conference on Machine Learning, ICML 2018. Vol. 6, pp. 4186–4195.
     Krippendorff, Klaus (2019). Content Analysis: An Introduction to Its Methodology (4th ed.). SAGE Publications Ltd.
  15. Bibliography
     Krishnan, Sanjay and Eugene Wu (2017). „PALM“. In: Proceedings of the 2nd Workshop on Human-In-the-Loop Data Analytics, HILDA ’17. Vol. s3-I. 12. New York, USA: ACM Press, pp. 1–6.
     Lei, Tao, Regina Barzilay, and Tommi Jaakkola (2016). „Rationalizing Neural Predictions“. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA, USA: Association for Computational Linguistics, pp. 107–117.
     Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin (2016a). „Model-Agnostic Interpretability of Machine Learning“. In: 2016 ICML Workshop on Human Interpretability in Machine Learning (WHI 2016). New York, NY, USA, pp. 91–95.
     Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin (2018). „Anchors: High-precision model-agnostic explanations“. In: 32nd AAAI Conference on Artificial Intelligence, AAAI 2018, pp. 1527–1535.
     Shrikumar, Avanti, Peyton Greenside, and Anshul Kundaje (2017). „Learning important features through propagating activation differences“. In: 34th International Conference on Machine Learning, ICML 2017. Vol. 7. Sydney, Australia, pp. 4844–4866.
     Stoll, Anke, Marc Ziegele, and Oliver Quiring (2020). „Detecting Impoliteness and Incivility in Online Discussions: Classification Approaches for German User Comments“. In: Computational Communication Research 2.1, pp. 109–134.
     Sundararajan, Mukund, Ankur Taly, and Qiqi Yan (2017). „Axiomatic attribution for deep networks“. In: 34th International Conference on Machine Learning, ICML 2017. Vol. 7, pp. 5109–5118.
     Turner, Ryan (2016). „A Model Explanation System: Latest Updates and Extensions“. In: ICML Workshop on Human Interpretability in Machine Learning (WHI 2016). New York, NY, USA, pp. 1–5.