
OpenTalks.AI - Ольга Перепелкина, Affective computing

OpenTalks.AI

March 01, 2018

Transcript

  1. Plan 1. Affective computing and Social signal processing 2. Emotions: why do we use a multimodal approach? 3. Steps for automatic emotion recognition 4. Getting data: what kind of data can we use and how do we annotate it? 5. Feature extraction approaches 6. Multichannel data fusion 7. Next steps and trends in affective computing
  2. Plan 1. Affective computing and Social signal processing 2. Emotions: why do we use a multimodal approach? 3. Steps for automatic emotion recognition 4. Getting data: what kind of data can we use and how do we annotate it? 5. Feature extraction approaches 6. Multichannel data fusion 7. Next steps and trends in affective computing
  3. Affective Computing: Picard, 1995. Social Signal Processing: Vinciarelli et al., 2009. Affective Computing lab at MIT (MIT Media Lab, Cambridge): http://affect.media.mit.edu/ Intelligent Behaviour Understanding Group (Imperial College London): https://ibug.doc.ic.ac.uk/
  4. Affective Computing • Affective computing (artificial emotional intelligence, or emotion AI) is the study and development of systems and devices that can recognize, interpret, process, and simulate human affects.
  5. Social Signal Processing (diagram): nonverbal behavior cues (interpersonal distance, gesture, forward posture, height, mutual gaze, vocal behavior) combine into a social signal. Vinciarelli et al. Social signal processing: Survey of an emerging domain, 2009.
  6. Plan 1. Affective computing and Social signal processing 2. Emotions: why do we use a multimodal approach? 3. Steps for automatic emotion recognition 4. Getting data: what kind of data can we use and how do we annotate it? 5. Feature extraction approaches 6. Multichannel data fusion 7. Next steps and trends in affective computing
  7. Emotions: how do people recognize them? • People can recognize emotions from separate channels: voice, body language, touch, faces (the best accuracy). • Other modalities such as smell and taste are also taken into account during emotion recognition. • Visual and auditory modalities affect each other (e.g., facial movements around the mouth region impact vocal acoustics). • fMRI and ERP studies: compared with unimodal presentations (e.g., face only), multimodal presentations (e.g., face and voice) yield faster and more accurate emotion judgments. Schirmer, Annett, and Ralph Adolphs. "Emotion perception from face, voice, and touch: comparisons and convergence." Trends in Cognitive Sciences (2017).
  8. Multimodal affective computing • Most research focuses on faces, less on voices, and even less on text, body, and physiology. • Accuracy on multimodal data is higher than on unimodal data: by 9.83% on average, for 85% of systems. • We do not know which channels, and how many of them, we need for the best classification. • The contribution of individual channels can differ: for example, models based only on audio recognized fear better, while models based on visual cues recognized disgust better. D'Mello et al., 2015; Al Osman et al., 2017.
  9. Multimodal Affective Computing (pipeline diagram): speech data (vocal affect), visual data (body gestures, facial expressions, eye movements), and physiology data (EDA / GSR, blood pressure) feed into feature extraction and then emotion classification.
  10. Multimodal Affective Computing (pipeline diagram): the same pipeline, with computer vision marked as the technology behind the visual channels (body gestures, facial expressions, eye movements).
  11. Plan 1. Affective computing and Social signal processing 2. Emotions: why do we use a multimodal approach? 3. Steps for automatic emotion recognition 4. Getting data: what kind of data can we use and how do we annotate it? 5. Feature extraction approaches 6. Multichannel data fusion 7. Next steps and trends in affective computing
  12. Steps for automatic emotion recognition: Choose Data → Define Categories → Annotate Data → Preprocess Data → Extract Features → Model → Fusion
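The same steps can be written as a minimal Python sketch. All of the function names below (load_recordings, annotate, preprocess, extract_features, train_model, fuse) are hypothetical placeholders for whatever tooling a concrete system uses; only the ordering of the stages comes from the slide.

```python
# Hypothetical end-to-end skeleton of the emotion-recognition pipeline above.
EMOTION_CATEGORIES = ["anger", "disgust", "fear", "happiness",
                      "sadness", "surprise", "neutral"]  # assumed label set

def build_emotion_recognizer(raw_recordings):
    data = load_recordings(raw_recordings)       # 1. choose data (audio, video, physiology)
    categories = EMOTION_CATEGORIES              # 2. define categories
    labels = annotate(data, categories)          # 3. annotate data (human raters)
    clean = preprocess(data)                     # 4. preprocess (alignment, denoising, resampling)
    features = {ch: extract_features(clean, channel=ch)   # 5. per-channel features
                for ch in ("audio", "video", "physio")}
    models = {ch: train_model(features[ch], labels)       # 6. one model per channel
              for ch in features}
    return fuse(models)                          # 7. multichannel fusion (see later slides)
```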
  13. Plan 1. Affective computing and Social signal processing 2. Emotions: why do we use a multimodal approach? 3. Steps for automatic emotion recognition 4. Getting data: what kind of data can we use and how do we annotate it? 5. Feature extraction approaches 6. Multichannel data fusion 7. Next steps and trends in affective computing
  14. Plan 1. Affective computing and Social signal processing 2. Emotions: why do we use a multimodal approach? 3. Steps for automatic emotion recognition 4. Getting data: what kind of data can we use and how do we annotate it? 5. Feature extraction approaches 6. Multichannel data fusion 7. Next steps and trends in affective computing
  15. Face & Eyes recognition Human face Face Face Detection (Faster

    RCNN) Face Identification (ResNet 50) Feature Extraction Eyes Eyes and Nose Detection (OpenCV + CNN) Open/closed Eyes
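A minimal sketch of the face-and-eyes detection step using OpenCV's bundled Haar cascades, assuming opencv-python is installed; the slide's Faster R-CNN face detector and ResNet-50 identification model would replace these classical detectors in the full pipeline.

```python
import cv2

# Pre-trained Haar cascades shipped with opencv-python.
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
eye_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_eye.xml")

def detect_face_and_eyes(image_path):
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    results = []
    for (x, y, w, h) in faces:
        roi = gray[y:y + h, x:x + w]          # search for eyes inside the face box only
        eyes = eye_cascade.detectMultiScale(roi)
        results.append({"face": (x, y, w, h),
                        "eyes": [tuple(e) for e in eyes]})
    return results
```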
  16. Body pose estimation • Human 2D pose estimation: the problem of localizing anatomical keypoints (body parts). • Challenges: • Each image may contain an unknown number of people • Interaction between people: contact, occlusion, limb articulation, etc. • Runtime complexity tends to grow with the number of people in the image • Input: a color image; output: the 2D locations of anatomical keypoints for each person in the image.
  17. Body pose estimation approaches. Top-down: • Detect the person => find body parts • Single-person pose estimation • Runtime is proportional to the number of people • If the person detector fails, there is no way to recover. Bottom-up: • Find keypoints & connections => construct the person • Multi-person pose estimation • Robust to early commitment, and can decouple runtime complexity from the number of people in the image. Cao Z. et al. Realtime multi-person 2D pose estimation using part affinity fields. CVPR, 2017.
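A control-flow sketch contrasting the two strategies; detect_people, estimate_single_pose, detect_keypoints, and group_into_people are hypothetical placeholders for the actual models (e.g., a person detector plus a single-person pose network vs. an OpenPose-style keypoint and part-affinity-field network).

```python
def top_down_poses(image):
    # Runtime grows with the number of detected people, and a person
    # missed by the detector can never be recovered downstream.
    return [estimate_single_pose(image, box) for box in detect_people(image)]

def bottom_up_poses(image):
    # All keypoints and their pairwise connections are predicted in one pass,
    # then grouped into individual people, so runtime is largely independent
    # of how many people appear in the image.
    keypoints, connections = detect_keypoints(image)
    return group_into_people(keypoints, connections)
```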
  18. Plan 1. Affective computing and Social signal processing 2. Emotions: why do we use a multimodal approach? 3. Steps for automatic emotion recognition 4. Getting data: what kind of data can we use and how do we annotate it? 5. Feature extraction approaches 6. Multichannel data fusion 7. Next steps and trends in affective computing
  19. Feature-based fusion • Integrates features immediately after they are extracted. • A single model is trained. • Problems: • Features from different channels have different time scales • A large set of features from different channels => higher computational load • Vulnerable to missing data
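A minimal feature-level (early) fusion sketch with scikit-learn: per-channel feature vectors are concatenated and a single classifier is trained on the joint vector. The arrays audio_feats, video_feats, physio_feats, and labels are assumed to be pre-extracted and already time-aligned, which is exactly the alignment problem noted above.

```python
import numpy as np
from sklearn.svm import SVC

def early_fusion_train(audio_feats, video_feats, physio_feats, labels):
    # Concatenate per-channel features into one (large) joint feature vector.
    fused = np.hstack([audio_feats, video_feats, physio_feats])
    clf = SVC(probability=True)   # a single model over all channels
    return clf.fit(fused, labels)
```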
  20. Decision-based fusion • Performs integration after each of the modalities has made a decision. • Averaging, voting schemes, weighted sum, etc. • Problem: • Decision-based fusion ignores the low-level interaction between the modalities.
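A minimal decision-level (late) fusion sketch as a weighted sum of per-modality class probabilities. Here models is assumed to map a channel name to an already fitted scikit-learn-style classifier, and weights to per-channel reliabilities; a missing channel can simply be skipped, which is why late fusion copes better with missing data.

```python
import numpy as np

def late_fusion_predict(models, features, weights):
    # features: dict mapping channel -> (n_samples, n_features) array
    # weights:  dict mapping channel -> scalar reliability (assumed to sum to 1)
    combined = sum(weights[ch] * models[ch].predict_proba(features[ch])
                   for ch in models)
    return np.argmax(combined, axis=1)   # winning class per sample
```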
  21. Hybrid fusion • Combines outputs from early fusion and individual unimodal predictors. • E.g., a two-step combination: 1) feature-level fusion of audio + video, 2) decision-level fusion of the first classifier + one more classifier for the physiological data.
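A sketch of that two-step hybrid scheme: step 1 is feature-level fusion of audio + video inside one classifier, step 2 is decision-level fusion of its output with a separate physiology classifier. av_model and physio_model are assumed to be fitted classifiers exposing predict_proba, and the weights are purely illustrative.

```python
import numpy as np

def hybrid_fusion_predict(av_model, physio_model,
                          audio_feats, video_feats, physio_feats,
                          w_av=0.7, w_physio=0.3):
    # Step 1: feature-level fusion of audio + video in one classifier.
    av_probs = av_model.predict_proba(np.hstack([audio_feats, video_feats]))
    # Step 2: decision-level fusion with the physiology classifier.
    ph_probs = physio_model.predict_proba(physio_feats)
    combined = w_av * av_probs + w_physio * ph_probs
    return np.argmax(combined, axis=1)
```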
  22. Multimodal fusion • Lingenfelser et al. conducted a systematic study of feature-level, decision-level, and hybrid fusion approaches and did not find evidence that any of the techniques has an advantage over the others. • Lingenfelser, Florian, Johannes Wagner, and Elisabeth André. "A systematic discussion of fusion techniques for multi-modal affect recognition tasks." Proceedings of the 13th International Conference on Multimodal Interfaces. ACM, 2011. • Nevertheless, the decision-level approach seems more reasonable, because it deals better with missing data. • Al Osman, Hussein, and Tiago H. Falk. "Multimodal Affect Recognition: Current Approaches and Challenges." Emotion and Attention Recognition Based on Biological Signals and Images. InTech, 2017.
  23. Plan 1. Affective computing and Social signal processing 2. Emotions: why do we use a multimodal approach? 3. Steps for automatic emotion recognition 4. Getting data: what kind of data can we use and how do we annotate it? 5. Feature extraction approaches 6. Multichannel data fusion 7. Next steps and trends in affective computing
  24. Affective computing: trends • Natural emotions • Mixed emotions • Social signals • Multimodal data (face, eyes, …, voice, body) • Wearable gadgets & smartphones