Appearances Matter: AI Learning from Text and Images

Is a picture really worth a thousand words? Recent progress in AI has in fact been driven by learning from both images and text. Computers are increasingly able to extract insights from vast quantities of image and text data. This opens up new opportunities, but also brings notable risks that need to be considered. Can a thoughtful approach to AI give us systems that seem a little more human and better guide us in our everyday lives?

Gerard de Melo

November 09, 2022

Transcript

  1. Appearances Matter:
    AI Learning from Text and Images
    Gerard de Melo
    http://gerard.demelo.org
    https://commons.wikimedia.org/wiki/File:Muliple_colored_pencils_07.jpg

  2. https://zh.wikipedia.org/zh-hk/柏林

  3. Image: Diana Parkhouse

  4. Image: Diana Parkhouse

  5. Learning from Examples
    cat

  6. Learning from Examples
    KI lernt anhand
    von Beispielen
    AI learns from
    examples.

  7. Learning from Raw Data
    Image: Olga Andreyanova

  8. Learning from Raw Data
    An Example Pair
    to Learn from
    Image: Olga Andreyanova

  9. Learning from Raw Data
    Berlin ist eine lebendige Stadt. (German: "Berlin is a lively city.")

  10. Learning from Raw Data
    Berlin ist eine lebendige Stadt. (German: "Berlin is a lively city.")
    An Example
    to Learn from

  11. Image: https://500px.com/photo/96738645/street-bookstore-by-pablo-tarrero
    Learning from many
    millions of examples

  12. Mouse brain microscopy, adapted from Na Ji, UC Berkeley
    Neural Connections
    in the Brain

  13. Learning Models

  14. Connecting Vision and Language
    A cat sitting on a mat looking out the door
    A dog wearing a USPS mail carrier costume
    https://commons.wikimedia.org/wiki/File:Nebelung_cat_looking_out_door.jpg / https://commons.wikimedia.org/wiki/File:Dog_costume_parade_in_Sunset_Park_%2816432%29.jpg

  15. Generation Models

  16. Generation Example

  17. Generation by
    Combining Vision and Language
    Teddy bears working
    on new AI research
    underwater
    with 1990s technology
    https://commons.wikimedia.org/wiki/File:DALL-E_2_artificial_intelligence_digital_image_generated_photo.jpg

  18. Challenge:
    Responsible AI
    Responsible
    Use of AI
    Image: Kev Costello

  19. Human Understanding
    DeepMind’s TAP-Vid

  20. Learning from Videos
    Geng et al. Character Matters: Video Story Understanding with Character-Aware Relations

  21. Masked-Piper. Babajide Owoyele, James Trujillo, Gerard de Melo, Wim Pouw (2022)

  22. Understanding Human Perception
    Tugba Kulahcioglu, Gerard de Melo. Predicting Semantic Signatures of Fonts

  23. Understanding Human Perception
    Van Rompay and Pruyn (2011) + de Sousa et al. (2020). Journal of Sensory Studies
    Choice of font
    affects
    brand credibility,
    price expectations,
    even assumed
    acidity, sweetness

  24. Understanding Human Perception
    Tugba Kulahcioglu, Gerard de Melo. FontLex: A Typographical Lexicon based on Affective Associations /
    Tugba Kulahcioglu, Gerard de Melo. Paralinguistic Recommendations for Affective Word Clouds. Proc. IUI
    The Smurfs Scream

  25. Understanding Human Perception
    Contentment Fear
    Dobler et al. Art Creation with Multi-Conditional StyleGANs

  26. Towards Trustworthy AI:
    Medical Applications

  27. Towards Trustworthy AI:
    Daily Life Applications
    Where’s the nearest exit?

  28. Image: Adapted from https://techcrunch.com/2017/12/22/fin-assistant/
    Contact: http://gerard.demelo.org
    linkedin.com/in/gdemelo
