
[IROS23] Prototypical Contrastive Transfer Learning for Multimodal Language Understanding

Semantic Machine Intelligence Lab., Keio Univ.

Transcript

  1. Motivation: Mitigating labor-intensive data collection by domain transfer

    Collecting real-world data is time-consuming (e.g., multiple environments, arranging physical objects). Transfer: Simulation → Real-world (x8).
  2. Task: Multimodal Language Understanding for Fetching Instruction

    Binary classification (Pos./Neg.) for each object, with transfer from simulation to the real world. Example instructions: “Look in the left wicker vase next to the potted plant” and “Grasp the glass in the sink”.
  3. Contribution: Prototypical Contrastive Transfer Learning (PCTL) for multimodal language understanding

    - Introduce domain transfer to multimodal language understanding
    - Extend prototypical contrastive loss for classification problems in two domains

    Related work: PCL [Li+, ICLR’21], MCDDA [Saito+, CVPR’18]
  4. Contribution: Prototypical Contrastive Transfer Learning (PCTL) for multimodal language understanding

    Related work: MCDDA [Saito+, CVPR’18] performs domain transfer for a single-modality (vision) task.
  5. Contribution: Prototypical Contrastive Transfer Learning (PCTL) for multimodal language understanding

    PCTL performs domain transfer based on contrastive learning, inspired by PCL [Li+, ICLR’21].
  6. PCTL: Alleviate the domain gap by contrastive learning between two domains

    [Figure: feature vectors and clusters’ centroids for the real-world and simulation domains, linked by a contrastive objective. Example instructions: “Clean the top-left picture above TV” and “Pick up the glass in the sink”.]
  7. Qualitative results: Correct prediction by PCTL

    - Real-world, from REVERIE [Qi+, CVPR’20] (#samples: 10342): “Go down the stairs to the lower balcony area and turn off the lamp on the dresser.”
    - Simulation, from ALFRED [Shridhar+, CVPR’20] (#samples: 34286): “Pick up the tissue box on the desk”

    Transfer: Simulation → Real-world
  8. Quantitative results: Outperformed Target Domain Only

    | Methods | Train | Test | Acc. [%] |
    | Target Domain Only | Real | Real | 73.0±1.87 |
    | MCDDA+ [Saito, CVPR’18] | Sim+Real | Real | 74.9±3.94 |
    | PCTL (Ours) | Sim+Real | Real | 78.1±2.49 |

    Improved by domain transfer: +5.1 points over Target Domain Only.
  9. Quantitative results: Outperformed MCDDA+

    PCTL (Ours, 78.1±2.49) outperformed the existing method MCDDA+ [Saito, CVPR’18] (74.9±3.94) by +3.2 points in accuracy [%] on the real-world test set; Target Domain Only reached 73.0±1.87.
  10. Summary: Prototypical Contrastive Transfer Learning (PCTL)

    Motivation: Mitigating labor-intensive data collection by domain transfer

    Novelty:
    - Introduce domain transfer to multimodal language understanding
    - Extend prototypical contrastive loss for classification problems in two domains

    Result: Outperformed the target-domain-only condition and the existing domain transfer method
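
The slides describe contrastive learning between feature vectors and clusters' centroids across two domains. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes one prototype per class label (the mean of that class's simulation features) rather than clusters found by a clustering algorithm, and it uses an InfoNCE-style loss with a hypothetical `temperature` parameter. All function names and the toy 2-D features are illustrative.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (left unchanged if its norm is zero)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def class_centroids(features, labels):
    """Assumption: one prototype per label, the normalized mean feature."""
    sums, counts = {}, {}
    for f, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(f))
        for i, x in enumerate(f):
            acc[i] += x
        counts[y] = counts.get(y, 0) + 1
    return {y: l2_normalize([x / counts[y] for x in acc])
            for y, acc in sums.items()}


def cross_domain_proto_loss(features, labels, prototypes, temperature=0.1):
    """InfoNCE-style loss pulling each feature toward the other domain's
    prototype with the same label and away from the remaining prototypes."""
    total = 0.0
    for f, y in zip(features, labels):
        f = l2_normalize(f)
        logits = {c: dot(f, p) / temperature for c, p in prototypes.items()}
        m = max(logits.values())  # subtract the max for numerical stability
        log_denom = m + math.log(sum(math.exp(v - m) for v in logits.values()))
        total += log_denom - logits[y]
    return total / len(features)


# Toy features: label 1 = positive (target object), 0 = negative.
sim_feats = [[1.0, 0.1], [0.9, 0.0], [0.0, 1.0], [0.1, 0.9]]
sim_labels = [1, 1, 0, 0]
real_feats = [[0.8, 0.2], [0.2, 0.8]]
real_labels = [1, 0]

protos = class_centroids(sim_feats, sim_labels)
aligned = cross_domain_proto_loss(real_feats, real_labels, protos)
misaligned = cross_domain_proto_loss(real_feats, [0, 1], protos)
print(aligned, misaligned)
```

In this toy setup the loss is small when real-world features lie near the simulation prototype of their own class and large when the labels are swapped, which is the behavior a cross-domain contrastive objective is meant to reward.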