yuyu4Tech

https://speakerdeck.com/yuyu4tech
2026-04-13 13:51:19 -0400

[Paper Introduction] DINOv3: Self-supervised Learning for Vision at Unprecedented Scale
DINOv3 is Meta AI's latest vision foundation model that pushes self-supervised learning to an unprecedented scale. This talk introduces the key ideas behind DINOv3, including:
- Large-scale data curation (17B images)
- Self-supervised pre-training with DINO and iBOT objectives
- Gram Anchoring for dense feature preservation
- High-resolution adaptation up to 4K+ inference
- Efficient multi-student distillation
We will explore how these innovations enable DINOv3 to achieve state-of-the-art performance across a broad range of computer vision tasks while improving scalability, robustness, and deployment efficiency.
Tue, 02 Jun 2026 00:00:00 -0400
https://speakerdeck.com/yuyu4tech/lun-wen-shao-jie-dinov3-self-supervised-learning-for-vision-at-unprecedented-scale

[Paper Introduction] DINOv2: Seeing Without Supervision
A deep dive into Meta AI's self-supervised vision foundation model, exploring how DINOv2 learns robust visual features from 142M curated images without any labels, and why it rivals weakly-supervised methods across classification, segmentation, depth estimation, and beyond.
Mon, 13 Apr 2026 00:00:00 -0400
https://speakerdeck.com/yuyu4tech/lun-wen-shao-jie-dinov2-seeing-without-supervision