of Life project: obtained 6.6 million images covering 440,000 species
• Also uses 2.7 million images covering 10,000 species from iNat21, plus the insect dataset BIOSCAN-1M

TreeOfLife-10M dataset
Encyclopedia of Life project
Types of organisms included in TreeOfLife-10M

[24] Alex Fang, Gabriel Ilharco, Mitchell Wortsman, Yuhao Wan, Vaishaal Shankar, Achal Dave, and Ludwig Schmidt. Data determines distributional robustness in contrastive language image pre-training (CLIP). In International Conference on Machine Learning, pages 6216–6234, 2022.
[26] Samir Yitzhak Gadre, Gabriel Ilharco, Alex Fang, Jonathan Hayase, Georgios Smyrnis, Thao Nguyen, Ryan Marten, Mitchell Wortsman, Dhruba Ghosh, Jieyu Zhang, et al. DataComp: In search of the next generation of multimodal datasets. arXiv preprint arXiv:2304.14108, 2023.
[57] Thao Nguyen, Gabriel Ilharco, Mitchell Wortsman, Sewoong Oh, and Ludwig Schmidt. Quality not quantity: On the interaction between dataset design and robustness of CLIP. In Advances in Neural Information Processing Systems, pages 21455–21469, 2022.