Slides used at an internal AI technology sharing session.
They introduce CLIP models specialized for various domains.
Links to the references cited in the slides:
[1] https://arxiv.org/abs/2302.00275
[2] https://arxiv.org/abs/2103.00020
[3] https://arxiv.org/abs/2306.11029
[4] https://arxiv.org/abs/2501.02461
[5] https://arxiv.org/abs/2311.17179
[6] https://arxiv.org/abs/2210.10163
[7] https://arxiv.org/abs/2412.10372
[8] https://arxiv.org/abs/2501.15579
[9] https://arxiv.org/abs/2403.09948
[10] https://imageomics.github.io/bioclip/
[11] https://arxiv.org/abs/2106.13043
[12] https://openaccess.thecvf.com/content/ICCV2023/papers/Zhai_Sigmoid_Loss_for_Language_Image_Pre-Training_ICCV_2023_paper.pdf
[13] https://research.google.com/audioset/
[14] https://arxiv.org/abs/2503.19311
[15] https://graft.cs.cornell.edu/static/pdfs/graft_paper.pdf
[16] https://github.com/visipedia/inat_comp/tree/master/2021
[17] https://biodiversitygenomics.net/projects/1m-insects/