Computer Vision: Current Conditions and Possibilities for Service Handling

Self-Introductions and CV-Related Efforts by Three Companies

岡本大和 Yamato Okamoto (Moderator/Panelist), LINE Computer Vision Lab Team, AI Researcher
Yamato Okamoto joined LINE Corporation in 2021 as a founding member of the newly established Computer Vision Lab. Yamato studied image recognition at Kyoto University and worked in new business creation before taking on dual roles in technology and business. Yamato is in charge of R&D for CLOVA OCR, which recognizes text in document images. Yamato's motto is "If you want a different result, it's crazy not to change your ways." Yamato works hard every day to make the research center a place people aspire to work at, and also enjoys playing rugby.

CLOVA OCR https://clova.line.me/clova-ocr/

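CLOVA OCR converted over 200 million book images into text data. CLOVA OCR was adopted to convert the digitized materials held by the National Diet Library (2.47 million items, more than 223 million pages) into full-text data. https://linecorp.com/ja/pr/news/ja/2021/3825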

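CLOVA OCR not only recognizes characters but also understands intent. A version of CLOVA OCR specialized for receipts and invoices is available: it needs no format configuration in advance and also classifies the extracted fields.
https://blog.clova.line.me/20200923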

Future Work: Keep DX Safe
Document manipulation detection
[Figure: original image, manipulated image, and manipulation detection result]

Future Work: Generative Model
Generate fonts in your own handwriting style

Future Work: Generative Model
Generate content from a single input word
[Figure: input word 「オクラ」 (okra), generative model, generated content]

土井賢治 Kenji Doi (Panelist), Yahoo! JAPAN Science Group, Technology Group, Machine Learning Engineer
Kenji Doi develops image recognition technology in collaboration with various in-house services, such as similar image search and OCR, and applies the new methods and technologies that are released daily. As a personal project, Kenji studies discriminative and generative modeling of Ramen Jiro.

image retrieval https://about.yahoo.co.jp/pr/release/2019/07/03a/

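Category Estimation of Ad Images
Image: CNN (e.g., ResNet) produces the image feature.
OCR text data: a language model (e.g., BERT) produces the text feature.
Example OCR text: ["Instant diagnosis with one click", "Can even I get a loan?", "XX Bank card loan", …]
Image feature and text feature: an FC layer estimates the category (e.g., "Consumer loan?").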
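The slide above names an image feature, a text feature, and an FC layer, but does not say how the two features are combined. The sketch below assumes simple concatenation followed by one fully connected layer and a softmax. The function names, feature sizes, weights, and the two-category list are all hypothetical placeholders; in practice the features would come from models such as ResNet and BERT, and the weights would be learned.

```python
import math


def fully_connected(features, weights, bias):
    """One FC layer: logits[j] = sum_i features[i] * weights[j][i] + bias[j]."""
    return [
        sum(f * w for f, w in zip(features, row)) + b
        for row, b in zip(weights, bias)
    ]


def softmax(logits):
    """Convert logits to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_ad(image_feature, text_feature, weights, bias, categories):
    """Concatenate both features (an assumption) and score each category."""
    fused = list(image_feature) + list(text_feature)
    probs = softmax(fully_connected(fused, weights, bias))
    best = max(range(len(categories)), key=probs.__getitem__)
    return categories[best], probs


# Placeholder inputs: real features would be CNN and language-model outputs.
image_feature = [1.0, 0.0]
text_feature = [0.0, 1.0]
categories = ["consumer loan", "other"]
weights = [
    [1.0, 0.0, 0.0, 1.0],  # hand-picked weights for "consumer loan"
    [0.0, 0.0, 0.0, 0.0],  # hand-picked weights for "other"
]
bias = [0.0, 0.0]

label, probs = classify_ad(image_feature, text_feature, weights, bias, categories)
print(label, probs)
```

Concatenation is only one way to fuse modalities; it is used here because it is the shortest to show, not because the slide confirms it.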

Image Color Conversion (Diffusion Model)
[Figure: original image and color-manipulated images]

Manga retrieval

光瀬智哉 Tomoya Kose (Panelist), ZOZO ML / Data Department, Data Science Section 2, Machine Learning Engineer
Tomoya Kose completed a master's course at the Nara Institute of Science and Technology in 2014, studying natural language processing. Tomoya joined ZOZO in 2018 through a corporate merger and has worked in the computer vision field since. Tomoya develops the models used for similar image search on ZOZOTOWN, maintains the machine learning development workflows, and manages the machine learning engineering team.

Similar Item Retrieval (Image Retrieval)

Similar Item Retrieval (Image Retrieval): Available Data from WEAR
Positive pair: extract the item area with object detection, then get the item image from ZOZOTOWN that corresponds to the item worn.
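The positive-pair construction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not ZOZO's pipeline: the `Detection` type, the `item_id` link between a worn item and a ZOZOTOWN product, the `catalog` dictionary, and the list-of-rows image representation are all hypothetical, and the object detector itself is assumed to have already produced the boxes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    item_id: str  # hypothetical ID linking the worn item to a catalog product
    box: tuple    # (left, top, right, bottom) in pixels, right/bottom exclusive


def crop(image, box):
    """Crop an image stored as a list of pixel rows to the given box."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def build_positive_pairs(outfit_image, detections, catalog):
    """Pair each detected item crop with its catalog image, when one exists."""
    pairs = []
    for det in detections:
        product_image = catalog.get(det.item_id)
        if product_image is None:
            continue  # no matching catalog image, so no positive pair
        pairs.append((crop(outfit_image, det.box), product_image))
    return pairs


# Toy 4x4 "outfit image" whose pixel values are just running integers.
outfit_image = [[r * 4 + c for c in range(4)] for r in range(4)]
detections = [
    Detection(item_id="shirt-001", box=(1, 1, 3, 3)),
    Detection(item_id="unknown-999", box=(0, 0, 2, 2)),
]
catalog = {"shirt-001": [[100, 101], [102, 103]]}  # placeholder product image

pairs = build_positive_pairs(outfit_image, detections, catalog)
print(pairs)
```

Pairs built this way could serve as positives when training an embedding model for retrieval; how negatives are chosen is not covered by the slide and is left out here.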

Item Mapping from Outfit to Closet (WEAR)

Outfit Retrieval by Hairstyle

Discussion

Topic 1: What services were difficult to implement?

Topic 2: Isn't it difficult to adapt to what users need?

Topic 3: How do you collaborate with internal stakeholders when developing your services?

Topic 4: How do you approach multi-modality these days?

Q&A

Thank you

Tech-Verse2022