LINE Computer Vision Research: As Is and To Be

Minoru Etoh
LINE / AI Company / Technical Advisory
Yoshihisa Ijiri
LINE / Computer Vision Lab Team / Manager
Kent Fujiwara
LINE / Computer Vision Lab Team / Research Scientist
Yamato Okamoto
LINE / Computer Vision Lab Team / AI Researcher

https://linedevday.linecorp.com/2021/ja/sessions/52
https://linedevday.linecorp.com/2021/en/sessions/52
https://linedevday.linecorp.com/2021/ko/sessions/52

LINE DEVDAY 2021

November 10, 2021

Transcript

  1. Introduction of the speakers
    who pioneer LINE CV research

  2. Agenda
    - Present LINE CV research
    - Future: 2D x 3D x Multimodal

  3. Why computer vision?
    Visual information is crucial to assist humans
    This requires machines to see, understand, and judge

  4. CV applications in LINE
    LINE CLOVA Face
    Highly accurate face recognition
    • eKYC (ID verification)
    • Face gate
    LINE CLOVA Vision
    Object recognition / image retrieval
    • “SHOPPING LENS” in “LINE shopping”
    LINE CLOVA OCR
    World top-level OCR
    • Form/invoice/receipt reader
    • Digitalization for DX

  5. Topics of this session
    Establishment so far of LINE CV
    How do we develop a strong team?
    July 2021
    LINE “CVL” START

  6. Recent boom in CV research
    One indicator is the number of papers submitted to CVPR,
    one of the top conferences in the field…
    [Chart: horizontal axis 2010 to 2021, vertical axis 0 to 8000]

  7. Step back more in time…
    Etoh-san was active a decade ago, though… :D
    [Chart: horizontal axis 1996 to 2021, vertical axis 0 to 8000]

  8. Recent CV compared to other fields
    Publications ranked by h-index:
    Nature
    The New England Journal of Medicine
    Science
    IEEE/CVF Conference on Computer Vision and Pattern Recognition
    The Lancet
    Advanced Materials
    Cell
    Nature Communications
    Chemical Reviews
    International Conference on Learning Representations
    JAMA
    Neural Information Processing Systems
    Proceedings of the National Academy of Sciences
    Journal of the American Chemical Society
    Angewandte Chemie
    Chemical Society Reviews
    Nucleic Acids Research
    Renewable and Sustainable Energy Reviews
    Journal of Clinical Oncology
    Physical Review Letters
    Advanced Energy Materials
    Nature Medicine
    International Conference on Machine Learning
    Energy & Environmental Science
    ACS Nano
    Scientific Reports
    European Conference on Computer Vision
    The Lancet Oncology
    Advanced Functional Materials
    PLoS ONE
    IEEE/CVF International Conference on Computer Vision
    Nature Genetics
    Journal of Cleaner Production
    Nature Materials
    Science of The Total Environment
    Circulation
    BMJ
    Journal of the American College of Cardiology
    Applied Catalysis B: Environmental
    Science Advances

  9. The latest top conference: ICCV 2021
    Out of 6236 valid submissions,
    1617 papers accepted
    Only 210 papers accepted for oral presentation
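
As a quick worked calculation, the rates implied by the counts on this slide are, rounded to one decimal place:

$$
\frac{1617}{6236} \approx 25.9\%, \qquad \frac{210}{6236} \approx 3.4\%
$$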

  10. Our Research at ICCV 2021
    [Figure panels: Ground-truth, GSLR (ours), SLR, Nearest neighbor]
    Li et al., Generalized Shuffled Linear Regression
    Li et al., A Closer Look at Rotation-invariant Deep Point Cloud Analysis

  11. 3D point cloud processing
    3D measurements are typically given as point clouds, but processing them is always challenging.
    [Comparison of Image and Point Cloud by Coordinate, Order, and Scale]

  12. 3D point cloud processing
    Matching between point sets is important for analysis. But…

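  13. • Given point sets A and b, solve for the transformation x
    Linear Regression
    $$
    \begin{pmatrix}
    a_{11} & a_{12} & \cdots & a_{1n} \\
    a_{21} & a_{22} & \cdots & a_{2n} \\
    \vdots & \vdots & \ddots & \vdots \\
    a_{m1} & a_{m2} & \cdots & a_{mn}
    \end{pmatrix}
    x =
    \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix},
    \qquad
    \min_{x} \lVert A x - b \rVert_2^2
    $$
    Needs proper order!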
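
To make the "proper order" requirement concrete, here is a minimal pure-Python sketch of ordinary least squares. It is an illustration under stated assumptions, not code from the talk: the helper names `solve_least_squares` and `squared_residual`, the two-column matrix `A`, and the sample values are all hypothetical. The same values of `b` fit perfectly when each entry is paired with the right row of `A`, and poorly once they are shuffled.

```python
def solve_least_squares(A, b):
    """Return x minimizing ||A x - b||_2^2 for an m-by-2 matrix A (normal equations)."""
    # Entries of the symmetric 2x2 matrix A^T A.
    s11 = sum(row[0] * row[0] for row in A)
    s12 = sum(row[0] * row[1] for row in A)
    s22 = sum(row[1] * row[1] for row in A)
    # Entries of the 2-vector A^T b; zip pairs row i of A with b[i].
    t1 = sum(row[0] * bi for row, bi in zip(A, b))
    t2 = sum(row[1] * bi for row, bi in zip(A, b))
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("A^T A is singular; the columns of A are linearly dependent")
    # Cramer's rule for the 2x2 system (A^T A) x = A^T b.
    return [(t1 * s22 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det]


def squared_residual(A, x, b):
    """Compute ||A x - b||_2^2."""
    return sum((row[0] * x[0] + row[1] * x[1] - bi) ** 2 for row, bi in zip(A, b))


# Hypothetical example data: b is generated exactly from x_true.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
x_true = [2.0, 3.0]
b_ordered = [row[0] * x_true[0] + row[1] * x_true[1] for row in A]
# Same values as b_ordered, but no longer aligned with the rows of A.
b_shuffled = [b_ordered[3], b_ordered[0], b_ordered[2], b_ordered[1]]

x_ordered = solve_least_squares(A, b_ordered)
x_shuffled = solve_least_squares(A, b_shuffled)
print(x_ordered, squared_residual(A, x_ordered, b_ordered))
print(x_shuffled, squared_residual(A, x_shuffled, b_shuffled))
```

With correct ordering the fit recovers `x_true` with zero residual; with the shuffled vector the estimate drifts and the residual is large, which is the failure mode the slide points at.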

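  14. • Given point sets A and b, solve for the transformation x
    Shuffled Linear Regression
    $$
    \begin{pmatrix}
    a_{11} & a_{12} & \cdots & a_{1n} \\
    a_{21} & a_{22} & \cdots & a_{2n} \\
    \vdots & \vdots & \ddots & \vdots \\
    a_{m1} & a_{m2} & \cdots & a_{mn}
    \end{pmatrix}
    x =
    \begin{pmatrix} b_m \\ \vdots \\ b_2 \\ b_1 \end{pmatrix},
    \qquad
    \min_{x, P} \lVert A x - P b \rVert_2^2
    $$
    $$
    P_{ij} \in \{0, 1\}, \qquad \sum_{i} P_{ij} = 1, \qquad \sum_{j} P_{ij} = 1
    $$
    Needs same number of points!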
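
The joint search over x and the permutation P can be sketched by brute force for a tiny problem. This is only an illustration: it assumes a single-column A (so x is a scalar), tries every permutation (which is feasible only for a handful of points), and uses the hypothetical helper names `fit_scalar` and `shuffled_fit`. It is not the solver used in the paper.

```python
from itertools import permutations


def fit_scalar(a, b):
    """Least-squares fit of a scalar x for a[i] * x = b[i]; returns (x, squared residual)."""
    x = sum(ai * bi for ai, bi in zip(a, b)) / sum(ai * ai for ai in a)
    return x, sum((ai * x - bi) ** 2 for ai, bi in zip(a, b))


def shuffled_fit(a, b):
    """Try every permutation of b and keep the one with the smallest residual."""
    if len(a) != len(b):
        raise ValueError("shuffled linear regression needs the same number of points")
    best = None
    for order in permutations(range(len(b))):
        # order[i] is the index of the entry of b matched to row i: (P b)[i] = b[order[i]].
        x, res = fit_scalar(a, [b[j] for j in order])
        if best is None or res < best[1]:
            best = (x, res, order)
    return best


# Hypothetical data: b equals 2 * a, with its entries shuffled.
a = [1.0, 2.0, 3.0, 4.0]
b_shuffled = [6.0, 2.0, 8.0, 4.0]
x, res, order = shuffled_fit(a, b_shuffled)
print(x, res, order)
```

The exhaustive loop visits m! permutations, so it only serves to show what the objective asks for; the length check mirrors the slide's remark that both sets must have the same number of points.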

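  15. • Given point sets A and b, solve for the transformation x
    Generalized Shuffled Linear Regression
    $$
    \begin{pmatrix}
    a_{11} & a_{12} & \cdots & a_{1n} \\
    a_{21} & a_{22} & \cdots & a_{2n} \\
    \vdots & \vdots & \ddots & \vdots \\
    a_{m1} & a_{m2} & \cdots & a_{mn}
    \end{pmatrix}
    x =
    \begin{pmatrix} b_m \\ \vdots \\ b_2 \\ b_1 \end{pmatrix},
    \qquad
    \min_{x, P} \lVert A x - P b \rVert_2^2
    $$
    $$
    P_{ij} \in \{0, 1\}, \qquad \sum_{i} P_{ij} \le 1, \qquad \sum_{j} P_{ij} \le 1, \qquad \sum_{i,j} P_{ij} = k
    $$
    Can handle shuffling and outliers!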
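
The relaxed constraints allow some rows and some measurements to stay unmatched. The following brute-force sketch illustrates that idea under explicit assumptions: a single-column A (scalar x), a known number of matched pairs `k`, and a residual computed over the matched pairs only, which is one possible reading of the objective rather than a confirmed detail of the paper. The names `fit_scalar` and `generalized_shuffled_fit` are hypothetical.

```python
from itertools import combinations, permutations


def fit_scalar(a, b):
    """Least-squares fit of a scalar x for a[i] * x = b[i]; returns (x, squared residual)."""
    den = sum(ai * ai for ai in a)
    if den == 0.0:
        raise ValueError("selected entries of a are all zero")
    x = sum(ai * bi for ai, bi in zip(a, b)) / den
    return x, sum((ai * x - bi) ** 2 for ai, bi in zip(a, b))


def generalized_shuffled_fit(a, b, k):
    """Search all ways to match k rows of a to k distinct entries of b."""
    best = None
    for rows in combinations(range(len(a)), k):
        for cols in permutations(range(len(b)), k):
            # Pair rows[t] of a with cols[t] of b; everything else is left unmatched.
            x, res = fit_scalar([a[i] for i in rows], [b[j] for j in cols])
            if best is None or res < best[1]:
                best = (x, res, list(zip(rows, cols)))
    return best


# Hypothetical data: the true relation is b = 2 * a, but the measurement for
# a = 4.0 is missing and 50.0 is an outlier with no counterpart in a.
a = [1.0, 2.0, 3.0, 4.0]
b_observed = [6.0, 50.0, 2.0, 4.0]
x, res, pairs = generalized_shuffled_fit(a, b_observed, k=3)
print(x, res, pairs)
```

In this toy case the search matches the three consistent pairs and leaves both the outlier and the unmatched row out. The cost grows combinatorially, so the sketch is useful only for building intuition.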

  16. Generalized Shuffled Linear Regression

  17. Achieving Rotational Invariance
    model
    chair, bed, …, table
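
The transcript does not describe how the model achieves rotational invariance. The sketch below only illustrates what the property means, using sorted pairwise distances as a hand-crafted feature. That feature, the helper names `rotate_z` and `pairwise_distance_feature`, and the sample point cloud are assumptions chosen for illustration, not the paper's method.

```python
import math
from itertools import combinations


def rotate_z(points, angle):
    """Rotate 3D points about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]


def pairwise_distance_feature(points):
    """Sorted pairwise distances: unchanged by rotation and by point order."""
    return sorted(math.dist(p, q) for p, q in combinations(points, 2))


# Hypothetical point cloud and a 40-degree rotation of it.
cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
rotated = rotate_z(cloud, math.radians(40.0))

# Raw coordinates change under rotation...
coords_match = all(
    math.isclose(u, v, abs_tol=1e-9)
    for p, q in zip(cloud, rotated)
    for u, v in zip(p, q)
)
# ...while the distance-based feature stays the same.
features_match = all(
    math.isclose(u, v, abs_tol=1e-9)
    for u, v in zip(pairwise_distance_feature(cloud), pairwise_distance_feature(rotated))
)
print(coords_match, features_match)
```

A classifier fed raw coordinates would see two different inputs here, whereas one fed the invariant feature would see the same input for both poses.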

  18. A Closer Look at Rotation-invariant Deep Point Cloud Analysis

  19. Application
    Example: pedestrian analysis from multiple views

  20. Agenda
    - Present LINE CV research
    - Future: 2D x 3D x Multimodal

  21. 3D and Beyond… 4D models!!
    3D + time (4D) = Natural 3D motion generation
    then connection to natural language

  22. Document Understanding
    Semantic Information
    S-Overtime 50%
    (count) 1
    (unitprice) 20,000
    (price) 20,000
    PBI 1,818
    Subtotal 18,181
    Total 20,000
    Cash 100,000
    Change 80,000
    Tax Included 10%
    Image
    Reference
    - Spatial Dependency Parsing for Semi-Structured Document Information Extraction (Clova AI, NAVER Corp)
      https://arxiv.org/abs/2005.00642

  23. 2D x Multimodal
    Key Information Extraction

  24. Beyond current OCR…
    Character type
    Terminology
    Grammar
    Form/layout
    Topics/style
    Document type
    Domain knowledge
    Purpose, task
    Customer-specific knowledge
    Common knowledge
    Visual patterns
    Context (TPO)
    level off only with visual patterns
    Combination with NLP becomes crucial
    Character
    Language
    Word

  25. Beyond OCR, toward higher-level job automation
    Improving the quality and efficiency of processes with compiled digital data, not just digitalization
    Contract form, etc.
    Feed back legal review results
    Contracts in the past

  26. Empowerment with dark data!
    Supports fact-based decisions by making the most of data left “sleeping” because of its diverse modalities, including not only characters but also images, tables, and graphs.
    insight!

  27. Our challenge
    Innovation by mixing LINE AI assets, especially NLP, voice/speech, and CV
    Mixed LINE AI (MiLAI)
    Multimodal input/output
