LINE's 3D Recognition Technology and Future Outlook. Yoshihisa Ijiri (LINE Corporation). Presentation material from MLAI-TALK #1. https://line.connpass.com/event/231314/
LINE's 3D Recognition Technology and Future Outlook
LINE CVL
Yoshihisa IJIRI
> Specialty: computer vision, robotics
> Outdoor: mountain climbing, skiing, [illegible] tours, photography, motorcycle trials, ...
> Indoor: piano, history, [illegible], coffee roasting, whisky, craft beer, ...
> Joined OMRON
> Face detection and recognition, applied to digital cameras, mobile phones, and surveillance cameras
> Object detection, [illegible], and OCR for factory automation (FA)
> Promoted research on soft robots that achieve supple control
> Launched the research venture OMRON SINIC X
> Joined LINE
> Launched the Computer Vision Lab
LINE Corporation, AI Company, AI Development Office; manager of the Computer Vision Lab
Yoshihisa Ijiri, PhD

The overheating of AI technology
[Figure: trend in the number of CVPR paper submissions, 1996 to 2021. Source: IEEE digital library, proceedings of each year; compiled independently from these.]

The overheating of computer vision technology
Top publications by h-index (source: Google Scholar), in the order listed on the slide:
Nature; The New England Journal of Medicine; Science; IEEE/CVF Conference on Computer Vision and Pattern Recognition; The Lancet; Advanced Materials; Cell; Nature Communications; Chemical Reviews; International Conference on Learning Representations; JAMA; Neural Information Processing Systems; Proceedings of the National Academy of Sciences; Journal of the American Chemical Society; Angewandte Chemie; Chemical Society Reviews; Nucleic Acids Research; Renewable and Sustainable Energy Reviews; Journal of Clinical Oncology; Physical Review Letters
Advanced Energy Materials; Nature Medicine; International Conference on Machine Learning; Energy & Environmental Science; ACS Nano; Scientific Reports; European Conference on Computer Vision; The Lancet Oncology; Advanced Functional Materials; PLoS ONE; IEEE/CVF International Conference on Computer Vision; Nature Genetics; Journal of Cleaner Production; Nature Materials; Science of The Total Environment; Circulation; BMJ; Journal of the American College of Cardiology; Applied Catalysis B: Environmental; Science Advances
LINE's master Fujiwara kept getting papers accepted at CVPR and ICCV (which is why I joined too!)
Why point clouds?
The difficulty of point clouds
Images: ✔ coordinate system, ✔ order, ✔ scale
Point clouds: ? coordinate system, ? order, ? scale
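Main analysis methods
To overcome these challenges, the following main approaches exist:
• Point-based: Qi et al. [CVPR 2017]
• Alternative representation: Sinha et al. [ECCV 2016]
• Voxel based: Wu et al. [CVPR 2015]
• Image-based: Kanezaki et al. [CVPR 2018]
Drawbacks noted on the slide: size; requires a mesh; requires preprocessing.
Why not represent the shape as a function of the coordinates?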
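The "order" difficulty can be made concrete with a small sketch. It is not the method of any paper cited on the slide: `point_features` is a hypothetical stand-in for a learned per-point network, and the only point illustrated is that pooling with a symmetric function such as `max` gives the same descriptor regardless of how the points are stored.

```python
import random


def point_features(point):
    """Toy per-point feature map (hypothetical; stands in for a shared learned network)."""
    x, y, z = point
    return (x + y + z, x * y, abs(z), x * x + y * y + z * z)


def global_descriptor(points):
    """Max-pool every feature over all points; max is symmetric, so point order is irrelevant."""
    features = [point_features(p) for p in points]
    return tuple(max(column) for column in zip(*features))


cloud = [(0.0, 0.0, 1.0), (1.0, 0.5, -2.0), (-1.5, 2.0, 0.25), (3.0, -1.0, 0.0)]

# Same set of points, stored in a different order.
shuffled = cloud[:]
random.Random(0).shuffle(shuffled)

print(global_descriptor(cloud) == global_descriptor(shuffled))  # True
```

The descriptor still changes if the cloud is rotated or rescaled, which is why coordinate system and scale are listed as separate difficulties.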
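• Uses the network weights as the feature
• Preprocessing plus a specialized network make the representation invariant to coordinates and scale
https://github.com/kentfuji/NeuralEmbedding
Neural Implicit Embedding for Point Cloud Analysis [Fujiwara+, CVPR 2020]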
The difficulty of point clouds
Images: ✔ coordinate system, ✔ order, ✔ scale
Point clouds: ? coordinate system, ✔ order, ✔ scale
Achieving rotation invariance
[Figure: a model assigns input shapes to labels: chair, bed, …, table, …]
A Closer Look at Rotation-Invariant Deep Point Cloud Analysis[Li and Fujiwara+, ICCV2021]
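• Identify which combinations of principal component analysis (PCA) axes represent rotations
• Propose a Selector module that extracts the optimal pose
[Figure: Selector pipeline with MLP, pooling, softmax, and the major network; dimension labels 24, 3, N.]
A Closer Look at Rotation-Invariant Deep Point Cloud Analysis [Li and Fujiwara+, ICCV2021]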
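One way to read the first bullet: PCA fixes the three axes only up to their order and sign, and only some of those signed axis assignments are proper rotations (determinant +1) rather than reflections. The sketch below enumerates them. It is an illustration under that assumption, not the paper's code; the function names are hypothetical, and linking the result to the "24" in the slide's diagram is also an assumption.

```python
from itertools import permutations, product


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def candidate_rotations():
    """All signed axis permutations that are proper rotations (det = +1)."""
    rotations = []
    for perm in permutations(range(3)):           # which axis each row picks
        for signs in product((1, -1), repeat=3):  # direction of each axis
            m = [[0, 0, 0] for _ in range(3)]
            for row, (col, sign) in enumerate(zip(perm, signs)):
                m[row][col] = sign
            if det3(m) == 1:                      # discard reflections (det = -1)
                rotations.append(m)
    return rotations


def rotate_point(rotation, point):
    """Apply a 3x3 rotation matrix to a single 3D point."""
    return tuple(sum(r * p for r, p in zip(row, point)) for row in rotation)


rotations = candidate_rotations()
print(len(rotations))  # 24
```

Of the 48 signed permutations (6 orders times 8 sign patterns), half are reflections, which leaves 24 candidate poses for a selector to choose from.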
The difficulty of point clouds
Images: ✔ coordinate system, ✔ order, ✔ scale
Point clouds: ✔ coordinate system, ✔ order, ✔ scale
When noise is present, or when the correspondences are unknown...
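Linear Regression
Given point sets A and b, find the variable x. Ordering is required!
$$
\begin{pmatrix}
A_{11} & A_{12} & \cdots & A_{1n} \\
A_{21} & A_{22} & \cdots & A_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
A_{m1} & A_{m2} & \cdots & A_{mn}
\end{pmatrix}
\mathbf{x} =
\begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix},
\qquad
\min_{\mathbf{x}} \lVert \mathbf{A}\mathbf{x} - \mathbf{b} \rVert_2^2
$$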
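As a reminder of why ordering matters here, the standard closed-form solution of this least-squares problem can be written out. This is textbook material rather than something stated on the slide, and it assumes A has full column rank.

```latex
\mathbf{x}^{\ast}
  = \arg\min_{\mathbf{x}} \lVert \mathbf{A}\mathbf{x} - \mathbf{b} \rVert_2^2
  = (\mathbf{A}^{\top}\mathbf{A})^{-1}\mathbf{A}^{\top}\mathbf{b}
% If the entries of b are shuffled by a permutation matrix P, the solution becomes
% (A^T A)^{-1} A^T P b, which in general differs from x*.
```

Because the solution depends on the product of A transposed with b, pairing row i of A with the wrong entry of b changes the estimate. That is the sense in which plain linear regression needs the ordering.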
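Shuffled Linear Regression [Ashwin+, 2017]
Given point sets A and b, find the permutation matrix P and the variable x. The same number of points is required!
$$
\begin{pmatrix}
A_{11} & A_{12} & \cdots & A_{1n} \\
A_{21} & A_{22} & \cdots & A_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
A_{m1} & A_{m2} & \cdots & A_{mn}
\end{pmatrix}
\mathbf{x} =
\begin{pmatrix} \vdots \\ b_m \\ b_2 \\ b_1 \end{pmatrix},
\qquad
\min_{\mathbf{x},\,\mathbf{P}} \lVert \mathbf{A}\mathbf{x} - \mathbf{P}\mathbf{b} \rVert_2^2
$$
$$
\mathbf{P}_{ij} \in \{0, 1\}, \qquad \sum_{i} \mathbf{P}_{ij} = 1, \qquad \sum_{j} \mathbf{P}_{ij} = 1
$$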
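For tiny inputs, the objective on this slide (minimizing ‖Ax − Pb‖² over both x and the permutation P) can be solved by brute force, which makes the formulation easy to check. The sketch below is not the algorithm of the cited work. It assumes a one-dimensional line fit, so each row of A is (x_i, 1) and x holds a slope and an intercept, and the function names are hypothetical.

```python
from itertools import permutations


def fit_line(xs, ys):
    """Ordinary least squares for y ≈ slope * x + intercept; returns (slope, intercept, sse)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, sse


def shuffled_regression(xs, shuffled_ys):
    """Try every permutation of the observations and keep the best least-squares fit.

    Returns (slope, intercept, sse, order), where order[i] is the index in
    shuffled_ys that is paired with xs[i]. Cost grows as n!, so this is for tiny n only.
    """
    best = None
    for order in permutations(range(len(shuffled_ys))):
        ys = [shuffled_ys[j] for j in order]
        slope, intercept, sse = fit_line(xs, ys)
        if best is None or sse < best[2]:
            best = (slope, intercept, sse, order)
    return best


# Observations of y = 2x + 1 at x = 0, 1, 2, 4, delivered in scrambled order.
slope, intercept, sse, order = shuffled_regression([0, 1, 2, 4], [5, 1, 9, 3])
print(slope, intercept, order)
```

Every row of A must be paired with exactly one entry of b, which is the "same number of points" requirement noted on the slide.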
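Generalized Shuffled Linear Regression [Li and Fujiwara+, ICCV2021]
Given point sets A and b, find the permutation matrix P and the variable x. Both the outliers and the ordering can be identified!
$$
\begin{pmatrix}
A_{11} & A_{12} & \cdots & A_{1n} \\
A_{21} & A_{22} & \cdots & A_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
A_{m1} & A_{m2} & \cdots & A_{mn}
\end{pmatrix}
\mathbf{x} =
\begin{pmatrix} \vdots \\ b_m \\ b_2 \\ b_1 \end{pmatrix},
\qquad
\min_{\mathbf{x},\,\mathbf{P}} \lVert \mathbf{A}\mathbf{x} - \mathbf{P}\mathbf{b} \rVert_2^2
$$
$$
\mathbf{P}_{ij} \in \{0, 1\}, \qquad \sum_{i} \mathbf{P}_{ij} \le 1, \qquad \sum_{j} \mathbf{P}_{ij} \le 1, \qquad \sum_{i,j} \mathbf{P}_{ij} = k
$$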
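The relaxed constraints on this slide (each row and column of P sums to at most one, with exactly k ones in total) amount to choosing k one-to-one matches and leaving everything else unmatched. A brute-force sketch of that reading for tiny inputs follows. It is not the optimization used by Li and Fujiwara et al. It assumes a one-dimensional line fit, scores only the matched pairs, and uses hypothetical function names.

```python
from itertools import combinations, permutations


def fit_line(xs, ys):
    """Ordinary least squares for y ≈ slope * x + intercept; returns (slope, intercept, sse)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, sse


def generalized_shuffled_regression(xs, bs, k):
    """Pick k one-to-one matches between xs and bs that admit the best line fit.

    Returns (slope, intercept, sse, pairs) with pairs = [(index in xs, index in bs), ...].
    Unmatched entries on either side are treated as outliers. k should exceed the
    number of parameters (2 here), otherwise every choice fits perfectly.
    """
    best = None
    for rows in combinations(range(len(xs)), k):        # which xs take part
        sub_xs = [xs[i] for i in rows]
        for cols in permutations(range(len(bs)), k):    # which bs they are paired with
            sub_ys = [bs[j] for j in cols]
            slope, intercept, sse = fit_line(sub_xs, sub_ys)
            if best is None or sse < best[2]:
                best = (slope, intercept, sse, list(zip(rows, cols)))
    return best


# y = 2x + 1 observed at x = 0, 1, 2, 4; x = 7 has no partner and 100 is a spurious value.
slope, intercept, sse, pairs = generalized_shuffled_regression(
    [0, 1, 2, 4, 7], [5, 1, 100, 9, 3], k=4
)
print(slope, intercept, pairs)
```

Unlike the fully shuffled case, the two sides no longer need the same number of points: here x = 7 and the value 100 are both left out of the matching.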
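• A "Shuffled Linear Regression" that can also handle rank-deficient permutation matrices
• Applicable not only to point clouds but also to image feature points and similar data
[Figure: image matching (Source, Target) comparing GSLR (ours), feature matching w/ RANSAC, and SLR; shape matching (Source) comparing FMAP, ICP, BCICP, ZoomOut-100, and GSLR (ours); values shown on the slide: 94.4%, 62.0%, 83.4%, 52.6%, 37.1%.]
Generalized Shuffled Linear Regression [Li and Fujiwara+, ICCV 2021]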
Going forward...
3D + Time = Motion
Motion + Linguistics = Cmd2motion
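LINE CVL's focus technology: CVX technology
[Diagram labels: natural language processing; input; speech; digital text; images; video; RGBDT; language; text; images; figures and tables; speech recognition; CVPR; processing; generation; speech synthesis; CG; text; output; TeX, etc.]
Multimodal processing that handles multimedia input
AI technology: "CV × ○○" technology is becoming important! (Mr. Okamoto named it "CVX technology", meaning multimodal AI technology viewed with CV at the center.)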
Document Understanding (AI-OCR)
[Figure: a receipt image and the semantic information extracted from it: S-Overtime 50%; (count) 1; (unit price) 20,000; (price) 20,000; PBI 1,818; Subtotal 18,181; Total 20,000; Cash 100,000; Change 80,000; Tax Included 10%.]
Spatial Dependency Parsing for Semi-Structured Document Information Extraction [Hwang+, ACL2020]
Layout recognition
Recognizing the layout of the text makes field search possible.
Beyond current AI-OCR…
[Diagram labels: Character type; Terminology; Grammar; Form layout; Topics, style; Document type; Domain knowledge; Purpose, task; Customer specific knowledge; Common knowledge; Visual patterns; Context, TPO; Character; Language; Word]
Level off only with visual patterns. Combination with NLP becomes crucial.
To be introduced by Okamoto-san in a later session!
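STRICTLY CONFIDENTIAL
The world LINE AI Company aims for
"People-friendly AI" removes the hassles hidden in daily life and business, and creates "the new normal of the future".
The AI Company deploys LINE's AI technologies broadly, from consumer offerings to enterprise offerings. Its technologies include natural language processing; recognition of text, images, faces, and speech; and speech synthesis. It handles everything from design to implementation according to the challenges and needs of society and companies, and promotes the adoption of AI in society.
We work every day with the vision of "creating the new normal by bringing a more natural user experience to Life on LINE".
By bringing business and AI, and people and AI, closer together, we create a "new normal" that supports daily work and the lives of the people beyond it, and realize a more convenient society.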
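LINE CLOVA products
Based on LINE's diverse AI component technologies, we offer a wide range of B2B products (some are provided as SaaS).
• CLOVA Chatbot: a service that applies the natural language technology cultivated through LINE Kantan Help and CLOVA to FAQ and customer-service bots.
• CLOVA OCR: a service that applies OCR technology, recognized at international conferences as world-class, to reading application forms, receipts, and similar documents and to automatic data entry.
• CLOVA Speech: uses CLOVA's speech recognition technology to provide transcription of phone calls and video media, automated phone response services, and more.
• CLOVA Voice: uses CLOVA's speech synthesis technology; a service that creates voice models suited to a company's or brand's use is planned.
• CLOVA Text Analytics: text analysis and sentiment analysis technology. Used for search and sentiment analysis on text transcribed by speech recognition.
• CLOVA Vision: object recognition and image recognition technology. Used in LINE Shopping's "SHOPPING LENS".
• CLOVA Face: high-accuracy face recognition technology. Used for eKYC (online identity verification), reception by face authentication, and more.

Services provided by the AI Company

AI Company R&D Vision
[Figure: research themes arranged from Conservative to Disruptive over Time: Interactive virtual experience; Autonomous AI workflow; Digital Me; Me AVATAR; Digital Identity; Better Care; Trustworthy AI; AI Fairness; Explainable AI; Dark Data; Omnipotent AI; Gigantic Language model; Unlabeled Data; Data Marketplace; Generative Intelligence; New Education; Dependable STT; Privacy preserving; Seam; Discriminator]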
Results so far
Basic AI research results recognized at the top conferences of each field:
ICASSP, INTERSPEECH, WASPAA, BigData, ICASSP, EUSIPCO, INTERSPEECH, DCASE, APSIPA, CVPR, TPDP, FL-ICML, LDRC, ICASSP, ICRA, IUI, ICDE, ICCV
Technology under development (Speech)
CLOVA note: real-time recognition of free-form speech makes it possible to transcribe natural conversation!
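Technology under development (Voice)
DNN speech synthesis: speech synthesis whose emotion can be controlled flexibly
COntrollable, High-quality, And expRessIve TTS
[Figure: emotion scale from bright to dark: 😀 😄 🙂 😐 😢 😰 😥]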
Technology under development (NLP)
HyperCLOVA: developing a general-purpose language model with more than 175 billion parameters
Technology under development (CVL)
National Diet Library digital archive project: digitally archiving 2.47 million items, more than 223 million pages
https://linecorp.com/ja/pr/news/ja/2021/3825
Solving the hassles hidden in daily life and business, and creating the new normal!
An international team committed to being first-class
Our challenge
Innovation by mixing LINE AI assets, especially NLP, voice/speech, and CV.
Mixed LINE AI (MiLAI)
Multimodal input/output
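CLOVA OCR
Point 1: World-class AI-OCR
Rated highly accurate at reading under difficult conditions such as horizontal and vertical writing and curved text, at multilingual recognition, and at recognizing technical terms. It has taken world No. 1 at ICDAR, the international conference on document analysis and recognition.
Point 2: Quickly turns any document or image into text
It correctly converts not only documents with a fixed format but documents of any style. CLOVA OCR (specialized for receipts and invoices) needs no advance registration of formats.
Point 3: Handwriting can also be recognized
Handwritten characters and slanted characters can be recognized with high accuracy.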
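CLOVA OCR adoption examples
• Partnerships with cloud services: SAP Concur Japan selected LINE CLOVA as its partner for digitizing paper invoices.
• Use as a new solution: Hakuhodo DY Media Partners Inc. moved a convenience store "[illegible]-yen lottery" online by reading serial numbers.
• Contributions to LINE services:
  LINE talk room: text recognition is available simply by taking a picture from the talk room.
  LINE Receipt / LINE PLACE ("for INVOICE"): receipts are read by AI, so amounts and items are sorted automatically. Managing expenses and posting to review sites for shops you visited become easy.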
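LINE AiCall
Point 1: An experience that does not keep users waiting
Besides making call handling available around the clock, it responds flexibly, unlike IVR (Interactive Voice Response), which routes calls by keypad input.
Point 2: Natural, human-like dialogue
Its natural, near-human voice with intonation does not put stress on users. In addition, as the AI's speech recognition learns, the accuracy of recognition and dialogue improves, and so does the quality of the response.
Point 3: Integration with existing systems and LINE
System integration reduces the manual data maintenance work that used to follow outgoing and incoming calls. By integrating with LINE or SMS, a message can also be sent to the user automatically after the call.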
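LINE AiCall adoption record
• Adopted in the operations of major companies: Yamato Transport Co., Ltd., accepting pickup requests from customers by phone
• Reducing the load on public services and government: Kanagawa Prefecture, COVID-19 telephone consultation desk
• Feature integration with platforms: Ebisol Inc., reservation management system for restaurants; Car Frontier Co., Ltd., car maintenance reservation service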
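LINE eKYC
Point: High-accuracy online identity verification
LINE eKYC is a solution that combines AI technologies developed by LINE to complete identity verification online while balancing security and convenience. It is offered in several forms, such as an API and an SDK, so it can be customized to the intended use. By simplifying procedures, it improves operational efficiency and user convenience, and prevents unauthorized access and fraudulent use through impersonation.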
LINE eKYC adoption example
LINE Pay: "Easy identity verification by smartphone", which needs only a smartphone and an ID document; adopted in LINE Pay