WWDC2025 Session Share: Reading Documents Using the Vision Framework
ni_san2000
June 26, 2025
Presented at the Swift愛好会 spin-off WWDC25 session summary meetup @DeNA (2025/06/26)
Transcript
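WWDC2025 Session Share: Reading Documents Using the Vision Framework
in Swift愛好会 spin-off WWDC25 session summary meetup @DeNA (2025/06/26)
にーさん (@ni_san2000)

Self-introduction
• にーさん (@ni_san2000)
• I build iOS apps for work (3rd year)
• I am an Apple fan, so I have been watching the keynotes for a long time
• The macOS naming stories
• Craig Federighi's parkour
• Hobbies
• Camera / houseplants / coffee / esports
• Lately I am interested in things like photo contests

About the session I am covering
• Reading documents using the Vision framework
• An AI-related talk that is "not Foundation Models"
Source: https://developer.apple.com/jp/videos/play/wwdc…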
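A quick review of the Vision framework
• A framework that provides AI model features specialized for video and still images
• Image recognition, object detection, face recognition, and so on
• Features that would otherwise require an .mlmodel are readily available as APIs
• iOS 18 provided 31 APIs
Source: https://developer.apple.com/jp/machine-learning/models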
A quick review of the Vision framework
CalculateImageAestheticsScoresRequest
ClassifyImageRequest
CoreMLRequest
DetectAnimalBodyPoseRequest
DetectBarcodesRequest
DetectContoursRequest
DetectDocumentSegmentationRequest
DetectFaceCaptureQualityRequest
DetectFaceLandmarksRequest
DetectFaceRectanglesRequest
DetectHorizonRequest
DetectHumanBodyPose3DRequest
DetectHumanBodyPoseRequest
DetectHumanHandPoseRequest
DetectHumanRectanglesRequest
DetectRectanglesRequest
DetectTextRectanglesRequest
DetectTrajectoriesRequest
GenerateAttentionBasedSaliencyIma
GenerateForegroundInstanceMaskRequest
GenerateImageFeaturePrintRequest
GenerateOpticalFlowRequest
GeneratePersonInstanceMaskRequest
GeneratePersonSegmentationRequest
RecognizeAnimalsRequest
RecognizeTextRequest
TrackHomographicImageRegistration
TrackObjectRequest
TrackOpticalFlowRequest
TrackRectangleRequest
TrackTranslationalImageRegistrationRequest
RecognizeDocumentRequest (New)
DetectLensSmudgeRequest (New)

Document recognition until now
• Visible text can be recognized
• But the context of sentence structure and tables is lost
Source: https://developer.apple.com/jp/videos/play/wwdc…
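What becomes possible from xOS 26
• Elements can be recognized by understanding the structure of a document
• Context-aware recognition of paragraphs, bulleted lists, and tables
• Identification of text that follows specific formats
• Extraction of QR codes

Elements can be recognized by understanding the structure of a document
Source: https://developer.apple.com/jp/videos/play/wwdc…

① Prepare the request class (RecognizeDocumentRequest)
② Pass in an image
③ Receive a DocumentObservation
④ Get each element from the document
⑤ That's it!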
Detection structure of DocumentObservation
• A Container holds elements such as List, Table, Text, and Barcodes
• Cell and Items also hold Container elements, so they can nest further
Source: https://developer.apple.com/jp/videos/play/wwdc…
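The nesting described on this slide can be pictured with a small self-contained model. This is only a sketch: `Container`, its fields, and `allText(in:)` are hypothetical stand-ins invented for illustration, not the Vision framework's own types. It shows why a recursive walk is a natural way to read a structure in which list items and table cells can themselves hold containers.

```swift
// Hypothetical stand-in type; NOT the Vision framework's definition.
struct Container {
    var text: String = ""
    var lists: [[Container]] = []      // each list is an array of items; each item is a Container
    var tables: [[[Container]]] = []   // each table is rows of cells; each cell is a Container
}

// Collect every piece of text, descending through list items and table cells.
func allText(in container: Container) -> [String] {
    var result: [String] = container.text.isEmpty ? [] : [container.text]
    for list in container.lists {
        for item in list {
            result += allText(in: item)
        }
    }
    for table in container.tables {
        for row in table {
            for cell in row {
                result += allText(in: cell)
            }
        }
    }
    return result
}

// A document whose list item contains a one-cell table.
let cell = Container(text: "alice@example.com")
let item = Container(text: "Buy milk", tables: [[[cell]]])
let document = Container(text: "Title", lists: [[item]])
print(allText(in: document))
```

The real observation types expose richer elements (barcodes, for instance); the point here is only the recursive shape.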
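DataDetector for identifying text
• 10 identifiable formats
• Japanese is still Not supported
• It looks like specifying a Text Attribute at display time will no longer be necessary

Description | Type | Notes
Calendar | CalendarEvent |
Email address | EmailAddress |
Flight number | FlightNumber |
URL | Link |
Measurement (with unit) | Measurement | Units that can be expressed with Dimension
Amount of money (with currency) | MoneyAmount | Units that can be expressed with Locale.Currency
Payment | PaymentIdentifier | Unified Payments Interface (UPI)
Phone number | PhoneNumber |
Address | PostalAddress | e.g. street, city, state
Shipment tracking number | ShipmentTrackingNumber |
Source: https://developer.apple.com/jp/videos/play/wwdc…

Getting the data as each format
• Just a switch-case
• Which format it is has already been identified per piece of text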
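The "just a switch-case" idea can be illustrated with a plain Swift enum. This is a sketch under stated assumptions: `DetectedFormat` and `describe(_:)` are hypothetical names that mirror only a few of the ten formats listed above, and they are not the actual Vision or DataDetection API. The real payload types will differ.

```swift
// Hypothetical enum mirroring a few of the listed formats; NOT the framework's API.
enum DetectedFormat {
    case emailAddress(String)
    case phoneNumber(String)
    case link(String)
    case moneyAmount(value: Double, currencyCode: String)
    case other
}

// Because the format is already identified, handling it is one switch over the cases.
func describe(_ format: DetectedFormat) -> String {
    switch format {
    case .emailAddress(let address):
        return "email: \(address)"
    case .phoneNumber(let number):
        return "phone: \(number)"
    case .link(let url):
        return "link: \(url)"
    case .moneyAmount(let value, let currencyCode):
        return "money: \(value) \(currencyCode)"
    case .other:
        return "unhandled format"
    }
}

print(describe(.emailAddress("alice@example.com")))
```

Keeping a catch-all case such as `.other` lets the code ignore formats it does not care about.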
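Elements can be recognized by understanding the structure of a document
Source: https://developer.apple.com/jp/videos/play/wwdc…

RecognizeDocumentRequest summary
• A new API in the Vision framework
• Document content can be analyzed structurally
• A Container holds elements such as Table, List, Cell, and Text
• Cells and List Items can store Containers in a nested fashion
• Text can recognize formats from the characters it holds
• Many new data structures under DataDetector.Match.SemanticDetails.XXXXX

Bonus
The remaining two new Vision-related features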
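Observation that detects dirt and fogging on the lens surface
• DetectLensSmudgeRequest / SmudgeObservation were added
• Determines whether the input image is smudged
• Setting a threshold helps prevent analysis errors before they happen
Source: https://developer.apple.com/jp/videos/play/wwdc…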
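The threshold idea can be sketched as a simple gate placed in front of any further analysis. Everything here is an assumption made for illustration: `shouldAnalyze`, the `smudgeConfidence` parameter, the 0...1 score range where higher means more likely smudged, and the default of 0.5 are all hypothetical and are not taken from the Vision API.

```swift
// Hypothetical helper: `smudgeConfidence` stands in for whatever score the
// smudge observation reports (assumed 0...1, higher = more likely smudged).
func shouldAnalyze(smudgeConfidence: Float, threshold: Float = 0.5) -> Bool {
    // Continue to document analysis only when the image looks clean enough.
    return smudgeConfidence < threshold
}

let scores: [Float] = [0.1, 0.9]
for score in scores {
    print(score, shouldAnalyze(smudgeConfidence: score) ? "analyze" : "ask the user to retake")
}
```

The right threshold depends on the app; a stricter value rejects more images but reduces the chance of running recognition on unusable input.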
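HandPose Detection analysis model update
• The model that detects hand joint positions has been updated
• Inference accuracy and speed have improved
• An update from the model announced at WWDC2021
Source: https://developer.apple.com/jp/videos/play/wwdc…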
The end
• Feature additions to the Vision framework
• Document content can be analyzed structurally
• Text content can be classified
• Two other Vision framework updates
• DetectLensSmudgeRequest / SmudgeObservation
• HandPose Detection analysis model update