Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
20170829_iOSLT_機械学習とVision.framework
Search
Sponsored
·
Ship Features Fearlessly
Turn features on and off without deploys. Used by thousands of Ruby developers.
→
shtnkgm
September 20, 2017
Programming
100
0
Share
Embed
Copy iframe code
Copy JS code
Copy link
Start on current slide
20170829_iOSLT_機械学習とVision.framework
機械学習の基礎的な内容を交えつつ、iOS11で追加されたVision.frameworkの説明とデモ
shtnkgm
September 20, 2017
More Decks by shtnkgm
See All by shtnkgm
Combine入門
shtnkgm
2
320
Property Wrappers
shtnkgm
0
370
Saliency Detection
shtnkgm
0
86
パフォーマンス改善とユニットテスト
shtnkgm
4
1.7k
iOSのコードベースレイアウト
shtnkgm
2
830
20190117_iOSLT_CBLinSwift.pdf
shtnkgm
0
120
SwiftとFunctional Reactive Programming
shtnkgm
0
200
20180710_iOSLT_iOSでDarkModeを実装する
shtnkgm
0
120
20180410_iOSLT_SwiftとProtocol-OrientedProgramming
shtnkgm
0
140
Other Decks in Programming
See All in Programming
Skillsは効率化、Agentsは"自分の拡張"——Builder時代のエージェント編成(CC Night 2026)
wemra
1
130
Signal Forms: Details & Live Coding @enterJS 2026 in Mannheim
manfredsteyer
PRO
0
130
PHPで使える日時の表現と、その知り方 #frontend_phpcon_do
o0h
PRO
0
240
Vue × Nuxt × Oxc どこまで使える?実運用の現在地
andpad
0
250
Lemonade + Foundry Toolkit でお手軽アプリ開発
seosoft
1
330
Hunting Vulnerabilities in Symfony with LLMs
vinceamstoutz
0
540
スマートグラスで並列バイブコーディング
hyshu
0
140
AIとASP.NET Coreで雑Webアプリを作った話
mayuki
0
610
Datadog × OpenTelemetry 入門と実践のあいだ
kn_to_maxpno
1
160
エージェンティックRAGにAWSで入門しよう!
har1101
8
1.5k
「エンジニアインターン、どうやって取った?」準備のリアルを語るLT会 Progate BAR
akiomatic
0
130
生成AI時代にこそ効くGo | Why Go Works in the Age of Generative AI
mom0tomo
8
3.2k
Featured
See All Featured
Building Better People: How to give real-time feedback that sticks.
wjessup
370
20k
Into the Great Unknown - MozCon
thekraken
41
2.6k
How To Stay Up To Date on Web Technology
chriscoyier
790
250k
The SEO identity crisis: Don't let AI make you average
varn
0
490
Docker and Python
trallard
47
3.9k
Future Trends and Review - Lecture 12 - Web Technologies (1019888BNR)
signer
PRO
0
3.6k
How to Align SEO within the Product Triangle To Get Buy-In & Support - #RIMC
aleyda
2
1.5k
The Success of Rails: Ensuring Growth for the Next 100 Years
eileencodes
47
8.2k
30 Presentation Tips
portentint
PRO
1
320
4 Signs Your Business is Dying
shpigford
187
22k
Leading Effective Engineering Teams in the AI Era
addyosmani
9
2k
Primal Persuasion: How to Engage the Brain for Learning That Lasts
tmiket
0
370
Transcript
ػցֶशͱVision.framework Shota Nakagami / @shtnkgm 2017/8/29
͢༰ — Vision.frameworkͷجຊతͳઆ໌ — ػցֶशͷ֓ཁ — VisionΛ༻͍ͨΧϝϥը૾Λผ͢ΔαϯϓϧΞϓϦ
Vision.frameworkͱ — iOS11͔ΒՃ͞Εͨը૾ೝࣝAPIΛఏڙ͢ΔϑϨʔϜϫʔ Ϋ — ಉ͘͡iOS11͔ΒՃ͞ΕͨػցֶशϑϨʔϜϫʔΫͷCore MLΛநԽ
ػցֶशελοΫ
χϡʔϥϧωοτϫʔΫͱ — ػցֶशख๏ͷҰछ — ਓؒͷͷਆܦճ࿏Λ ࣜϞσϧͰදͨ͠ͷ — NNͱུ͞ΕΔ ʢDNN1ɺRNN2ɺCNN3ͳͲʣ 3
Convolutional Neural NetworkʢΈࠐΈχϡʔϥϧωο τϫʔΫʣ 2 Recurrent Neural Networkʢ࠶ؼܕχϡʔϥϧωοτϫ ʔΫʣ 1 Deep Neural NetworkʢσΟʔϓχϡʔϥϧωοτϫʔ Ϋʣ
VisionͰೝࣝͰ͖Δͷ
VisionͰೝࣝͰ͖Δͷᶃ — إݕग़ / Face Detection and Recognition — όʔίʔυݕग़
/ Barcode Detection — ը૾ͷҐஔ߹Θͤ / Image Alignment Analysis — ςΩετݕग़ / Text Detection — ਫฏઢݕग़ / Horizon Detection
VisionͰೝࣝͰ͖Δͷᶄ ػցֶशϞσϧͷ༻ҙ͕ඞཁͳͷ — ΦϒδΣΫτݕग़ͱτϥοΩϯά / Object Detection and Tracking —
ػցֶशʹΑΔը૾ੳ / Machine Learning Image Analysis
Χϝϥը૾Λผ͢ΔαϯϓϧΞ ϓϦΛͭ͘Δ
αϯϓϧΞϓϦ֓ཁ — VisionͷʮػցֶशʹΑΔը૾ੳʯػೳΛར༻ — ΧϝϥͰөͨ͠ը૾Λผ͠ɺϞϊͷ໊લΛग़ྗ
ػցֶशʹΑΔը૾ೝࣝͷྲྀΕ 1. ֶशͷͨΊը૾σʔλΛऩूʢڭࡐΛूΊΔʣ 2. ֶश༻σʔλ͔ΒɺػցֶशΞϧΰϦζϜʹΑΓϞσϧΛ࡞ ※Ϟσϧɾɾɾ͑Λग़ͯ͘͠ΕΔϩδοΫ ྨɿ͜ͷը૾ݘʁೣʁ ճؼɿ༧ଌʢ໌ͷגՁʁʣ 3.
ֶशࡁΈϞσϧΛ༻͍ͯະͷը૾Λผʢ࣮ફʣ
Ϟσϧ࡞ׂѪ — ֶशσʔλͷऩूɾܗׂΓͱେม — ͦΕͳΓͷϚγϯεϖοΫɺܭࢉ͕࣌ؒඞཁ — ػցֶशʹؔ͢Δ͕ࣝඞཁ
Ϟσϧͷ༻ҙ ؆୯ͷͨΊɺֶशࡁΈϞσϧΛར༻ AppleͷαΠτͰ͞Ε͍ͯΔʢ.mlmodelܗࣜʣ https://developer.apple.com/machine-learning/
ϞσϧҰཡ ϞσϧʹΑͬͯಘҙͳը૾ͷछྨ༰ྔ͕ҟͳΔ ʢ5MBʙ553.5MBʣ — MobileNets — SqueezeNet — Places205-GoogLeNet —
ResNet50 — Inception v3 — VGG16
ࠓճResNet50Λར༻ — थɺಈɺ৯ɺΓɺਓͳͲͷ1000छྨͷΧςΰϦ — αΠζ102.6 MB — MITϥΠηϯε
ϞσϧΛϓϩδΣΫτʹࠐΉ
Xcodeʹυϥοά&υϩοϓ
ϞσϧΫϥε͕ࣗಈੜ͞ΕΔ ࣗಈͰϞσϧ໊.swiftͱ͍͏໊લͰϞσϧΫϥε͕࡞͞ΕΔ ྫ) Resnet50.swiftʢҰ෦ൈਮʣ
Χϝϥը૾ͷΩϟϓνϟॲཧ
private func startCapture() { let captureSession = AVCaptureSession() captureSession.sessionPreset =
AVCaptureSessionPresetPhoto // ೖྗͷࢦఆ let captureDevice = AVCaptureDevice.defaultDevice(withMediaType: AVMediaTypeVideo) guard let input = try? AVCaptureDeviceInput(device: captureDevice) else { return } guard captureSession.canAddInput(input) else { return } captureSession.addInput(input) // ग़ྗͷࢦఆ let output: AVCaptureVideoDataOutput = AVCaptureVideoDataOutput() output.setSampleBufferDelegate(self, queue: DispatchQueue(label: "VideoQueue")) guard captureSession.canAddOutput(output) else { return } captureSession.addOutput(output) // ϓϨϏϡʔͷࢦఆ guard let previewLayer = AVCaptureVideoPreviewLayer(session: captureSession) else { return } previewLayer.videoGravity = AVLayerVideoGravityResizeAspectFill previewLayer.frame = view.bounds view.layer.insertSublayer(previewLayer, at: 0) // Ωϟϓνϟ։࢝ captureSession.startRunning() }
ࡱӨϑϨʔϜຖʹݺΕΔDeleate extension ViewController: AVCaptureVideoDataOutputSampleBufferDelegate { func captureOutput(_ output: AVCaptureOutput!, didOutputSampleBuffer
sampleBuffer: CMSampleBuffer!, from connection: AVCaptureConnection!) { // CMSampleBufferΛCVPixelBufferʹม guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return } // ͜ͷதʹVision.frameworkͷॲཧΛॻ͍͍ͯ͘ʢը૾ೝࣝ෦ʣ } }
ը૾ೝࣝ෦ͷॲཧ
VisionͰར༻͢ΔओͳΫϥε — VNCoreMLModel — VNCoreMLRequest — VNImageRequestHandler — VNObservation
VNCoreMLModel — CoreMLͷϞσϧΛVisionͰѻ͏ͨΊͷίϯςφΫϥε
VNCoreMLRequest — CoreMLʹը૾ೝࣝΛཁٻ͢ΔͨΊͷΫϥε — ೝࣝ݁ՌϞσϧͷग़ྗܗࣜʹΑΓܾ·Δ — ը૾→Ϋϥεʢྨ݁Ռʣ — ը૾→ಛྔ —
ը૾→ը૾
VNImageRequestHandler — Ұͭͷը૾ʹର͠ɺҰͭҎ্ͷը૾ೝࣝॲཧ ʢVNCoreMLRequestʣΛ࣮ߦ͢ΔͨΊͷΫϥε — ॳظԽ࣌ʹೝࣝରͷը૾ܗࣜΛࢦఆ͢Δ — CVPixelBuffer — CIImage
— CGImage
VNObservation — ը૾ೝࣝ݁ՌͷநΫϥε — ݁Ռͱͯ͜͠ͷΫϥεͷαϒΫϥεͷ͍ͣΕ͔͕ฦ͞ΕΔ — ೝࣝͷ֬৴Λද͢confidenceϓϩύςΟΛ࣋ͭ ʢVNConfidence=FloatͷΤΠϦΞεʣ
VNObservationαϒΫϥε — VNClassificationObservation ྨ໊ͱͯ͠identifierϓϩύςΟΛ࣋ͭ — VNCoreMLFeatureValueObservation ಛྔσʔλͱͯ͠featureValueϓϩύςΟΛ࣋ͭ — VNPixelBufferObservation ը૾σʔλͱͯ͠pixelBufferϓϩύςΟΛ࣋ͭ
·ͱΊΔͱ… — VNCoreMLModelʢΈࠐΜͩϞσϧʣ — VNCoreMLRequestʢը૾ೝࣝͷϦΫΤετʣ — VNImageRequestHandlerʢϦΫΤετͷ࣮ߦʣ — VNObservationʢೝࣝ݁Ռʣ
۩ମతͳ࣮ίʔυ
ϞσϧΫϥεͷॳظԽ // CoreMLͷϞσϧΫϥεͷॳظԽ guard let model = try? VNCoreMLModel(for: Resnet50().model)
else { return }
ը૾ೝࣝϦΫΤετΛ࡞ // ը૾ೝࣝϦΫΤετΛ࡞ʢҾϞσϧͱϋϯυϥʣ let request = VNCoreMLRequest(model: model) { [weak
self] (request: VNRequest, error: Error?) in guard let results = request.results as? [VNClassificationObservation] else { return } // ผ݁Ռͱͦͷ֬৴Λ্Ґ3݅·Ͱදࣔ // identifierΧϯϚ۠ΓͰෳॻ͔Ε͍ͯΔ͜ͱ͕͋ΔͷͰɺ࠷ॳͷ୯ޠͷΈऔಘ͢Δ let displayText = results.prefix(3) .flatMap { "\(Int($0.confidence * 100))% \($0.identifier.components(separatedBy: ", ")[0])" } .joined(separator: "\n") DispatchQueue.main.async { self?.textView.text = displayText } }
ը૾ೝࣝϦΫΤετΛ࣮ߦ // CVPixelBufferʹର͠ɺը૾ೝࣝϦΫΤετΛ࣮ߦ try? VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:]).perform([request])
ը૾ೝࣝ෦ͷܗ guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
guard let model = try? VNCoreMLModel(for: Resnet50().model) else { return } let request = VNCoreMLRequest(model: model) { [weak self] (request: VNRequest, error: Error?) in guard let results = request.results as? [VNClassificationObservation] else { return } let displayText = results.prefix(3) .flatMap { "\(Int($0.confidence * 100))% \($0.identifier.components(separatedBy: ", ")[0])" } .joined(separator: "\n") DispatchQueue.main.async { self?.textView.text = displayText } } try? VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:]).perform([request])
σϞಈը
None
tabbyͬͯԿʁ
tabby = τϥωίʂ τϥωίͱɺτϥͷΑ͏ͳࣶ༷Λ࣋ͭωίͷ͜ͱͰ͋ΔɻλϏʔͱݺΕΔɻτϥೣ ετϥΠϓͷଞʹɺ్ࣶ༷͕Εͯɺൗ༷ɺᤳᤶ൝ɺࡉ͔ࣶ༷͘Λ్Εͤͯͨ͞ ͷ͕͋Γɺଟ༷Ͱ͋ΔɻʢҾ༻: ΟΩϖσΟΞʣ
·ͱΊ
— ֶशࡁΈϞσϧ͕͋Εɺ࣮ࣗମ؆୯! — ωίͷछྨڭ͑ͯ͘ΕΔ" — ͋ͱϞσϧࣗͰ࡞ΕΔΑ͏ʹͳΕͬͱ෯͕͕ Δ
ॳΓ͔ͨͬͨ͜ͱ — ΠϯελάϥϜ༻ͷࣗಈϋογϡλά͚ΞϓϦ — ϋογϡλάΛ͢ΩϟϓγϣϯAPIطʹഇࢭʘ(^o^)ʗ
αϯϓϧίʔυ ࠓճ͝հͨ͠αϯϓϧίʔυͪ͜Βʹஔ͍ͯ͋Γ·͢ɻ https://github.com/shtnkgm/VisionFrameworkSample ※εΫϦʔϯγϣοτͷެ։ʹNDAҙ
͓ΘΓ
ࢀߟࢿྉᶃ — Build more intelligent apps with machine learning. /
Apple — Vision / Apple Developer Documentation — ʲWWDC2017ʳVision.framework ͷςΩετݕग़Λࢼ͠ ͯΈ·ͨ͠ʲiOS11ʳ — Keras + iOS11 CoreML + Vision Framework ʹΑΔɺ ΫϩإࣝผΞϓϦͷ։ൃ — [Core ML] .mlmodel ϑΝΠϧΛ࡞͢Δ / ϑΣϯϦϧ
ࢀߟࢿྉᶄ — [iOS 11] CoreMLͰը૾ͷࣝผΛࢼͯ͠Έ·ͨ͠ ʢVision.FrameworkΛΘͳ͍ύλʔϯʣ #WWDC2017 — Places205-GoogLeNetͰॴͷఆ /
fabo.io — iOSDCͷϦδΣΫτίϯͰʰiOSͱσΟʔϓϥʔχϯάʱʹ ͍ͭͯ͠·ͨ͠Add Star — [iOS 10][χϡʔϥϧωοτϫʔΫ] OSSͰAccelerateʹՃ ͞ΕͨBNNSΛཧղ͢Δ ~XORฤ~