Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
20170829_iOSLT_機械学習とVision.framework
Search
shtnkgm
September 20, 2017
Programming
0
88
20170829_iOSLT_機械学習とVision.framework
機械学習の基礎的な内容を交えつつ、iOS11で追加されたVision.frameworkの説明とデモ
shtnkgm
September 20, 2017
Tweet
Share
More Decks by shtnkgm
See All by shtnkgm
Combine入門
shtnkgm
2
280
Property Wrappers
shtnkgm
0
340
Saliency Detection
shtnkgm
0
45
パフォーマンス改善とユニットテスト
shtnkgm
4
1.6k
iOSのコードベースレイアウト
shtnkgm
2
770
20190117_iOSLT_CBLinSwift.pdf
shtnkgm
0
88
SwiftとFunctional Reactive Programming
shtnkgm
0
170
20180710_iOSLT_iOSでDarkModeを実装する
shtnkgm
0
88
20180410_iOSLT_SwiftとProtocol-OrientedProgramming
shtnkgm
0
110
Other Decks in Programming
See All in Programming
KessokuでDIでもgoroutineを活用する / Go Connect #6
mazrean
0
120
ワープロって実は計算機で
pepepper
2
1.4k
🔨 小さなビルドシステムを作る
momeemt
2
550
Rancher と Terraform
fufuhu
0
110
MLH State of the League: 2026 Season
theycallmeswift
0
160
20250808_AIAgent勉強会_ClaudeCodeデータ分析の実運用〜競馬を題材に回収率100%の先を目指すメソッドとは〜
kkakeru
0
210
STUNMESH-go: Wireguard NAT穿隧工具的源起與介紹
tjjh89017
0
390
RDoc meets YARD
okuramasafumi
3
140
書き捨てではなく継続開発可能なコードをAIコーディングエージェントで書くために意識していること
shuyakinjo
1
310
Claude Codeで実装以外の開発フロー、どこまで自動化できるか?失敗と成功
ndadayo
2
1.5k
Jakarta EE Core Profile and Helidon - Speed, Simplicity, and AI Integration
ivargrimstad
0
210
Namespace and Its Future
tagomoris
6
510
Featured
See All Featured
How to Think Like a Performance Engineer
csswizardry
25
1.8k
Optimising Largest Contentful Paint
csswizardry
37
3.4k
Bootstrapping a Software Product
garrettdimon
PRO
307
110k
GraphQLの誤解/rethinking-graphql
sonatard
71
11k
[RailsConf 2023] Rails as a piece of cake
palkan
56
5.8k
Building an army of robots
kneath
306
46k
Understanding Cognitive Biases in Performance Measurement
bluesmoon
29
1.8k
10 Git Anti Patterns You Should be Aware of
lemiorhan
PRO
656
61k
The Straight Up "How To Draw Better" Workshop
denniskardys
236
140k
Distributed Sagas: A Protocol for Coordinating Microservices
caitiem20
333
22k
Visualization
eitanlees
147
16k
Raft: Consensus for Rubyists
vanstee
140
7.1k
Transcript
ػցֶशͱVision.framework Shota Nakagami / @shtnkgm 2017/8/29
͢༰ — Vision.frameworkͷجຊతͳઆ໌ — ػցֶशͷ֓ཁ — VisionΛ༻͍ͨΧϝϥը૾Λผ͢ΔαϯϓϧΞϓϦ
Vision.frameworkͱ — iOS11͔ΒՃ͞Εͨը૾ೝࣝAPIΛఏڙ͢ΔϑϨʔϜϫʔ Ϋ — ಉ͘͡iOS11͔ΒՃ͞ΕͨػցֶशϑϨʔϜϫʔΫͷCore MLΛநԽ
ػցֶशελοΫ
χϡʔϥϧωοτϫʔΫͱ — ػցֶशख๏ͷҰछ — ਓؒͷͷਆܦճ࿏Λ ࣜϞσϧͰදͨ͠ͷ — NNͱུ͞ΕΔ ʢDNN1ɺRNN2ɺCNN3ͳͲʣ 3
Convolutional Neural NetworkʢΈࠐΈχϡʔϥϧωο τϫʔΫʣ 2 Recurrent Neural Networkʢ࠶ؼܕχϡʔϥϧωοτϫ ʔΫʣ 1 Deep Neural NetworkʢσΟʔϓχϡʔϥϧωοτϫʔ Ϋʣ
VisionͰೝࣝͰ͖Δͷ
VisionͰೝࣝͰ͖Δͷᶃ — إݕग़ / Face Detection and Recognition — όʔίʔυݕग़
/ Barcode Detection — ը૾ͷҐஔ߹Θͤ / Image Alignment Analysis — ςΩετݕग़ / Text Detection — ਫฏઢݕग़ / Horizon Detection
VisionͰೝࣝͰ͖Δͷᶄ ػցֶशϞσϧͷ༻ҙ͕ඞཁͳͷ — ΦϒδΣΫτݕग़ͱτϥοΩϯά / Object Detection and Tracking —
ػցֶशʹΑΔը૾ੳ / Machine Learning Image Analysis
Χϝϥը૾Λผ͢ΔαϯϓϧΞ ϓϦΛͭ͘Δ
αϯϓϧΞϓϦ֓ཁ — VisionͷʮػցֶशʹΑΔը૾ੳʯػೳΛར༻ — ΧϝϥͰөͨ͠ը૾Λผ͠ɺϞϊͷ໊લΛग़ྗ
ػցֶशʹΑΔը૾ೝࣝͷྲྀΕ 1. ֶशͷͨΊը૾σʔλΛऩूʢڭࡐΛूΊΔʣ 2. ֶश༻σʔλ͔ΒɺػցֶशΞϧΰϦζϜʹΑΓϞσϧΛ࡞ ※Ϟσϧɾɾɾ͑Λग़ͯ͘͠ΕΔϩδοΫ ྨɿ͜ͷը૾ݘʁೣʁ ճؼɿ༧ଌʢ໌ͷגՁʁʣ 3.
ֶशࡁΈϞσϧΛ༻͍ͯະͷը૾Λผʢ࣮ફʣ
Ϟσϧ࡞ׂѪ — ֶशσʔλͷऩूɾܗׂΓͱେม — ͦΕͳΓͷϚγϯεϖοΫɺܭࢉ͕࣌ؒඞཁ — ػցֶशʹؔ͢Δ͕ࣝඞཁ
Ϟσϧͷ༻ҙ ؆୯ͷͨΊɺֶशࡁΈϞσϧΛར༻ AppleͷαΠτͰ͞Ε͍ͯΔʢ.mlmodelܗࣜʣ https://developer.apple.com/machine-learning/
ϞσϧҰཡ ϞσϧʹΑͬͯಘҙͳը૾ͷछྨ༰ྔ͕ҟͳΔ ʢ5MBʙ553.5MBʣ — MobileNets — SqueezeNet — Places205-GoogLeNet —
ResNet50 — Inception v3 — VGG16
ࠓճResNet50Λར༻ — थɺಈɺ৯ɺΓɺਓͳͲͷ1000छྨͷΧςΰϦ — αΠζ102.6 MB — MITϥΠηϯε
ϞσϧΛϓϩδΣΫτʹࠐΉ
Xcodeʹυϥοά&υϩοϓ
ϞσϧΫϥε͕ࣗಈੜ͞ΕΔ ࣗಈͰϞσϧ໊.swiftͱ͍͏໊લͰϞσϧΫϥε͕࡞͞ΕΔ ྫ) Resnet50.swiftʢҰ෦ൈਮʣ
Χϝϥը૾ͷΩϟϓνϟॲཧ
private func startCapture() { let captureSession = AVCaptureSession() captureSession.sessionPreset =
AVCaptureSessionPresetPhoto // ೖྗͷࢦఆ let captureDevice = AVCaptureDevice.defaultDevice(withMediaType: AVMediaTypeVideo) guard let input = try? AVCaptureDeviceInput(device: captureDevice) else { return } guard captureSession.canAddInput(input) else { return } captureSession.addInput(input) // ग़ྗͷࢦఆ let output: AVCaptureVideoDataOutput = AVCaptureVideoDataOutput() output.setSampleBufferDelegate(self, queue: DispatchQueue(label: "VideoQueue")) guard captureSession.canAddOutput(output) else { return } captureSession.addOutput(output) // ϓϨϏϡʔͷࢦఆ guard let previewLayer = AVCaptureVideoPreviewLayer(session: captureSession) else { return } previewLayer.videoGravity = AVLayerVideoGravityResizeAspectFill previewLayer.frame = view.bounds view.layer.insertSublayer(previewLayer, at: 0) // Ωϟϓνϟ։࢝ captureSession.startRunning() }
ࡱӨϑϨʔϜຖʹݺΕΔDeleate extension ViewController: AVCaptureVideoDataOutputSampleBufferDelegate { func captureOutput(_ output: AVCaptureOutput!, didOutputSampleBuffer
sampleBuffer: CMSampleBuffer!, from connection: AVCaptureConnection!) { // CMSampleBufferΛCVPixelBufferʹม guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return } // ͜ͷதʹVision.frameworkͷॲཧΛॻ͍͍ͯ͘ʢը૾ೝࣝ෦ʣ } }
ը૾ೝࣝ෦ͷॲཧ
VisionͰར༻͢ΔओͳΫϥε — VNCoreMLModel — VNCoreMLRequest — VNImageRequestHandler — VNObservation
VNCoreMLModel — CoreMLͷϞσϧΛVisionͰѻ͏ͨΊͷίϯςφΫϥε
VNCoreMLRequest — CoreMLʹը૾ೝࣝΛཁٻ͢ΔͨΊͷΫϥε — ೝࣝ݁ՌϞσϧͷग़ྗܗࣜʹΑΓܾ·Δ — ը૾→Ϋϥεʢྨ݁Ռʣ — ը૾→ಛྔ —
ը૾→ը૾
VNImageRequestHandler — Ұͭͷը૾ʹର͠ɺҰͭҎ্ͷը૾ೝࣝॲཧ ʢVNCoreMLRequestʣΛ࣮ߦ͢ΔͨΊͷΫϥε — ॳظԽ࣌ʹೝࣝରͷը૾ܗࣜΛࢦఆ͢Δ — CVPixelBuffer — CIImage
— CGImage
VNObservation — ը૾ೝࣝ݁ՌͷநΫϥε — ݁Ռͱͯ͜͠ͷΫϥεͷαϒΫϥεͷ͍ͣΕ͔͕ฦ͞ΕΔ — ೝࣝͷ֬৴Λද͢confidenceϓϩύςΟΛ࣋ͭ ʢVNConfidence=FloatͷΤΠϦΞεʣ
VNObservationαϒΫϥε — VNClassificationObservation ྨ໊ͱͯ͠identifierϓϩύςΟΛ࣋ͭ — VNCoreMLFeatureValueObservation ಛྔσʔλͱͯ͠featureValueϓϩύςΟΛ࣋ͭ — VNPixelBufferObservation ը૾σʔλͱͯ͠pixelBufferϓϩύςΟΛ࣋ͭ
·ͱΊΔͱ… — VNCoreMLModelʢΈࠐΜͩϞσϧʣ — VNCoreMLRequestʢը૾ೝࣝͷϦΫΤετʣ — VNImageRequestHandlerʢϦΫΤετͷ࣮ߦʣ — VNObservationʢೝࣝ݁Ռʣ
۩ମతͳ࣮ίʔυ
ϞσϧΫϥεͷॳظԽ // CoreMLͷϞσϧΫϥεͷॳظԽ guard let model = try? VNCoreMLModel(for: Resnet50().model)
else { return }
ը૾ೝࣝϦΫΤετΛ࡞ // ը૾ೝࣝϦΫΤετΛ࡞ʢҾϞσϧͱϋϯυϥʣ let request = VNCoreMLRequest(model: model) { [weak
self] (request: VNRequest, error: Error?) in guard let results = request.results as? [VNClassificationObservation] else { return } // ผ݁Ռͱͦͷ֬৴Λ্Ґ3݅·Ͱදࣔ // identifierΧϯϚ۠ΓͰෳॻ͔Ε͍ͯΔ͜ͱ͕͋ΔͷͰɺ࠷ॳͷ୯ޠͷΈऔಘ͢Δ let displayText = results.prefix(3) .flatMap { "\(Int($0.confidence * 100))% \($0.identifier.components(separatedBy: ", ")[0])" } .joined(separator: "\n") DispatchQueue.main.async { self?.textView.text = displayText } }
ը૾ೝࣝϦΫΤετΛ࣮ߦ // CVPixelBufferʹର͠ɺը૾ೝࣝϦΫΤετΛ࣮ߦ try? VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:]).perform([request])
ը૾ೝࣝ෦ͷܗ guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
guard let model = try? VNCoreMLModel(for: Resnet50().model) else { return } let request = VNCoreMLRequest(model: model) { [weak self] (request: VNRequest, error: Error?) in guard let results = request.results as? [VNClassificationObservation] else { return } let displayText = results.prefix(3) .flatMap { "\(Int($0.confidence * 100))% \($0.identifier.components(separatedBy: ", ")[0])" } .joined(separator: "\n") DispatchQueue.main.async { self?.textView.text = displayText } } try? VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:]).perform([request])
σϞಈը
None
tabbyͬͯԿʁ
tabby = τϥωίʂ τϥωίͱɺτϥͷΑ͏ͳࣶ༷Λ࣋ͭωίͷ͜ͱͰ͋ΔɻλϏʔͱݺΕΔɻτϥೣ ετϥΠϓͷଞʹɺ్ࣶ༷͕Εͯɺൗ༷ɺᤳᤶ൝ɺࡉ͔ࣶ༷͘Λ్Εͤͯͨ͞ ͷ͕͋Γɺଟ༷Ͱ͋ΔɻʢҾ༻: ΟΩϖσΟΞʣ
·ͱΊ
— ֶशࡁΈϞσϧ͕͋Εɺ࣮ࣗମ؆୯! — ωίͷछྨڭ͑ͯ͘ΕΔ" — ͋ͱϞσϧࣗͰ࡞ΕΔΑ͏ʹͳΕͬͱ෯͕͕ Δ
ॳΓ͔ͨͬͨ͜ͱ — ΠϯελάϥϜ༻ͷࣗಈϋογϡλά͚ΞϓϦ — ϋογϡλάΛ͢ΩϟϓγϣϯAPIطʹഇࢭʘ(^o^)ʗ
αϯϓϧίʔυ ࠓճ͝հͨ͠αϯϓϧίʔυͪ͜Βʹஔ͍ͯ͋Γ·͢ɻ https://github.com/shtnkgm/VisionFrameworkSample ※εΫϦʔϯγϣοτͷެ։ʹNDAҙ
͓ΘΓ
ࢀߟࢿྉᶃ — Build more intelligent apps with machine learning. /
Apple — Vision / Apple Developer Documentation — ʲWWDC2017ʳVision.framework ͷςΩετݕग़Λࢼ͠ ͯΈ·ͨ͠ʲiOS11ʳ — Keras + iOS11 CoreML + Vision Framework ʹΑΔɺ ΫϩإࣝผΞϓϦͷ։ൃ — [Core ML] .mlmodel ϑΝΠϧΛ࡞͢Δ / ϑΣϯϦϧ
ࢀߟࢿྉᶄ — [iOS 11] CoreMLͰը૾ͷࣝผΛࢼͯ͠Έ·ͨ͠ ʢVision.FrameworkΛΘͳ͍ύλʔϯʣ #WWDC2017 — Places205-GoogLeNetͰॴͷఆ /
fabo.io — iOSDCͷϦδΣΫτίϯͰʰiOSͱσΟʔϓϥʔχϯάʱʹ ͍ͭͯ͠·ͨ͠Add Star — [iOS 10][χϡʔϥϧωοτϫʔΫ] OSSͰAccelerateʹՃ ͞ΕͨBNNSΛཧղ͢Δ ~XORฤ~