20170829_iOSLT_Machine Learning and Vision.framework

shtnkgm
September 20, 2017

An explanation and demo of Vision.framework, added in iOS 11, with some machine learning basics along the way.

Transcript

  1. Machine Learning and Vision.framework
     - Shota Nakagami / @shtnkgm
     - 2017/8/29

  2. What this talk covers
     - A basic introduction to Vision.framework
     - An overview of machine learning
     - A sample app that uses Vision to classify camera images

  3. What is Vision.framework?
     - A framework, added in iOS 11, that provides image recognition APIs
     - An abstraction over Core ML, the machine learning framework also added in iOS 11

  4. The machine learning stack

  5. What is a neural network?
     - A type of machine learning technique
     - A mathematical model of the neural circuitry of the human brain
     - Abbreviated as NN (variants include DNN [1], RNN [2], and CNN [3])
     [1] Deep Neural Network
     [2] Recurrent Neural Network
     [3] Convolutional Neural Network
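To make "a mathematical model of a neuron" concrete, here is a minimal sketch of a single artificial neuron in plain Swift. It is not from the talk: the `Neuron` type, the sigmoid activation, and the hand-picked weights are all assumptions chosen for illustration. A real network stacks many such units and learns the weights from data.

```swift
import Foundation

/// Sigmoid activation: squashes any real number into the range (0, 1).
func sigmoid(_ x: Double) -> Double {
    return 1.0 / (1.0 + exp(-x))
}

/// One artificial neuron (hypothetical type for illustration):
/// a weighted sum of the inputs plus a bias, passed through an activation.
struct Neuron {
    var weights: [Double]
    var bias: Double

    func activate(_ inputs: [Double]) -> Double {
        precondition(inputs.count == weights.count,
                     "inputs and weights must have the same length")
        let weightedSum = zip(inputs, weights).reduce(bias) { sum, pair in
            sum + pair.0 * pair.1
        }
        return sigmoid(weightedSum)
    }
}

// Hand-picked weights, purely for illustration.
let neuron = Neuron(weights: [0.5, -0.5], bias: 0.0)

// The weighted sum is 0.5 * 1 + (-0.5) * 1 = 0, and sigmoid(0) is 0.5.
let output = neuron.activate([1.0, 1.0])
print(output)
```

DNNs, RNNs, and CNNs differ in how such units are connected, not in this basic building block.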
  6. What Vision can recognize

  7. What Vision can recognize (1)
     - Face Detection and Recognition
     - Barcode Detection
     - Image Alignment Analysis
     - Text Detection
     - Horizon Detection
  8. What Vision can recognize (2): features that require you to supply a machine learning model
     - Object Detection and Tracking
     - Machine Learning Image Analysis
  9. Building a sample app that classifies camera images

  10. Sample app overview
      - Uses Vision's "Machine Learning Image Analysis" feature
      - Classifies the image seen by the camera and outputs the name of the object

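  11. Flow of image recognition with machine learning
      1. Collect image data for training (gather the teaching material)
      2. Build a model from the training data using a machine learning algorithm
         Note: a model is the logic that produces the answer
         - Classification: is this image a dog or a cat?
         - Regression: numeric prediction (what will tomorrow's stock price be?)
      3. Use the trained model to classify unseen images (put it into practice)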
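The three steps above can be sketched with a deliberately tiny classifier. This is not how ResNet50 or Core ML work: it swaps in a nearest-centroid classifier, and the `Sample` and `CentroidModel` types, the two made-up features, and every number are assumptions for illustration only.

```swift
/// A labelled training example: a feature vector plus its category (hypothetical type).
struct Sample {
    let features: [Double]
    let label: String
}

/// Toy "model": one average feature vector (centroid) per label.
struct CentroidModel {
    let centroids: [String: [Double]]

    /// Step 2: build the model by averaging the features of each label.
    init(training samples: [Sample]) {
        var sums: [String: [Double]] = [:]
        var counts: [String: Int] = [:]
        for sample in samples {
            let current = sums[sample.label]
                ?? Array(repeating: 0.0, count: sample.features.count)
            sums[sample.label] = zip(current, sample.features).map { $0 + $1 }
            counts[sample.label, default: 0] += 1
        }
        var result: [String: [Double]] = [:]
        for (label, sum) in sums {
            let n = Double(counts[label] ?? 1)
            result[label] = sum.map { $0 / n }
        }
        centroids = result
    }

    /// Step 3: classify an unseen input by picking the label with the closest centroid.
    func predict(_ features: [Double]) -> String? {
        func squaredDistance(_ a: [Double], _ b: [Double]) -> Double {
            return zip(a, b).reduce(0.0) { acc, pair in
                acc + (pair.0 - pair.1) * (pair.0 - pair.1)
            }
        }
        return centroids.min { lhs, rhs in
            squaredDistance(lhs.value, features) < squaredDistance(rhs.value, features)
        }?.key
    }
}

// Step 1: "collected" training data. The two features are invented placeholders.
let trainingData = [
    Sample(features: [0.9, 0.2], label: "cat"),
    Sample(features: [0.8, 0.3], label: "cat"),
    Sample(features: [0.3, 0.8], label: "dog"),
    Sample(features: [0.2, 0.9], label: "dog"),
]

let model = CentroidModel(training: trainingData)
print(model.predict([0.7, 0.4]) ?? "unknown")
```

The shape is the same as in the slide: data goes in once to build the model, and afterwards the model alone answers questions about inputs it has never seen.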
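  12. Model creation is skipped in this talk
      - Collecting and cleaning training data is fairly laborious
      - It needs a reasonably powerful machine and computation time
      - It needs machine learning knowledge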

  13. Preparing the model
      - For simplicity, use a pre-trained model
      - Models are distributed on Apple's site (in .mlmodel format)
      - https://developer.apple.com/machine-learning/

  14. List of distributed models
      Each model is suited to different kinds of images and differs in size (5 MB to 553.5 MB)
      - MobileNets
      - SqueezeNet
      - Places205-GoogLeNet
      - ResNet50
      - Inception v3
      - VGG16
  15. This time we use ResNet50
      - 1,000 categories such as trees, animals, food, vehicles, and people
      - Size: 102.6 MB
      - MIT license

  16. Adding the model to the project

  17. Drag and drop it into Xcode

  18. A model class is generated automatically
      - A model class is created automatically in a file named <model name>.swift
      - Example: Resnet50.swift (excerpt)

  19. Capturing camera images

  20. // Uses Swift 4+ API names; needs `import AVFoundation` at the top of the file.
      private func startCapture() {
          let captureSession = AVCaptureSession()
          captureSession.sessionPreset = .photo

          // Specify the input
          guard let captureDevice = AVCaptureDevice.default(for: .video) else { return }
          guard let input = try? AVCaptureDeviceInput(device: captureDevice) else { return }
          guard captureSession.canAddInput(input) else { return }
          captureSession.addInput(input)

          // Specify the output
          let output = AVCaptureVideoDataOutput()
          output.setSampleBufferDelegate(self, queue: DispatchQueue(label: "VideoQueue"))
          guard captureSession.canAddOutput(output) else { return }
          captureSession.addOutput(output)

          // Set up the preview
          let previewLayer = AVCaptureVideoPreviewLayer(session: captureSession)
          previewLayer.videoGravity = .resizeAspectFill
          previewLayer.frame = view.bounds
          view.layer.insertSublayer(previewLayer, at: 0)

          // Start capturing
          captureSession.startRunning()
      }
  21. Delegate called for every captured frame
      extension ViewController: AVCaptureVideoDataOutputSampleBufferDelegate {
          func captureOutput(_ output: AVCaptureOutput,
                             didOutput sampleBuffer: CMSampleBuffer,
                             from connection: AVCaptureConnection) {
              // Convert the CMSampleBuffer to a CVPixelBuffer
              guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
              // The Vision.framework processing (the image recognition part) goes here
          }
      }
  22. The image recognition part

  23. Main classes used with Vision
      - VNCoreMLModel
      - VNCoreMLRequest
      - VNImageRequestHandler
      - VNObservation

  24. VNCoreMLModel
      - A container class for using a Core ML model with Vision

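  25. VNCoreMLRequest
      - A class for requesting image recognition from Core ML
      - The kind of result depends on the model's output format:
        - image → class (classification result)
        - image → feature values
        - image → image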
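  26. VNImageRequestHandler
      - A class for running one or more image recognition requests (VNCoreMLRequest) on a single image
      - The image format of the recognition target is specified at initialization:
        - CVPixelBuffer
        - CIImage
        - CGImage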
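  27. VNObservation
      - The abstract class for image recognition results
      - One of its subclasses is returned as the result
      - Has a confidence property representing how certain the recognition is (VNConfidence is a type alias for Float)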

  28. VNObservation subclasses
      - VNClassificationObservation: has an identifier property holding the class name
      - VNCoreMLFeatureValueObservation: has a featureValue property holding the feature data
      - VNPixelBufferObservation: has a pixelBuffer property holding the image data

  29. To summarize
      - VNCoreMLModel (the bundled model)
      - VNCoreMLRequest (the image recognition request)
      - VNImageRequestHandler (runs the request)
      - VNObservation (the recognition result)

  30. The actual implementation code

  31. Initializing the model class
      // Initialize the Core ML model class
      guard let model = try? VNCoreMLModel(for: Resnet50().model) else { return }
  32. Creating the image recognition request
      // Create the image recognition request (arguments: the model and a completion handler)
      let request = VNCoreMLRequest(model: model) { [weak self] (request: VNRequest, error: Error?) in
          guard let results = request.results as? [VNClassificationObservation] else { return }
          // Show up to the top 3 classifications with their confidence.
          // An identifier may list several comma-separated names, so take only the first.
          // `map` replaces the original `flatMap`, which only worked this way in Swift 3.
          let displayText = results.prefix(3)
              .map { "\(Int($0.confidence * 100))% \($0.identifier.components(separatedBy: ", ")[0])" }
              .joined(separator: "\n")
          DispatchQueue.main.async {
              self?.textView.text = displayText
          }
      }
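The formatting step inside the completion handler can be tried on its own, without a camera or a model. The sketch below is an assumption-laden stand-in: `Classification` is a hypothetical struct mirroring only the `identifier` and `confidence` properties of `VNClassificationObservation`, the sample values are invented, and the explicit sort is added so the helper does not rely on the input order.

```swift
import Foundation

/// Hypothetical stand-in for the two VNClassificationObservation properties used here.
struct Classification {
    let identifier: String
    let confidence: Float
}

/// Formats the most confident results as "<percent>% <first name>", one per line.
func formatTopResults(_ results: [Classification], limit: Int = 3) -> String {
    return results
        .sorted { $0.confidence > $1.confidence }  // highest confidence first
        .prefix(limit)
        .map { result -> String in
            // Identifiers may hold several comma-separated names; keep only the first.
            let name = result.identifier.components(separatedBy: ", ").first
                ?? result.identifier
            return "\(Int(result.confidence * 100))% \(name)"
        }
        .joined(separator: "\n")
}

// Invented sample values, deliberately out of order.
let sampleResults = [
    Classification(identifier: "Egyptian cat", confidence: 0.125),
    Classification(identifier: "tabby, tabby cat", confidence: 0.75),
    Classification(identifier: "lynx, catamount", confidence: 0.03125),
    Classification(identifier: "tiger cat", confidence: 0.0625),
]

let text = formatTopResults(sampleResults)
print(text)
```

Keeping this logic in a small pure function also makes it easy to unit test separately from the Vision callback.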
  33. Performing the image recognition request
      // Perform the image recognition request on the CVPixelBuffer
      try? VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:]).perform([request])

  34. The image recognition part, complete
      guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
      guard let model = try? VNCoreMLModel(for: Resnet50().model) else { return }
      let request = VNCoreMLRequest(model: model) { [weak self] (request: VNRequest, error: Error?) in
          guard let results = request.results as? [VNClassificationObservation] else { return }
          // `map` replaces the original `flatMap`, which only worked this way in Swift 3.
          let displayText = results.prefix(3)
              .map { "\(Int($0.confidence * 100))% \($0.identifier.components(separatedBy: ", ")[0])" }
              .joined(separator: "\n")
          DispatchQueue.main.async {
              self?.textView.text = displayText
          }
      }
      try? VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:]).perform([request])
  35. Demo video

  36. None
  37. What is a "tabby"?

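  38. tabby = toraneko (tiger cat)!
      A toraneko is a cat with a tiger-like striped coat, also called a tabby. Besides plain stripes, the coats vary widely: broken stripes, spotted patterns, swirled patterns, finely broken stripes, and so on. (Source: Wikipedia)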

  39. Summary

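  40. - With a pre-trained model, the implementation itself is easy!
      - It even tells you what kind of cat it is
      - Being able to build your own models would open up many more possibilities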

  41. What I originally wanted to build
      - An app that automatically adds hashtags for Instagram
      - The caption API for passing hashtags had already been discontinued \(^o^)/

  42. Sample code
      The sample code from this talk is available here:
      https://github.com/shtnkgm/VisionFrameworkSample
      Note: be mindful of the NDA when publishing screenshots

  43. The end

  44. References (1)
      - Build more intelligent apps with machine learning. / Apple
      - Vision / Apple Developer Documentation
      - [WWDC2017] Trying out text detection with Vision.framework [iOS11]
      - Developing a Momoclo face recognition app with Keras + iOS11 CoreML + Vision Framework
      - [Core ML] Creating an .mlmodel file / Fenrir
  45. References (2)
      - [iOS 11] Trying out image classification with CoreML (without using Vision.Framework) #WWDC2017
      - Recognizing places with Places205-GoogLeNet / fabo.io
      - I spoke about "iOS and Deep Learning" at the iOSDC reject conference
      - [iOS 10][Neural Network] Understanding BNNS, added to Accelerate, through OSS ~XOR edition~