first step of ML Kit

Slide 1

Slide 1 text

ML Kit Overview and Base APIs
Yuki Anzai (@yanzm), Google Developers Expert for Android

Slide 2

Slide 2 text

What is ML Kit?
• One of Firebase's features
• A mobile SDK that makes it easy to build machine-learning features into an app
• Currently in beta
• Available on iOS and Android
• https://firebase.google.com/docs/ml-kit/

Slide 3

Slide 3 text

on-device or in the cloud

Feature                   On-device   Cloud
Text recognition          Yes         Yes
Face detection            Yes         -
Barcode scanning          Yes         -
Image labeling            Yes         Yes
Landmark recognition      -           Yes
Custom model inference    Yes         -

Slide 4

Slide 4 text

on-device vs Cloud
• on-device API
  • Runs locally, so it is fast
  • Firebase downloads the machine-learning model in advance
• Cloud API
  • Processed on the server, with more capable models
  • Requires a network connection
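The trade-off above can be sketched as a small selection routine. This is only an illustration of the deck's availability table and the on-device/Cloud distinction: `Feature`, `Backend`, and `chooseBackend` are hypothetical names, not part of the ML Kit SDK, and the fallback policy (prefer on-device unless the caller asks for Cloud) is an assumption.

```kotlin
// Hypothetical model of the availability table from the slides; not an ML Kit API
enum class Backend { ON_DEVICE, CLOUD }

enum class Feature(val onDevice: Boolean, val cloud: Boolean) {
    TEXT_RECOGNITION(onDevice = true, cloud = true),
    FACE_DETECTION(onDevice = true, cloud = false),
    BARCODE_SCANNING(onDevice = true, cloud = false),
    IMAGE_LABELING(onDevice = true, cloud = true),
    LANDMARK_RECOGNITION(onDevice = false, cloud = true),
    CUSTOM_MODEL_INFERENCE(onDevice = true, cloud = false)
}

// Returns the backend to use, or null when the feature cannot run right now.
// Assumed policy: use Cloud only when requested (or when it is the only option)
// and a network connection is available; otherwise stay on-device.
fun chooseBackend(
    feature: Feature,
    networkAvailable: Boolean,
    preferCloud: Boolean = false
): Backend? {
    val cloudUsable = feature.cloud && networkAvailable
    return when {
        preferCloud && cloudUsable -> Backend.CLOUD
        feature.onDevice -> Backend.ON_DEVICE
        cloudUsable -> Backend.CLOUD
        else -> null
    }
}

fun main() {
    println(chooseBackend(Feature.TEXT_RECOGNITION, networkAvailable = true, preferCloud = true))  // CLOUD
    println(chooseBackend(Feature.TEXT_RECOGNITION, networkAvailable = false, preferCloud = true)) // ON_DEVICE
    println(chooseBackend(Feature.LANDMARK_RECOGNITION, networkAvailable = false))                 // null
}
```

Returning null for a Cloud-only feature without a connection keeps the "requires a network connection" constraint visible to the caller instead of failing later.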

Slide 5

Slide 5 text

Pricing
https://firebase.google.com/pricing/
Your account's first 1000 Cloud Vision API calls/month are free
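As a rough illustration of the free quota, the sketch below counts how many calls in a month fall outside the first 1000. The function name `billableCalls` is hypothetical, the per-call price is deliberately left out because the slide does not state it, and how calls are counted across different Cloud APIs is not modeled here.

```kotlin
// Free monthly quota stated on the slide; pricing beyond it is not modeled
const val FREE_CALLS_PER_MONTH = 1000

// Number of calls in a month that exceed the free quota
fun billableCalls(callsThisMonth: Int): Int {
    require(callsThisMonth >= 0) { "call count must not be negative" }
    return maxOf(0, callsThisMonth - FREE_CALLS_PER_MONTH)
}

fun main() {
    println(billableCalls(800))  // 0
    println(billableCalls(1000)) // 0
    println(billableCalls(1250)) // 250
}
```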

Slide 6

Slide 6 text

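Text recognition (OCR)
• Features
  • Recognizes text in images
• on-device API
  • Free
  • Recognizes all Latin-script characters
• Cloud API
  • The first 1000 API calls each month are free (pay-as-you-go beyond 1000)
  • Recognizes more than 50 languages (including Japanese)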

Slide 7

Slide 7 text

No content

Slide 8

Slide 8 text

No content

Slide 9

Slide 9 text

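Face detection
• Features
  • on-device API only
  • Detects the face region and the positions of landmarks (eyes, cheeks, nose, ears, mouth)
  • Recognizes facial expressions (how open the eyes are, how much the face is smiling)
  • Can track the same face across video frames
  • 2D contour information made up of more than 100 points (face outline, eyes, eyebrows, nose, mouth)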

Slide 10

Slide 10 text

Face contour
https://firebase.google.com/docs/ml-kit/detect-faces

Slide 11

Slide 11 text

Barcode scanning
• Features
  • on-device API only
  • Supports most standard formats
    • 1D formats: Codabar, Code 39, Code 93, Code 128, EAN-8, EAN-13, ITF, UPC-A, UPC-E
    • 2D formats: Aztec, Data Matrix, PDF417, QR Code
  • Automatic format detection
  • Extraction of structured data
  • Detects barcodes regardless of their orientation

Slide 12

Slide 12 text

format : 256
valueType : 9
rawValue : WIFI:S:SB1Guest;P:12345;T:WEP;;
displayValue : SB1Guest 12345
boundingBox : Rect(300, 457 - 669, 824)
encryptionType : 3
ssid : SB1Guest
password : 12345
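To make "structured data" concrete, here is a minimal sketch of how the `rawValue` above relates to the `ssid` and `password` fields. ML Kit performs this extraction for you; the code below is not its implementation. `WifiInfo` and `parseWifi` are hypothetical names, and the parser is simplified: it does not handle backslash-escaped separators inside values.

```kotlin
// Hypothetical holder for the parsed fields; not an ML Kit class
data class WifiInfo(val ssid: String?, val password: String?, val encryption: String?)

// Parses a raw value of the form "WIFI:S:<ssid>;P:<password>;T:<type>;;"
// Simplification: escaped ';' or ':' characters inside values are not supported
fun parseWifi(rawValue: String): WifiInfo? {
    if (!rawValue.startsWith("WIFI:")) return null
    val fields = rawValue.removePrefix("WIFI:")
        .split(";")
        .filter { it.contains(":") }                                   // drop the empty trailing parts
        .associate { it.substringBefore(":") to it.substringAfter(":") } // "S:SB1Guest" -> S=SB1Guest
    return WifiInfo(fields["S"], fields["P"], fields["T"])
}

fun main() {
    val info = parseWifi("WIFI:S:SB1Guest;P:12345;T:WEP;;")
    println(info) // WifiInfo(ssid=SB1Guest, password=12345, encryption=WEP)
}
```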

Slide 13

Slide 13 text

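Image labeling
• Features
  • Analyzes the content of an image and labels what it recognizes: people, objects, places, activities, and so on
• on-device API
  • Free
  • Supports 400+ labels
• Cloud API
  • The first 1000 API calls each month are free (pay-as-you-go beyond 1000)
  • Supports 10,000+ labels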

Slide 14

Slide 14 text

label : Building   confidence : 0.77894384   entityId : /m/0cgh4
label : Palace     confidence : 0.75397676   entityId : /m/05zp8
label : landmark   confidence : 0.9432406    entityId : /m/05_5t0l
label : town       confidence : 0.9333225    entityId : /m/0dx1j

Slide 15

Slide 15 text

label : Food      confidence : 0.9649049    entityId : /m/02wbm
label : Cuisine   confidence : 0.91778296   entityId : /m/01ykh
label : food      confidence : 0.9399401    entityId : /m/02wbm
label : cuisine   confidence : 0.9263104    entityId : /m/01ykh

Slide 16

Slide 16 text

val options = FirebaseVisionLabelDetectorOptions
    .Builder()
    .setConfidenceThreshold(0.9f)
    .build()
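The effect of a confidence threshold can be illustrated without the SDK. The sketch below applies a 0.9 cutoff to the four labels from Slide 14; `Label` and `filterByConfidence` are hypothetical stand-ins, and treating the threshold as inclusive (`>=`) is an assumption rather than documented ML Kit behavior.

```kotlin
// Hypothetical stand-in for a detected label; not the ML Kit result class
data class Label(val text: String, val confidence: Float)

// Keeps only labels whose confidence meets the threshold (assumed inclusive)
fun filterByConfidence(labels: List<Label>, threshold: Float): List<Label> =
    labels.filter { it.confidence >= threshold }

fun main() {
    // Values copied from the Slide 14 output
    val labels = listOf(
        Label("Building", 0.77894384f),
        Label("Palace", 0.75397676f),
        Label("landmark", 0.9432406f),
        Label("town", 0.9333225f)
    )
    println(filterByConfidence(labels, 0.9f).map { it.text }) // [landmark, town]
}
```

Setting the threshold in the detector options, as the slide does, avoids this post-filtering step in app code.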

Slide 17

Slide 17 text

Landmark detection
• Features
  • Recognizes well-known landmarks in images
    • Landmark name
    • Geographic coordinates
    • Knowledge Graph entity ID
    • Region of the landmark within the image
  • The first 1000 API calls each month are free (pay-as-you-go beyond 1000)

Slide 18

Slide 18 text

landmark : Amsterdam Centraal Railway Station
confidence : 0.86155003
entityId : /m/0bbw52
locations : 52.378068, 4.899774
boundingBox : Rect(33, 504 - 956, 928)

landmark : Amsterdam
confidence : 0.5167069
entityId : /m/0k3p
locations : 52.373811, 4.890951
boundingBox : Rect(187, 644 - 757, 843)
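When several landmarks come back for one image, as above, an app usually needs to pick one. The sketch below selects the highest-confidence result and reads the size of its bounding box. `Box`, `Landmark`, and `bestLandmark` are hypothetical types that only mirror the printed fields, and the code assumes the box is printed as `Rect(left, top - right, bottom)`.

```kotlin
// Hypothetical types mirroring the fields printed on the slide; not ML Kit classes
data class Box(val left: Int, val top: Int, val right: Int, val bottom: Int) {
    val width: Int get() = right - left
    val height: Int get() = bottom - top
}

data class Landmark(
    val name: String,
    val confidence: Float,
    val latitude: Double,
    val longitude: Double,
    val box: Box
)

// Returns the result with the highest confidence, or null for an empty list
fun bestLandmark(results: List<Landmark>): Landmark? =
    results.maxByOrNull { it.confidence }

fun main() {
    // Values copied from the slide output
    val results = listOf(
        Landmark("Amsterdam Centraal Railway Station", 0.86155003f, 52.378068, 4.899774, Box(33, 504, 956, 928)),
        Landmark("Amsterdam", 0.5167069f, 52.373811, 4.890951, Box(187, 644, 757, 843))
    )
    val best = bestLandmark(results)
    println(best?.name)                                   // Amsterdam Centraal Railway Station
    println("${best?.box?.width} x ${best?.box?.height}") // 923 x 424
}
```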

Slide 19

Slide 19 text

Custom model inference
• Host TensorFlow Lite models on Firebase
• The Firebase SDK handles downloading the model
• Models can also be updated
• A model bundled in the APK can also be used through the Firebase SDK
• Used as an on-device API