Trying Out Media Insights Engine, a Serverless Video and Image Analysis Platform / Introduce Media Insights Engine: a serverless media analysis framework

Trying Out Media Insights Engine, a Serverless Video and Image Analysis Platform
2019-12-19, RecoChoku Tech Night: five-company joint AWS re:Invent 2019 report session
松石浩輔, Mitene Division, mixi, Inc.


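About me
• 松石浩輔 (twitter: @_sobataro)
• Mitene Division, Vantage Studio, mixi, Inc.
• Engineer on the family album app "みてね" (Mitene)
• Develops and operates automatic content generation and suggestion features such as "1-second movies" and "automatically suggested photo books"
• Builds and operates the image and video analysis infrastructure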

Contents
• Introduction
• What is Media Insights Engine?
• Architecture of Media Insights Engine
• Deploying Media Insights Engine
• Sample web app demo
• How to build a Custom Operator
• Summary

Introduction: what this talk covers
• A report on the re:Invent 2019 Builders Session "Extract value from content archives with Media Insights Engine"
  ◦ Builders Session: a session held with one or two AWS developers and about six participants
  ◦ Slides are here
• Based on that session, an account of actually trying out Media Insights Engine
• These slides are published on Speaker Deck: https://speakerdeck.com/_sobataro/introduce-media-insights-engine-a-serverless-media-analysis-framework
• (Media Insights Engine is not a service newly announced at this re:Invent)

What is Media Insights Engine?

Overview of Media Insights Engine
• A framework for quickly building serverless analysis workflows for video, images, audio, and text
• https://github.com/awslabs/aws-media-insights-engine
• Bundles various AI/ML services with AWS Step Functions and makes them executable through an API
  ◦ Amazon Rekognition
  ◦ Amazon Transcribe, Amazon Translate, Amazon Polly
  ◦ AWS Elemental MediaConvert
  ◦ Anything else (Custom Operator)
• Use cases
  ◦ Content analysis and indexing of media
  ◦ Content moderation of media
  ◦ Transcription, translation, and speech synthesis of media
• Currently in preview
  ◦ Possibly aiming to become a Well-Architected reference architecture in the future?

Architecture of Media Insights Engine

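Architecture of Media Insights Engine (1/3)
Workflow (orange in the diagram)
• The sequence of processing steps that analyzes and transforms the input media
• Defined with AWS Step Functions
Operator (light blue in the diagram)
• A building block of a Workflow
• An analysis or transformation applied to the media
  ◦ Example: speech-to-text with Amazon Transcribe
• Defined with AWS Step Functions
• Custom Operators can also be implemented
https://github.com/awslabs/aws-media-insights-engine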
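As a mental model only (MIE itself defines these pieces as Step Functions state machines), a Workflow can be pictured as an ordered list of stages, each holding Operators that analyze or transform the media. Every class and function name below is hypothetical and exists only for this illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical types for illustration; not part of MIE.
Operator = Callable[[Dict], Dict]  # takes the media dict, returns new metadata


@dataclass
class Stage:
    name: str
    operators: List[Operator] = field(default_factory=list)


@dataclass
class Workflow:
    name: str
    stages: List[Stage] = field(default_factory=list)

    def run(self, media: Dict) -> Dict:
        """Run stages in order; every operator in a stage sees that stage's input."""
        current = dict(media)
        for stage in self.stages:
            stage_output = dict(current)
            for op in stage.operators:
                stage_output.update(op(current))
            current = stage_output
        return current


def fake_transcribe(media: Dict) -> Dict:
    # Stand-in for a speech-to-text operator such as Amazon Transcribe.
    return {"Text": f"transcript of {media['Audio']}"}


def word_count(media: Dict) -> Dict:
    # Stand-in for a text operator that consumes the previous stage's output.
    return {"WordCount": len(media["Text"].split())}


workflow = Workflow(
    name="ToyWorkflow",
    stages=[Stage("Audio", [fake_transcribe]), Stage("Text", [word_count])],
)
result = workflow.run({"Audio": "talk.mp3"})
```

The point of the sketch is only the shape: later stages consume what earlier stages produced, which is why a text Operator can work on the output of a transcription Operator.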

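Architecture of Media Insights Engine (2/3)
Control Plane (orange in the diagram)
• Executes Workflows against media
• Communicates with the Data Plane
Data Plane (light blue in the diagram)
• Stores the input media itself and the metadata (analysis results) output by Workflows
Workflow API
• The API that accepts media analysis requests
• Defines Workflows and starts their execution
https://github.com/awslabs/aws-media-insights-engine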
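A sketch of what a request body for starting a Workflow through the Workflow API might look like. The field layout, the `/workflow/execution` path mentioned in the comment, and the bucket and key names are assumptions for illustration and are not confirmed by this talk; the MIE developer guide is the place to check the actual API.

```python
import json
from typing import Dict


def build_execution_request(workflow_name: str, bucket: str, key: str,
                            media_type: str = "Video") -> Dict:
    """Build a JSON-serializable body asking the Control Plane to run a workflow.

    The layout is an assumption, modeled on the operator input shape used in
    this talk (operator.input["Media"][...]["S3Bucket"] / ["S3Key"]).
    """
    return {
        "Name": workflow_name,
        "Input": {"Media": {media_type: {"S3Bucket": bucket, "S3Key": key}}},
    }


# Placeholder bucket and key; the workflow name is borrowed from
# source/workflows/MieCompleteWorkflow.yaml.
body = build_execution_request(
    "MieCompleteWorkflow", "my-dataplane-bucket", "upload/keynote.mp4"
)
payload = json.dumps(body)
# The payload would then be POSTed to <Workflow API endpoint>/workflow/execution
# (assumed path), with whatever authentication the deployment requires.
```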

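Architecture of Media Insights Engine (3/3)
Data Plane Pipeline (orange in the diagram)
• Stores the metadata (analysis results) of the input media in DynamoDB and S3
• Outputs it to Consumers through Kinesis Data Streams
Data Plane Consumer (light blue in the diagram)
• Stores and uses the metadata through Lambda
• An Elasticsearch Consumer is provided out of the box
• Custom Consumers can also be implemented
https://github.com/awslabs/aws-media-insights-engine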
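A minimal sketch of how a custom Consumer Lambda might read records arriving from Kinesis Data Streams. The base64-encoded `Records[].kinesis.data` layout is the standard Lambda event format for Kinesis; the assumption here is that each record carries a JSON document, and the payload fields in the sample are hypothetical.

```python
import base64
import json
from typing import Dict, List


def decode_kinesis_records(event: Dict) -> List[Dict]:
    """Decode the base64-encoded JSON payload of each Kinesis record in a Lambda event."""
    decoded = []
    for record in event.get("Records", []):
        raw = base64.b64decode(record["kinesis"]["data"])
        decoded.append(json.loads(raw.decode("utf-8")))
    return decoded


def lambda_handler(event, context):
    # A real consumer would index or store each item here; this sketch only counts them.
    items = decode_kinesis_records(event)
    return {"processed": len(items)}


# Local check with a hand-built event; the payload fields are hypothetical.
sample_payload = {"asset_id": "abc-123", "operator": "Worddist"}
encoded = base64.b64encode(json.dumps(sample_payload).encode("utf-8")).decode("ascii")
sample_event = {"Records": [{"kinesis": {"data": encoded}}]}
outcome = lambda_handler(sample_event, None)
```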

Deploying Media Insights Engine

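Deploying Media Insights Engine
• A prebuilt CloudFormation template for the demo architecture is available
  ◦ Ready to test after a few clicks and a wait of about 30 minutes
  ◦ Cleanup after testing is just CloudFormation's Delete Stack plus manually deleting the S3 buckets (probably)
• A build script for building from source and a developer guide are also provided
  ◦ The build script (a shell script) outputs the CloudFormation templates and the other required assets to S3
  ◦ Once the environment is set up, a build takes about 5 to 10 minutes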

1. Click Launch Stack on GitHub
2. Click Next

3. Enter Stack name and Admin Email, then click Next
4. Click Next

5. Read the warning at the bottom of the page and click Create stack
6. Ready in about 30 minutes
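The console steps above could presumably also be scripted. The sketch below only assembles keyword arguments for boto3's CloudFormation `create_stack` call; the parameter key `AdminEmail`, the template URL, and the set of capabilities are assumptions for illustration and would need to be checked against the actual template.

```python
from typing import Dict


def build_create_stack_args(stack_name: str, template_url: str, admin_email: str) -> Dict:
    """Assemble keyword arguments for boto3's CloudFormation create_stack call."""
    return {
        "StackName": stack_name,
        "TemplateURL": template_url,
        # "AdminEmail" is an assumed parameter key matching the console's Admin Email field.
        "Parameters": [{"ParameterKey": "AdminEmail", "ParameterValue": admin_email}],
        # Assumed to correspond to the acknowledgement shown before Create stack.
        "Capabilities": ["CAPABILITY_IAM", "CAPABILITY_NAMED_IAM", "CAPABILITY_AUTO_EXPAND"],
    }


args = build_create_stack_args(
    "mie-demo",
    "https://example-bucket.s3.amazonaws.com/example.template",  # placeholder URL
    "admin@example.com",
)
# With AWS credentials configured, the stack could then be created with:
#   import boto3
#   boto3.client("cloudformation").create_stack(**args)
```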

Sample Web App Demo

1. Sign-in screen
2. Upload a video (the Workflow settings can be changed here)

3. Collection screen
4. Analysis results screen: general object recognition results from Rekognition
* The test video is a re:Invent 2019 keynote: https://www.youtube.com/watch?v=7-31KgImGgU

5. Transcription results from Transcribe
6. Translation results from Translate

Workflow of the sample web app (Step Functions state machine)

Workflow of the sample web app (Step Functions state machine)
Stage labels in the diagram: Video Stage, Text Synthesis Stage, Text Stage, Audio Stage

Workflow of the sample web app (Step Functions state machine)
Operators in the diagram: Generic Lookup, Thumbnail, Face Search, Moderation, Person Tracking, Celebrity, Face Detection, Label Detection, Media Convert, Transcribe, Entities, Translate, Key Phrase, Custom (described later), Polly

How to Build a Custom Operator

What is a Custom Operator?
• Your own Operator that performs arbitrary processing within a Workflow
• Implemented as a Lambda Function
The example that follows
• Implements a Custom Operator that outputs the word frequency distribution of a text
• Adds it to the sample web app's Workflow
  ◦ It processes the transcript text output by Amazon Transcribe
https://github.com/awslabs/aws-media-insights-engine

How to build a Custom Operator (overview)
Required work: follow the steps below, as described in the developer guide
1. Implement the Operator itself as a Lambda Function
2. Add the Operator to the Operator Library (CloudFormation template)
  ◦ Lambda Function
  ◦ IAM Role
  ◦ Registration with the Control Plane
  ◦ Export of the Operator name
3. Add the Operator to a Workflow
4. Make the Operator's output receivable by a Data Pipeline Consumer
5. Make the build script build the Lambda Function
6. Test and deploy
The full source code is here

1. Define the Operator as a Lambda Function

    # source/operators/nltk/worddist.py
    import json

    import boto3
    import nltk
    from MediaInsightsEngineLambdaHelper import DataPlane
    from MediaInsightsEngineLambdaHelper import MediaInsightsOperationHelper


    def lambda_handler(event, context):
        operator = MediaInsightsOperationHelper(event)

        # Fetch the text output by Amazon Transcribe
        s3 = boto3.client('s3')
        transcribe_json = s3.get_object(
            Bucket=operator.input["Media"]["Text"]["S3Bucket"],
            Key=operator.input["Media"]["Text"]["S3Key"]
        )["Body"].read().decode("utf-8")
        transcribe_metadata = json.loads(transcribe_json)
        transcript = transcribe_metadata["results"]["transcripts"][0]["transcript"]

        # Compute the word frequency distribution with nltk
        words = nltk.word_tokenize(transcript)
        lower_words = [w.lower() for w in words]
        freqdist = nltk.FreqDist(lower_words)
        result = {"Results": dict(freqdist)}

        # Store the result in the Data Plane
        dataplane = DataPlane()
        metadata = dataplane.store_asset_metadata(
            operator.asset_id, operator.name, operator.workflow_execution_id, result
        )

        # Finish the Operator
        operator.add_media_object('Text', metadata['Bucket'], metadata['Key'])
        operator.update_workflow_status("Complete")
        return operator.return_output_object()
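To try the core logic locally without AWS or nltk, the frequency computation can be approximated with the standard library. This is a simplified stand-in, not the Operator itself: it swaps `nltk.word_tokenize` and `nltk.FreqDist` for a regular expression and `collections.Counter`, so token boundaries may differ from nltk's. The function name and sample transcript are made up for illustration.

```python
import json
import re
from collections import Counter
from typing import Dict


def word_distribution(transcribe_json: str) -> Dict[str, Dict[str, int]]:
    """Return {"Results": {token: count}} for an Amazon Transcribe result document."""
    metadata = json.loads(transcribe_json)
    transcript = metadata["results"]["transcripts"][0]["transcript"]
    # Simplified tokenizer: runs of word characters or apostrophes,
    # or a single punctuation character.
    tokens = re.findall(r"[\w']+|[^\w\s]", transcript.lower())
    return {"Results": dict(Counter(tokens))}


# Hand-written sample in the same shape the Operator reads from S3.
sample = json.dumps(
    {"results": {"transcripts": [{"transcript": "The cloud and the edge, and you."}]}}
)
dist = word_distribution(sample)
```

As in the Operator's output, punctuation marks are counted as tokens alongside words.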

2. Add the Operator to the Operator Library (1/3)
Define the Lambda Function's IAM Role in CloudFormation

    # source/operators/operator-library.yaml
    Resources:
      worddistLambdaRole:
        Type: "AWS::IAM::Role"
        Properties:
          ManagedPolicyArns:
            - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
            - arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess
          AssumeRolePolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Action:
                  - "sts:AssumeRole"
                Effect: "Allow"
                Principal:
                  Service: "lambda.amazonaws.com"
          Policies:
            - PolicyName: "WorddistLambdaAccess"
              PolicyDocument:
                Version: "2012-10-17"
                Statement:
                  - Action:
                      - "s3:GetObject"
                      - "s3:PutObject"
                    Resource: !Sub "arn:aws:s3:::${DataPlaneBucket}/*"
                    Effect: "Allow"
                  - Action: "lambda:InvokeFunction"
                    Resource: !Ref DataPlaneHandlerArn
                    Effect: "Allow"

2. Add the Operator to the Operator Library (2/3)
Define the Lambda Function in CloudFormation

    # source/operators/operator-library.yaml
    Resources:
      startWorddist:
        Type: "AWS::Lambda::Function"
        Properties:
          Handler: "worddist.lambda_handler"
          Layers:
            - !Ref MediaInsightsEnginePython37Layer
          Role: !GetAtt worddistLambdaRole.Arn
          Code:
            S3Bucket: !FindInMap ["SourceCode", "General", "S3Bucket"]
            S3Key: !Join ["/", [!FindInMap ["SourceCode", "General", "KeyPrefix"], "worddist.zip"]]
          Runtime: "python3.7"
          Timeout: 300
          Environment:
            Variables:
              OPERATOR_NAME: "Worddist"
              DataplaneEndpoint: !Ref "DataPlaneEndpoint"
              WorddistRole: !GetAtt worddistLambdaRole.Arn
              botoConfig: '{"user_agent": "aws-tm-mie/python3.7/lambda"}'

2. Add the Operator to the Operator Library (3/3)
Register the Operator with the Control Plane in CloudFormation and export the Operator name

    # source/operators/operator-library.yaml
    Resources:
      WorddistOperation:
        Type: Custom::CustomResource
        Properties:
          ServiceToken: !Ref WorkflowCustomResourceArn
          ResourceType: "Operation"
          Name: "Worddist"
          Type: "Sync"
          Configuration: { "MediaType": "Text", "Enabled": true }
          StartLambdaArn: !GetAtt startWorddist.Arn
          StateMachineExecutionRoleArn: !GetAtt StepFunctionRole.Arn
    Outputs:
      WorddistOperation:
        Description: "Operation name of Worddist"
        Value: !GetAtt WorddistOperation.Name
        Export:
          Name: !Join [":", [!Ref "AWS::StackName", Worddist]]

3. Add the Operator to a Workflow
Add the Operator to the Workflow's Text Stage in CloudFormation

    # source/workflows/MieCompleteWorkflow.yaml
    Resources:
      defaultTextStage:
        Type: Custom::CustomResource
        Properties:
          # (omitted)
          Operations:
            # (omitted)
            - Fn::ImportValue:
                Fn::Sub: "${OperatorLibraryStack}:Worddist"

Remaining work
4. Make the Operator's output receivable by a Data Pipeline Consumer
  • Implement this in source/consumers/elastic/lambda_handler.py
5. Make the build script build the Lambda Function
  • Modify deployment/build-s3-dist.sh
  • It only needs to zip the Operator implementation and place it under deployment/dist/
6. Test and deploy
  • (How to test the Operator on its own is omitted)
  • Run build-s3-dist.sh
  • Run and apply the CloudFormation template it outputs
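The packaging in step 5 is done by the shell build script; purely as an illustration of what "zip the Operator implementation" amounts to, here is a Python sketch using the standard `zipfile` module. The helper name and the temporary paths are hypothetical, and third-party dependencies such as nltk would still have to be bundled or supplied separately, which this sketch does not cover.

```python
import tempfile
import zipfile
from pathlib import Path


def package_operator(source_file: Path, dist_dir: Path, zip_name: str) -> Path:
    """Zip a single-file operator into dist_dir and return the archive path."""
    dist_dir.mkdir(parents=True, exist_ok=True)
    archive = dist_dir / zip_name
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        # Keep the file at the archive root so a handler such as
        # "worddist.lambda_handler" can be resolved.
        zf.write(source_file, arcname=source_file.name)
    return archive


# Local demonstration in a temporary directory with a stub operator file.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "worddist.py"
    src.write_text("def lambda_handler(event, context):\n    return event\n")
    archive_path = package_operator(src, Path(tmp) / "dist", "worddist.zip")
    with zipfile.ZipFile(archive_path) as zf:
        packaged_names = zf.namelist()
```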

Workflow example: workflow of the sample web app (Step Functions state machine) (repeated)
Operators in the diagram: Generic Lookup, Thumbnail, Face Search, Moderation, Person Tracking, Celebrity, Face Detection, Label Detection, Media Convert, Transcribe, Entities, Translate, Key Phrase, Custom Operator, Polly
With the work so far, the Custom Operator has been added to the Text Stage of the sample web app's Workflow

Running a Workflow that includes the Custom Operator, and its output
Execution
• This time it was run directly from the sample web app
• You can also define your own Workflow or run a Workflow through the API
Getting the output
• Through the Dataplane API
• Through the Data Pipeline API: GET /metadata/{asset_id}
• Through a Data Pipeline Consumer

    {
      "Results": {
        ".": 49,
        "and": 46,
        "the": 39,
        "of": 33,
        ",": 33,
        "a": 25,
        "to": 25,
        "you": 24,
        "we": 23,
        "that": 21,
        "in": 16,
        "have": 16,
        "it": 15,
        "on": 13,
        "so": 10,
        "tensorflow": 9,
        # (omitted)
      }
    }
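Once the metadata has been retrieved, the counts can be post-processed like any other dictionary. The sketch below, with a hypothetical helper name, ranks the most frequent words while skipping tokens made up only of punctuation; the sample data is an excerpt of the output shown in this talk.

```python
import string
from typing import Dict, List, Tuple


def top_words(results: Dict[str, int], n: int = 3) -> List[Tuple[str, int]]:
    """Return the n most frequent tokens, skipping pure-punctuation tokens."""
    words = {
        token: count
        for token, count in results.items()
        if not all(ch in string.punctuation for ch in token)
    }
    return sorted(words.items(), key=lambda item: item[1], reverse=True)[:n]


# Excerpt of the Operator output from this talk.
output = {"Results": {".": 49, "and": 46, "the": 39, "of": 33, ",": 33, "a": 25, "to": 25}}
top = top_words(output["Results"])
# top == [("and", 46), ("the", 39), ("of", 33)]
```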

Summary

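Recap of the talk
• Introduced Media Insights Engine
  ◦ A framework for quickly building serverless analysis workflows for video, images, audio, and text
• Explained the architecture and how to deploy it
  ◦ Main components: Workflow, Operator, Data Plane, Control Plane, Data Pipeline
  ◦ Deployed with CloudFormation
• Demo of the sample web app
  ◦ Upload a video from the browser
  ◦ Analyze it with Amazon's various AI/ML services
  ◦ The analysis results are loaded into Elasticsearch and become searchable
• How to build a Custom Operator
  ◦ Implement it by following the developer guide
  ◦ Create a Lambda Function and add it to a Workflow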

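Personal impressions after trying it
• As expected of a preview release, using it directly takes some determination
  ◦ The documentation is insufficient
  ◦ The developer I spoke with at the Builders Session also said that load testing has not been done yet
• It looks like it would be convenient once it matures a bit more and is incorporated into the Well-Architected reference architectures
• It is a useful reference when building a new analysis platform for images, video, audio, and text
  ◦ The architecture itself, which combines Lambda, SQS, DynamoDB, Step Functions, and so on
  ◦ A CloudFormation template that actually works is also provided
• I want to keep watching it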
