Basics of Speech Recognition with the Speech Framework
Masashi
January 29, 2025
Mobile Study Session: Wantedly × teamLab × Sansan #18 (Adopting New Technologies)
https://sansan.connpass.com/event/338706/
Transcript
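Basics of Speech Recognition with the Speech Framework
Masashi Kawabe
Software Engineer, Smart Home Team, NOT A HOTEL Inc.
Home Controller
The rooms have no switches or remote controls; everything is operated from the Home Controller. At any NOT A HOTEL anywhere in the world, you can operate every device without hesitation, just as you would at home.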
Speech Framework
• The standard iOS framework for speech recognition
• Recognizes prerecorded audio files as well as real-time audio from microphone input
• Supports Japanese
• Available on iOS 10.0 and later
Minimal implementation of speech recognition with the Speech Framework
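The code on this slide did not survive in the transcript. As a stand-in, here is a minimal sketch of file-based recognition, not the speaker's implementation. The type `FileTranscriber`, the error enum `TranscriptionError`, and the `ja-JP` locale are assumptions chosen for illustration. The app's Info.plist is assumed to contain `NSSpeechRecognitionUsageDescription`.

```swift
import Speech

// Hypothetical error type used only by this sketch.
enum TranscriptionError: Error {
    case notAuthorized
    case recognizerUnavailable
}

// Hypothetical helper that transcribes a prerecorded audio file.
// The caller must keep the instance alive until the completion handler runs.
final class FileTranscriber {
    // The initializer returns nil for unsupported locales.
    private let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "ja-JP"))
    // Keep a reference so the task can be cancelled if needed.
    private var task: SFSpeechRecognitionTask?

    func transcribe(fileAt audioFileURL: URL,
                    completion: @escaping (Result<String, Error>) -> Void) {
        // Ask the user for permission before starting recognition.
        SFSpeechRecognizer.requestAuthorization { [weak self] status in
            guard let self = self else { return }
            guard status == .authorized else {
                completion(.failure(TranscriptionError.notAuthorized))
                return
            }
            guard let recognizer = self.recognizer, recognizer.isAvailable else {
                completion(.failure(TranscriptionError.recognizerUnavailable))
                return
            }
            let request = SFSpeechURLRecognitionRequest(url: audioFileURL)
            // The handler can fire several times; only the final result is forwarded.
            self.task = recognizer.recognitionTask(with: request) { result, error in
                if let error = error {
                    completion(.failure(error))
                } else if let result = result, result.isFinal {
                    completion(.success(result.bestTranscription.formattedString))
                }
            }
        }
    }

    func cancel() {
        task?.cancel()
        task = nil
    }
}
```

For live microphone input, the same recognizer would be paired with an `SFSpeechAudioBufferRecognitionRequest` fed from an audio engine, as described in the "Recognizing speech in live audio" article listed in the references.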
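Getting partial results of speech recognition
• The shouldReportPartialResults property of SFSpeechRecognitionRequest controls whether partial (intermediate) recognition results are reported
• The default is true (partial results are reported)
• If only the final recognition result is needed, set shouldReportPartialResults to false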
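The property described above can be sketched as follows. The function names `makeFinalOnlyRequest` and `handleRecognition` are hypothetical; the handler only illustrates how partial and final results can be told apart through `isFinal`.

```swift
import Speech

// Hypothetical helper: a live-audio request that reports only the final result.
func makeFinalOnlyRequest() -> SFSpeechAudioBufferRecognitionRequest {
    let request = SFSpeechAudioBufferRecognitionRequest()
    // Defaults to true; false suppresses intermediate results.
    request.shouldReportPartialResults = false
    return request
}

// Hypothetical result handler. With partial results enabled it is called
// repeatedly while the transcription is refined; isFinal marks the last call.
func handleRecognition(result: SFSpeechRecognitionResult?, error: Error?) {
    if let error = error {
        print("Recognition failed: \(error.localizedDescription)")
        return
    }
    guard let result = result else { return }
    let text = result.bestTranscription.formattedString
    print(result.isFinal ? "Final: \(text)" : "Partial: \(text)")
}
```

Given an existing recognizer, the two pieces fit together as `recognizer.recognitionTask(with: makeFinalOnlyRequest(), resultHandler: handleRecognition)`.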
Differences in behavior with and without partial results
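On-device speech recognition
• For privacy reasons, speech recognition can be performed on the device
• Recognition works without a network connection
• However, accuracy is lower than with server-based recognition
• Server-based recognition has its own drawbacks: a limit on the maximum recognition duration and a limit on the number of requests per day
• On-device recognition is enabled with the requiresOnDeviceRecognition property of SFSpeechRecognitionRequest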
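A minimal sketch of enabling on-device recognition, assuming iOS 13 or later. The function name `enableOnDeviceRecognitionIfSupported` is hypothetical. It checks `supportsOnDeviceRecognition` first, because support depends on the device and the recognizer's locale.

```swift
import Speech

// Hypothetical helper: turns on on-device recognition when the recognizer supports it.
// Returns true if on-device recognition was enabled for the request.
@available(iOS 13, macOS 10.15, *)
@discardableResult
func enableOnDeviceRecognitionIfSupported(for request: SFSpeechRecognitionRequest,
                                          using recognizer: SFSpeechRecognizer) -> Bool {
    // Without support, leave the request unchanged so server-based recognition can be used.
    guard recognizer.supportsOnDeviceRecognition else { return false }
    // Audio is not sent over the network when this is true.
    request.requiresOnDeviceRecognition = true
    return true
}
```

Falling back silently is a design choice made for this sketch; an app with strict privacy requirements might prefer to fail instead of using the server.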
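Customizing the language model
• Customizing the language model used for speech recognition can improve recognition accuracy for a specific use case
• Available on iOS 17.0 and later
• Using a customized language model requires on-device speech recognition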
Creating a customized language model
• PhraseCount objects register exact phrases in the custom model
• This makes specific phrases easier to recognize
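Creating a customized language model
• If the app uses uncommon words such as technical terms, you can define pairs of spelling and pronunciation and register them in the custom model
• Pronunciations are written in X-SAMPA format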
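The two registration mechanisms above can be sketched together, assuming iOS 17 or macOS 14 and later. The function name `exportCustomModelData`, the identifier `com.example.custom-lm`, and the output location are hypothetical. The phrase and the X-SAMPA pronunciation are illustrative values in the style of the WWDC23 session listed in the references, not the speaker's data.

```swift
import Speech

// Hypothetical helper: builds custom language model data and writes it to outputURL.
@available(iOS 17, macOS 14, *)
func exportCustomModelData(to outputURL: URL) async throws {
    let data = SFCustomLanguageModelData(
        locale: Locale(identifier: "en_US"),
        identifier: "com.example.custom-lm", // hypothetical identifier
        version: "1.0"
    ) {
        // Register an exact phrase; a higher count gives it more weight.
        SFCustomLanguageModelData.PhraseCount(
            phrase: "Play the Albin counter gambit",
            count: 10
        )
        // Pair a spelling (grapheme) with its X-SAMPA pronunciation.
        SFCustomLanguageModelData.CustomPronunciation(
            grapheme: "Winawer",
            phonemes: ["w I n aU @r"]
        )
    }
    // Serialize the training data to a file for later preparation on the device.
    try await data.export(to: outputURL)
}
```

The exported file is training data rather than a ready-to-use model. It still has to be prepared on the device before a recognition request can use it.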
Using a customized language model
• The custom language model is used by setting it on the customizedLanguageModel property of SFSpeechRecognitionRequest
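A minimal sketch of wiring a custom model into a request, assuming the iOS 17 API. The function name `makeCustomizedRequest`, its parameters, and the `LM` and `Vocab` file names in the caches directory are hypothetical. `modelDataURL` is assumed to point at previously exported custom language model data.

```swift
import Speech

// Hypothetical helper: prepares a custom language model and returns a request that uses it.
@available(iOS 17, macOS 14, *)
func makeCustomizedRequest(modelDataURL: URL,
                           clientIdentifier: String) async throws -> SFSpeechAudioBufferRecognitionRequest {
    // Locations where the prepared model and vocabulary are written (names are arbitrary).
    let cacheDir = FileManager.default.urls(for: .cachesDirectory, in: .userDomainMask)[0]
    let configuration = SFSpeechLanguageModel.Configuration(
        languageModel: cacheDir.appendingPathComponent("LM"),
        vocabulary: cacheDir.appendingPathComponent("Vocab")
    )
    // Preparation can take a while, so it should run off the main thread.
    try await SFSpeechLanguageModel.prepareCustomLanguageModel(
        for: modelDataURL,
        clientIdentifier: clientIdentifier,
        configuration: configuration
    )
    let request = SFSpeechAudioBufferRecognitionRequest()
    // A customized model only takes effect with on-device recognition.
    request.requiresOnDeviceRecognition = true
    request.customizedLanguageModel = configuration
    return request
}
```

The returned request is then passed to `recognitionTask(with:resultHandler:)` like any other request.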
References
• Advances in Speech Recognition
  https://developer.apple.com/videos/play/wwdc2019/256
• Customize on-device speech recognition
  https://developer.apple.com/videos/play/wwdc2023/10101
• Recognizing speech in live audio
  https://developer.apple.com/documentation/speech/recognizing-speech-in-live-audio