Basics of Speech Recognition with the Speech Framework
Masashi
January 29, 2025
Presented at Mobile Study Group: Wantedly × teamLab × Sansan #18 (Introducing New Technologies)
https://sansan.connpass.com/event/338706/
Transcript
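Basics of Speech Recognition with the Speech Framework
Masashi Kawabe, NOT A HOTEL Inc., Smart Home team
Software engineer
Home Controller: The rooms have no switches or remote controls; everything is operated from the Home Controller. At any NOT A HOTEL anywhere in the world, you can operate every device without hesitation, just as you would at home.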
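Speech Framework
• The standard speech recognition framework on iOS
• Can recognize recorded audio files as well as live speech from microphone input in real time
• Supports Japanese
• Available on iOS 10.0 and later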
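As a rough illustration of the first mode listed above (recognizing a recorded audio file), the sketch below requests authorization and then transcribes a file. It is not taken from the slides: `transcribeFile`, `TranscriptionError`, and the `ja-JP` locale are assumptions made for the example, and the app is assumed to declare `NSSpeechRecognitionUsageDescription` in its Info.plist.

```swift
import Foundation
import Speech

/// Hypothetical error type for this sketch.
enum TranscriptionError: Error {
    case recognizerUnavailable
}

/// Hypothetical helper: asks for permission, then transcribes a recorded file.
func transcribeFile(at fileURL: URL,
                    completion: @escaping (Result<String, Error>) -> Void) {
    SFSpeechRecognizer.requestAuthorization { status in
        // The locale is an assumption; use whichever language the audio contains.
        guard status == .authorized,
              let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "ja-JP")),
              recognizer.isAvailable else {
            completion(.failure(TranscriptionError.recognizerUnavailable))
            return
        }

        let request = SFSpeechURLRecognitionRequest(url: fileURL)
        // Only the final result matters for a finished recording.
        request.shouldReportPartialResults = false

        _ = recognizer.recognitionTask(with: request) { result, error in
            if let error = error {
                completion(.failure(error))
                return
            }
            if let result = result, result.isFinal {
                completion(.success(result.bestTranscription.formattedString))
            }
        }
    }
}
```

A caller would pass the URL of a bundled or recorded audio file and handle the `Result` in the completion closure, which may be invoked off the main thread.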
Minimal implementation of speech recognition with the Speech Framework
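The code shown on this slide is not part of the transcript. The following is a minimal sketch of live microphone recognition in the spirit of Apple's "Recognizing speech in live audio" article, not the speaker's implementation. The class name `LiveSpeechRecognizer`, its method names, and the `ja-JP` locale are hypothetical, and the sketch assumes speech recognition and microphone permissions have already been granted.

```swift
import AVFoundation
import Speech

/// Hypothetical error type for this sketch.
enum LiveRecognitionError: Error {
    case recognizerUnavailable
}

/// Hypothetical wrapper around AVAudioEngine and SFSpeechRecognizer.
final class LiveSpeechRecognizer {
    private let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "ja-JP"))
    private let audioEngine = AVAudioEngine()
    private var request: SFSpeechAudioBufferRecognitionRequest?
    private var task: SFSpeechRecognitionTask?

    /// Starts streaming microphone audio into a recognition request.
    func start(onText: @escaping (String) -> Void) throws {
        guard let recognizer = recognizer, recognizer.isAvailable else {
            throw LiveRecognitionError.recognizerUnavailable
        }
        let request = SFSpeechAudioBufferRecognitionRequest()
        self.request = request

        #if os(iOS)
        // Configure the shared audio session for recording.
        let session = AVAudioSession.sharedInstance()
        try session.setCategory(.record, mode: .measurement, options: .duckOthers)
        try session.setActive(true, options: .notifyOthersOnDeactivation)
        #endif

        // Forward every captured buffer to the recognition request.
        let inputNode = audioEngine.inputNode
        let format = inputNode.outputFormat(forBus: 0)
        inputNode.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
            request.append(buffer)
        }
        audioEngine.prepare()
        try audioEngine.start()

        task = recognizer.recognitionTask(with: request) { [weak self] result, error in
            if let result = result {
                onText(result.bestTranscription.formattedString)
            }
            // Tear down once recognition fails or the final result arrives.
            if error != nil || result?.isFinal == true {
                self?.stop()
            }
        }
    }

    /// Stops capturing audio and tells the request that no more audio will follow.
    func stop() {
        audioEngine.stop()
        audioEngine.inputNode.removeTap(onBus: 0)
        request?.endAudio()
        request = nil
        task = nil
    }
}
```

Keeping the request and task as properties lets the caller end recognition explicitly with `stop()`, for example when the user releases a push-to-talk button.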
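Getting partial results of speech recognition
• The shouldReportPartialResults property of SFSpeechRecognitionRequest controls whether partial (intermediate) recognition results are reported
• The default is true (partial results are reported)
• If only the final recognition result is needed, set shouldReportPartialResults to false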
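The property described above can be sketched as follows. `makeRequest(finalOnly:)` and `handle(_:)` are hypothetical names introduced for illustration; only `shouldReportPartialResults`, `isFinal`, and `bestTranscription` come from the Speech framework.

```swift
import Speech

/// Hypothetical factory: `finalOnly` is an illustrative parameter name.
func makeRequest(finalOnly: Bool) -> SFSpeechAudioBufferRecognitionRequest {
    let request = SFSpeechAudioBufferRecognitionRequest()
    // The framework default is true; false suppresses intermediate results.
    request.shouldReportPartialResults = !finalOnly
    return request
}

/// Hypothetical result handler that tells partial and final results apart.
func handle(_ result: SFSpeechRecognitionResult) {
    let text = result.bestTranscription.formattedString
    if result.isFinal {
        print("final: \(text)")
    } else {
        // Only reached when shouldReportPartialResults is true.
        print("partial: \(text)")
    }
}
```

With `finalOnly: false` the handler would typically be called repeatedly as the transcription is refined; with `finalOnly: true` it should only see the final result.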
Difference in behavior with and without partial results
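On-device speech recognition
• For privacy, speech recognition can be performed entirely on the device
• Recognition works even without a network connection
• However, accuracy is lower than with server-based recognition
• Server-based recognition has drawbacks of its own: a limit on the maximum duration of recognition and a limit on the number of requests per day
• Enabled with the requiresOnDeviceRecognition property of SFSpeechRecognitionRequest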
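A minimal sketch of enabling on-device recognition, assuming the recognizer for the chosen locale may or may not support it. The helper name `enableOnDeviceRecognition` is hypothetical; `supportsOnDeviceRecognition` and `requiresOnDeviceRecognition` are the framework properties (iOS 13 and later).

```swift
import Speech

/// Hypothetical helper: opts a request into on-device recognition when the
/// recognizer supports it. Returns true if on-device recognition was enabled.
@available(iOS 13, macOS 10.15, *)
@discardableResult
func enableOnDeviceRecognition(for request: SFSpeechRecognitionRequest,
                               using recognizer: SFSpeechRecognizer) -> Bool {
    // Not every locale or device supports on-device recognition.
    guard recognizer.supportsOnDeviceRecognition else { return false }
    // Audio for this request is then not sent over the network.
    request.requiresOnDeviceRecognition = true
    return true
}
```

Checking support first is a design choice for the sketch: it lets the caller fall back to server-based recognition, or surface an error, instead of starting a request that cannot be served on the device.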
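Customizing the language model
• Customizing the language model used for speech recognition can raise recognition accuracy for a specific use case
• Available on iOS 17.0 and later
• Using a customized language model requires on-device speech recognition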
Creating a customized language model
Creating a customized language model
• PhraseCount objects register exact phrases in the custom model
• This makes specific phrases easier to recognize
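Registering phrases with `PhraseCount` might look like the sketch below, modeled on the pattern in the WWDC23 session "Customize on-device speech recognition". The function name, the `en_US` locale, the identifier `com.example.smarthome`, the phrases, and the counts are all assumptions for illustration; whether a given locale supports custom language models is not confirmed here.

```swift
import Foundation
import Speech

/// Hypothetical helper: builds training data from exact phrases and
/// exports it to `outputURL` for later use by the app.
@available(iOS 17, macOS 14, *)
func exportPhraseData(to outputURL: URL) async throws {
    let data = SFCustomLanguageModelData(
        locale: Locale(identifier: "en_US"),     // assumed locale
        identifier: "com.example.smarthome",     // assumed identifier
        version: "1.0"
    ) {
        // `count` weights how often the phrase appears in the training data.
        SFCustomLanguageModelData.PhraseCount(
            phrase: "Turn on the living room lights", count: 10)
        SFCustomLanguageModelData.PhraseCount(
            phrase: "Close the bedroom curtains", count: 10)
    }
    // Writes the training data file consumed when preparing the model.
    try await data.export(to: outputURL)
}
```

In the WWDC session this kind of data generation is run ahead of time, and the exported file is then shipped with or downloaded by the app.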
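Creating a customized language model
• When an app uses uncommon words such as specialized terminology, pairs of a term's spelling and pronunciation can be defined and registered in the custom model
• Pronunciations are written in X-SAMPA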
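A spelling and pronunciation pair can be sketched with `CustomPronunciation`. The term "Winawer" and its X-SAMPA string follow the example used in the WWDC23 session; the function name, locale, identifier, and phrase are assumptions, and the set of accepted phonemes depends on the locale.

```swift
import Foundation
import Speech

/// Hypothetical helper: builds training data that teaches the model an
/// uncommon term together with a phrase that uses it.
@available(iOS 17, macOS 14, *)
func makeTermData() -> SFCustomLanguageModelData {
    SFCustomLanguageModelData(
        locale: Locale(identifier: "en_US"),     // assumed locale
        identifier: "com.example.smarthome",     // assumed identifier
        version: "1.0"
    ) {
        // grapheme: how the term is spelled in transcriptions.
        // phonemes: its pronunciation as an X-SAMPA string.
        SFCustomLanguageModelData.CustomPronunciation(
            grapheme: "Winawer", phonemes: ["w I n aU @r"])
        // A phrase containing the term shows the model where it occurs.
        SFCustomLanguageModelData.PhraseCount(
            phrase: "Play the Winawer variation", count: 10)
    }
}
```

The returned data would be exported with `export(to:)` in the same way as phrase-only training data.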
Using a customized language model
• The custom language model you created is used by setting it on the customizedLanguageModel property of SFSpeechRecognitionRequest
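Putting the pieces together, a request that uses a custom model might be prepared as in the sketch below, again following the pattern from the WWDC23 session. The function name, the cache file names `LM` and `Vocab`, and both parameters are assumptions; `trainingDataURL` is expected to point at a file produced by `SFCustomLanguageModelData.export(to:)`.

```swift
import Foundation
import Speech

/// Hypothetical helper: prepares a custom language model from exported
/// training data and returns a request configured to use it.
@available(iOS 17, macOS 14, *)
func makeCustomizedRequest(trainingDataURL: URL,
                           clientIdentifier: String) async throws
    -> SFSpeechAudioBufferRecognitionRequest {
    // Where the prepared model and vocabulary files are written (assumed paths).
    let caches = FileManager.default.urls(for: .cachesDirectory,
                                          in: .userDomainMask)[0]
    let configuration = SFSpeechLanguageModel.Configuration(
        languageModel: caches.appendingPathComponent("LM"),
        vocabulary: caches.appendingPathComponent("Vocab"))

    // Preparation can take a while, so run it off the main thread.
    try await SFSpeechLanguageModel.prepareCustomLanguageModel(
        for: trainingDataURL,
        clientIdentifier: clientIdentifier,
        configuration: configuration)

    let request = SFSpeechAudioBufferRecognitionRequest()
    // Customized models only work with on-device recognition.
    request.requiresOnDeviceRecognition = true
    request.customizedLanguageModel = configuration
    return request
}
```

The returned request can then be fed microphone buffers and passed to `recognitionTask(with:resultHandler:)` like any other request.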
References
• Advances in Speech Recognition
  https://developer.apple.com/videos/play/wwdc2019/256
• Customize on-device speech recognition
  https://developer.apple.com/videos/play/wwdc2023/10101
• Recognizing speech in live audio
  https://developer.apple.com/documentation/speech/recognizing-speech-in-live-audio