Basics of Speech Recognition with the Speech Framework
Masashi
January 29, 2025
Presented at Mobile Study Group: Wantedly × teamLab × Sansan #18 (Adopting New Technologies)
https://sansan.connpass.com/event/338706/
Transcript
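Basics of Speech Recognition with the Speech Framework
Masashi Kawabe, Software Engineer, Smart Home team, NOT A HOTEL Inc.
Home Controller
• The rooms have no wall switches or remote controls; everything is operated from Home Controller.
• At any NOT A HOTEL in the world, guests can operate every device without confusion, just as they would at home.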
Speech Framework
• The standard speech recognition framework on iOS
• Recognizes pre-recorded audio files as well as real-time audio from microphone input
• Supports Japanese
• Available on iOS 10.0 and later
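As a rough illustration of the pre-recorded file case, the sketch below transcribes an audio file with SFSpeechURLRecognitionRequest. It is not taken from the slides: the `FileTranscriber` type, the `FileTranscriptionError` cases, and the "ja-JP" locale are assumptions made for this example, and the app is assumed to declare NSSpeechRecognitionUsageDescription in its Info.plist.

```swift
import Speech

// Hypothetical error type for this sketch.
enum FileTranscriptionError: Error {
    case notAuthorized
    case recognizerUnavailable
}

// Hypothetical helper; the recognizer and task are stored so they outlive the call.
final class FileTranscriber {
    // "ja-JP" is an assumed locale; the initializer returns nil for unsupported locales.
    private let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "ja-JP"))
    private var task: SFSpeechRecognitionTask?

    func transcribe(fileAt fileURL: URL,
                    completion: @escaping (Result<String, Error>) -> Void) {
        // Ask the user for speech recognition permission before starting.
        SFSpeechRecognizer.requestAuthorization { [weak self] status in
            guard let self else { return }
            guard status == .authorized else {
                completion(.failure(FileTranscriptionError.notAuthorized))
                return
            }
            guard let recognizer = self.recognizer, recognizer.isAvailable else {
                completion(.failure(FileTranscriptionError.recognizerUnavailable))
                return
            }

            let request = SFSpeechURLRecognitionRequest(url: fileURL)
            // Only the final transcription is needed for a file.
            request.shouldReportPartialResults = false

            self.task = recognizer.recognitionTask(with: request) { result, error in
                if let error {
                    completion(.failure(error))
                } else if let result, result.isFinal {
                    completion(.success(result.bestTranscription.formattedString))
                }
            }
        }
    }
}
```

A caller would keep a `FileTranscriber` instance alive (for example as a property) and pass the URL of a bundled or recorded audio file.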
Minimal implementation of speech recognition with the Speech Framework
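The code on this slide did not survive in the transcript. The following is a minimal sketch of live microphone recognition, loosely following Apple's "Recognizing speech in live audio" article listed in the references. The `LiveSpeechRecognizer` class name, the "ja-JP" locale, and the buffer size are assumptions; authorization is assumed to have been granted already, and the app needs both NSSpeechRecognitionUsageDescription and NSMicrophoneUsageDescription in its Info.plist.

```swift
import AVFoundation
import Speech

// Hypothetical wrapper type for this sketch (iOS only, because of AVAudioSession).
final class LiveSpeechRecognizer {
    private let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "ja-JP"))
    private let audioEngine = AVAudioEngine()
    private var request: SFSpeechAudioBufferRecognitionRequest?
    private var task: SFSpeechRecognitionTask?

    /// Starts recognition and reports each transcription update through `onText`.
    func start(onText: @escaping (String) -> Void) throws {
        // Configure the shared audio session for recording.
        let session = AVAudioSession.sharedInstance()
        try session.setCategory(.record, mode: .measurement, options: .duckOthers)
        try session.setActive(true, options: .notifyOthersOnDeactivation)

        let request = SFSpeechAudioBufferRecognitionRequest()
        self.request = request

        // Feed microphone buffers into the recognition request.
        let inputNode = audioEngine.inputNode
        let format = inputNode.outputFormat(forBus: 0)
        inputNode.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
            request.append(buffer)
        }
        audioEngine.prepare()
        try audioEngine.start()

        task = recognizer?.recognitionTask(with: request) { [weak self] result, error in
            if let result {
                onText(result.bestTranscription.formattedString)
            }
            // Tear down when recognition fails or produces its final result.
            if error != nil || result?.isFinal == true {
                self?.stop()
            }
        }
    }

    /// Stops audio capture and tells the request that no more audio will arrive.
    func stop() {
        audioEngine.stop()
        audioEngine.inputNode.removeTap(onBus: 0)
        request?.endAudio()
        request = nil
        task = nil
    }
}
```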
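Getting partial speech recognition results
• The shouldReportPartialResults property of SFSpeechRecognitionRequest controls whether partial (intermediate) recognition results are reported
• The default is true (partial results are reported)
• If only the final recognition result is needed, set shouldReportPartialResults to false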
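A small sketch of how this property might be used. The helper names `makeRequest(finalResultOnly:)` and `handle(_:)` are hypothetical; only shouldReportPartialResults, isFinal, and bestTranscription come from the Speech framework.

```swift
import Speech

/// Builds a live-audio request that reports either every partial result
/// or only the final one (hypothetical helper for this sketch).
func makeRequest(finalResultOnly: Bool) -> SFSpeechAudioBufferRecognitionRequest {
    let request = SFSpeechAudioBufferRecognitionRequest()
    // true (the default) delivers intermediate transcriptions as the user speaks.
    request.shouldReportPartialResults = !finalResultOnly
    return request
}

/// Distinguishes partial updates from the final result inside a result handler.
func handle(_ result: SFSpeechRecognitionResult) {
    let text = result.bestTranscription.formattedString
    if result.isFinal {
        print("final:", text)
    } else {
        print("partial:", text)
    }
}
```

With partial results enabled, the handler runs repeatedly with a growing transcription; with them disabled, it should only see the final result.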
How behavior differs with and without partial results
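On-device speech recognition
• Recognition can run on the device, which helps protect privacy
• Recognition works even without a network connection
• However, accuracy is lower than with server-based recognition
• Server-based recognition has its own drawbacks: a limit on the maximum duration of a recognition session and a limit on the number of requests per day
• Enable it with the requiresOnDeviceRecognition property of SFSpeechRecognitionRequest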
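One way to enable on-device recognition is sketched below. The function name is hypothetical, and the check against supportsOnDeviceRecognition is an addition for safety that the slide does not mention; it reflects that on-device support depends on the device and locale.

```swift
import Speech

/// Enables on-device recognition when the recognizer supports it.
/// Returns true if the request was switched to on-device recognition.
/// (Hypothetical helper for this sketch.)
@available(iOS 13.0, macOS 10.15, *)
@discardableResult
func enableOnDeviceRecognitionIfSupported(on request: SFSpeechRecognitionRequest,
                                          recognizer: SFSpeechRecognizer) -> Bool {
    // Not every device and locale combination supports on-device recognition.
    guard recognizer.supportsOnDeviceRecognition else { return false }
    // Keep audio on the device; recognition then works without a network connection.
    request.requiresOnDeviceRecognition = true
    return true
}
```

When the function returns false, the request is left unchanged and recognition may use Apple's servers.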
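Customizing the language model
• Customizing the speech recognition language model can improve recognition accuracy for a specific use case
• Available on iOS 17.0 and later
• Using a customized language model requires on-device recognition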
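Creating a customized language model
• PhraseCount objects register exact phrases in the custom model
• This makes those specific phrases easier to recognize
• If the app uses uncommon words such as technical terms, define pairs of spelling and pronunciation and register them in the custom model
• Pronunciations are written in X-SAMPA format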
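A sketch of building and exporting custom language model data, modeled on the WWDC23 session "Customize on-device speech recognition" listed in the references. The identifier "com.example.HomeController", the sample phrase, and the function name are assumptions for illustration; the "Winawer" pronunciation is the example used in that session. The en_US locale is used here because the example pronunciation is English.

```swift
import Speech

/// Builds custom language model training data and writes it to `outputURL`.
/// (Hypothetical helper; identifier and phrase are illustrative only.)
@available(iOS 17.0, macOS 14.0, *)
func exportCustomLanguageModelData(to outputURL: URL) async throws {
    let data = SFCustomLanguageModelData(
        locale: Locale(identifier: "en_US"),
        identifier: "com.example.HomeController", // assumed identifier
        version: "1.0"
    ) {
        // Register an exact phrase; a higher count gives it more weight.
        SFCustomLanguageModelData.PhraseCount(phrase: "Turn on the sauna", count: 10)

        // Pair a spelling (grapheme) with its pronunciation in X-SAMPA.
        SFCustomLanguageModelData.CustomPronunciation(
            grapheme: "Winawer",
            phonemes: ["w I n aU @r"]
        )
    }

    // Write the training data to a file that can be shipped with the app.
    try await data.export(to: outputURL)
}
```

The training data can be generated ahead of time, for instance in a separate command-line tool, and the exported file bundled with the app.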
Using a customized language model
• To use the custom language model you created, set it on the customizedLanguageModel property of SFSpeechRecognitionRequest
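The sketch below shows one possible way to prepare the exported data and attach it to a request, again modeled on the WWDC23 session in the references. The function name, the "LM" and "Vocab" file names, and the choice of the caches directory are assumptions; newer SDKs may offer or prefer a different variant of the prepare call.

```swift
import Speech

/// Prepares a custom language model from exported training data and returns
/// a request configured to use it. (Hypothetical helper for this sketch.)
@available(iOS 17.0, macOS 14.0, *)
func makeCustomizedRequest(trainingDataURL: URL,
                           clientIdentifier: String) async throws
    -> SFSpeechAudioBufferRecognitionRequest {
    // Locations where the prepared model and vocabulary will be written (assumed names).
    let cacheDir = FileManager.default.urls(for: .cachesDirectory, in: .userDomainMask)[0]
    let configuration = SFSpeechLanguageModel.Configuration(
        languageModel: cacheDir.appendingPathComponent("LM"),
        vocabulary: cacheDir.appendingPathComponent("Vocab")
    )

    // Convert the exported training data into a model usable on this device.
    try await SFSpeechLanguageModel.prepareCustomLanguageModel(
        for: trainingDataURL,
        clientIdentifier: clientIdentifier,
        configuration: configuration
    )

    let request = SFSpeechAudioBufferRecognitionRequest()
    // A customized language model only takes effect with on-device recognition.
    request.requiresOnDeviceRecognition = true
    request.customizedLanguageModel = configuration
    return request
}
```

Preparation can take some time, so it is usually better to run it off the main thread before the user starts speaking.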
References
• Advances in Speech Recognition: https://developer.apple.com/videos/play/wwdc2019/256
• Customize on-device speech recognition: https://developer.apple.com/videos/play/wwdc2023/10101
• Recognizing speech in live audio: https://developer.apple.com/documentation/speech/recognizing-speech-in-live-audio