Useful (Python) Packages and Software for Speech Information Processing
Akira Tamamori
December 30, 2020
Split off from the Tokyo BISH Bash slides as a standalone deck.
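Transcript

Useful (Python) Packages and Software for Speech Information Processing and Voice Conversion
Note: impressions of use and similar remarks include my personal opinions.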
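
sox / pysox
• Easy format conversion and similar tasks from the command line
• Format conversion (wav to mp3, etc.)
• Concatenation, mixing, and trimming are also possible
• Batch processing is easy too (shell scripts, etc.)
• A Python wrapper, pysox, is also available
• Installation
  • brew install sox, etc.
  • pip install sox (this is how pysox is installed)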
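
librosa (really recommended)
• A package with a full set of modules that are handy for speech and music analysis
• The thorough official manual and tutorials are a big help
• Features I personally use often
  • Waveform display, spectrogram display
  • Acoustic feature extraction (log-mel spectrogram)
• Installation: pip install librosa
• Official page: https://librosa.org/librosa/index.html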
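
PyWorld
• A vocoder package for speech analysis and resynthesis
• Decomposes speech into timbre, pitch, and breathiness components, then resynthesizes it
• A Python wrapper for the C++ implementation
• Handy for speech feature extraction ⇒ yields a better-quality spectral envelope than PySPTK (described later)
• Installation: pip install pyworld
• Official page: https://github.com/JeremyCCHsu/Python-Wrapper-for-World-Vocoder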
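
PyAudio
• A handy package for stream recording and playback
• Usable for real-time audio input and output
• Real-time voice conversion with Python and the like also becomes possible
• Installation
  • pip install pyaudio (requires portaudio, e.g., brew install portaudio)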
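
Combining PyAudio and PyWorld
• A simple voice changer
  • An improved version of a simple voice changer script: PyQt5 sliders add real-time adjustment of pitch and formants (from my blog)
    https://tam5917.hatenablog.com/entry/2019/04/30/213321
• babiniku
  • A project aiming to let you become a 2D character for Zoom streaming
  • scripts/voice_converter.py is a nicely working voice changer → it includes bug fixes for the sample script from my blog
    https://github.com/peisuke/babiniku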
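
PySPTK
• A Python wrapper for SPTK, the speech signal processing toolkit
• SPTK itself is a collection of Linux commands
• Handy for acoustic feature extraction
• Speech analysis and synthesis is also possible, but the quality itself is better with WORLD
• Installation: pip install pysptk
• Official page: https://pysptk.readthedocs.io/en/latest/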
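
nnmnkwii [nanamin kawaii]
• A package that gathers modules useful for DNN-based speech synthesis
• Geared more toward research use
• Covers a full set of classes for preprocessing and acoustic feature extraction
• Very useful when, for example, reimplementing a paper
• Installation: pip install nnmnkwii
• Official page: https://r9y9.github.io/nnmnkwii/stable/index.html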
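
Pydub
• A package that gathers modules handy for waveform editing
• Supports a wide range of file formats (wav, mp3, mp4, wma, aac, ...)
• Features: slicing, splitting, mixing, fade in/out, silence insertion, and more
• Rumor has it that pysox is faster for some features (unconfirmed)
• Installation: pip install pydub
• Official page: http://pydub.com/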
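
sprocket
• A toolkit (not a package) for statistical voice conversion
• Geared more toward research use (MIT License)
• Ideal for building a research "baseline"
• Official page: https://github.com/k2kobayashi/sprocket
• Explanatory article, "Introduction to Statistical Voice Conversion Software" (in Japanese): https://www.jstage.jst.go.jp/article/isciesci/62/2/62_69/_article/-char/ja/
• Tutorial (slides & notebook): https://github.com/kan-bayashi/INTERSPEECH19_TUTORIAL

Audacity (reassuring to have installed)
• A free, multi-platform waveform editor
• Rich support for adding sound effects
• Official page: https://www.audacityteam.org/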

Bonus: recommended packages for people who want to do audio processing on the GPU

torchaudio
• An audio processing library officially supported by PyTorch
• Works well with PyTorch-based deep learning models (naturally)
• Official page: https://pytorch.org/audio/stable/index.html

tf.signal
• A set of audio processing functions officially supported by TensorFlow
• Works well with TF-based deep learning models (naturally)
• FFT/iFFT, DCT, MDCT, STFT, etc.
• Official page: https://www.tensorflow.org/api_docs/python/tf/signal
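
The STFT and its inverse from tf.signal can be sketched as follows. The input is a synthetic tone, and the frame length and step are illustrative assumptions.

```python
import numpy as np
import tensorflow as tf

sample_rate = 16000
t = np.arange(sample_rate, dtype=np.float32) / sample_rate
signal = tf.constant(0.5 * np.sin(2 * np.pi * 440.0 * t), dtype=tf.float32)

frame_length, frame_step = 1024, 256

# Forward STFT: complex tensor of shape (n_frames, fft_length // 2 + 1).
spec = tf.signal.stft(signal, frame_length=frame_length,
                      frame_step=frame_step, fft_length=frame_length)
magnitude = tf.abs(spec)

# Inverse STFT with the matching synthesis window.
recon = tf.signal.inverse_stft(
    spec, frame_length=frame_length, frame_step=frame_step,
    fft_length=frame_length,
    window_fn=tf.signal.inverse_stft_window_fn(frame_step),
)
print(spec.shape, recon.shape)
```

Because the forward transform does not pad the end of the signal by default, the reconstruction is slightly shorter than the input.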

torchlibrosa
• Runs librosa on the GPU with PyTorch as the backend
• Installation: pip install torchlibrosa
• Official page: https://github.com/qiuqiangkong/torchlibrosa

kapre
• Audio processing with Keras (or rather TF) as the backend
• STFT, iSTFT, mel spectrogram, etc.
• CQT and the like are not available
• Official page: https://github.com/keunwoochoi/kapre
• Installation: pip install kapre
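
nnAudio
• Runs STFT and similar transforms on the GPU with PyTorch as the backend
• STFT, inverse STFT, CQT, and other commonly used feature extractors are all available
• Official page: https://github.com/KinWaiCheuk/nnAudio

nnAudio (continued)
• Comparison table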