
drummernet

Zhang Yixiao
January 17, 2020

Transcript

  1. Existing problems
     • Large-scale annotated data is lacking
     • Solution 1: use synthetic data
     • Solution 2: use unlabeled data
     • Mark Cartwright and Juan Pablo Bello. Increasing drum transcription vocabulary using data synthesis. Proc. of the 21st Int. Conference on Digital Audio Effects (DAFx-18), Aveiro, Portugal, 2018.
     • Chih-Wei Wu and Alexander Lerch. Automatic drum transcription using the student-teacher learning paradigm with unlabeled music data. In Proc. Int. Soc. Music Inf. Retrieval Conf., pages 613–620, 2017.
     • However, the models above still combine supervised training with teacher-student learning
     • Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531, 2015.
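  2. RNN/Sparsemax/Unsampler
     • Three GRU layers
       • {time-axis, bi-direction, 100 channel}
       • {time-axis, uni-direction, 50 channel}
       • {instrument-axis, uni-direction, K}, where K is the number of drum instruments
     • Sparsemax: a "sparse version" of softmax that allows individual outputs to be exactly 0
       • One is applied over non-overlapping windows along the time-axis, one along the instrument-axis
       • The two are computed in parallel and their results are multiplied element-wise
     • Unsampler
       • Upsamples by zero insertion, restoring the length from N/16 to N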
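The sparsemax and zero-insertion steps on this slide can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the code behind the slides: the function names (`sparsemax`, `parallel_sparsemax`, `zero_insert_upsample`) and the window length are hypothetical, the scores are stored as a list of K rows (one per instrument) with one value per frame, and sparsemax is implemented by its standard definition as the Euclidean projection onto the probability simplex.

```python
def sparsemax(z):
    """Project a list of scores onto the probability simplex.

    Unlike softmax, entries below the threshold tau come out exactly 0.
    """
    z_sorted = sorted(z, reverse=True)
    cumsum, k_z, support_sum = 0.0, 0, 0.0
    for k, value in enumerate(z_sorted, start=1):
        cumsum += value
        if 1 + k * value > cumsum:  # the k-th largest score is in the support
            k_z, support_sum = k, cumsum
    tau = (support_sum - 1) / k_z
    return [max(v - tau, 0.0) for v in z]


def parallel_sparsemax(scores, window):
    """Apply sparsemax along both axes independently and multiply the results.

    scores: K rows (instruments), each with T frame values.
    window: hypothetical length of the non-overlapping time windows.
    """
    n_inst, n_frames = len(scores), len(scores[0])

    # Branch 1: sparsemax inside each non-overlapping window along time.
    over_time = []
    for row in scores:
        out = []
        for start in range(0, n_frames, window):
            out.extend(sparsemax(row[start:start + window]))
        over_time.append(out)

    # Branch 2: sparsemax across instruments at every frame.
    over_inst = [sparsemax([scores[i][t] for i in range(n_inst)])
                 for t in range(n_frames)]

    # Element-wise product of the two branches.
    return [[over_time[i][t] * over_inst[t][i] for t in range(n_frames)]
            for i in range(n_inst)]


def zero_insert_upsample(row, factor=16):
    """Restore length N from N/factor by inserting zeros between samples."""
    out = [0.0] * (len(row) * factor)
    out[::factor] = row
    return out


print(sparsemax([1.0, 0.5, -1.0]))  # [0.75, 0.25, 0.0]
```

Because each branch can zero out an entry on its own, the product is non-zero only where both the time-axis and the instrument-axis branch keep an activation.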
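  3. Ablation Study
     • Sparsemax
       • Softmax performs worse
       • Applying Softmax sequentially is worse than applying it in parallel
       • It produces many false positives
     • CQT
       • Replacing it with a mel spectrogram (MEL) or a short-time Fourier transform (STFT) degrades performance
     • Onset Enhancement
       • No significant improvement, but it helps the loss decrease early in training
     • RNNs
       • Replacing them with three convolutional layers makes no significant difference. Is there little long-term dependency information?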