based on Binaural Inputs
By Bagus Tris Atmaja, NRP. 2410 201 006
Supervisors: Dr. D. Arifianto, Prof. T. Usagawa
Dept. of Engineering Physics, Faculty of Industrial Technology
one of the most important human senses.
• An early study of the cocktail party problem was presented by Helmholtz (1863).
• An early study of computational sound separation was proposed by P. Comon (Elsevier Sig. Proc., 1994).
• High-quality sound separation using Information Maximization (Infomax) was proposed by Bell & Sejnowski (Neural Comput., 1995).
• A recent study by Kim et al. (2011) proposed a more realistic model of binaural hearing.
separation is not an easy task. It mimics how the brain works (Wang, 2006).
• However, the current method (Kim, 2011) does not consider the spacing between the ears because of spatial aliasing.
• This research proposes FastICA with a binary mask.
• The proposed method was compared with the other methods on source segregation using the coherence criterion and the PESQ score.
some methods, including the proposed method, on the source separation problem for a signal enhancement task based on binaural inputs.
• Evaluate the results objectively by means of coherence and PESQ.
Applications:
• Speech recognition, telecommunication
• Hearing aids, machine sound separation, etc.
Known: x from m sensors and s from n sources
• ICA can be defined as finding s from x only.
• In this research, noise v is included to approximate the real problem: x(n) = A s(n) + v(n). If W = A^{-1} and s'(n) = y(n), then y(n) = W x(n).
• There are many methods to find W; in this research, ICA based on maximum likelihood was used.
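The mixing model on this slide can be illustrated with a minimal sketch. It assumes the usual linear ICA model x = A s + v with white Gaussian noise and the ideal unmixing matrix W = A^{-1}. The function names `mix` and `unmix`, the test signals, and the mixing matrix are hypothetical and chosen only for illustration; in practice A is unknown and W has to be estimated.

```python
import numpy as np

def mix(s, A, noise_std=0.0, seed=0):
    """Return x = A s + v, where v is white Gaussian noise (assumed noise model)."""
    rng = np.random.default_rng(seed)
    v = noise_std * rng.standard_normal((A.shape[0], s.shape[1]))
    return A @ s + v

def unmix(x, W):
    """Return y = W x, the estimate of the sources."""
    return W @ x

# Two hypothetical sources (rows) observed by two sensors.
t = np.linspace(0.0, 1.0, 1000)
s = np.vstack([np.sin(2 * np.pi * 5 * t),
               np.sign(np.sin(2 * np.pi * 3 * t))])
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])            # assumed mixing matrix
x = mix(s, A)                         # noise-free mixture
y = unmix(x, np.linalg.inv(A))        # ideal case: W = A^{-1}
```

In the noise-free case y equals s up to rounding. With `noise_std > 0` the estimate becomes s + W v, which is one reason the separation results depend on the noise level.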
(ICABM)
Binary mask → applies the mask as a binary weight matrix to the mixture in the time-frequency (T-F) domain
Mask → weights (filters) the mixture
Pros and cons: perceptually good, poor coherence
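A binary mask in the T-F domain can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in this research: the mask is set to 1 wherever the target estimate has a larger magnitude than the interferer estimate, and the helper names `stft`, `binary_mask`, and `apply_mask`, the threshold `tau`, and the frame sizes are all hypothetical.

```python
import numpy as np

def stft(x, frame=256, hop=128):
    """Short-time Fourier transform with a Hann window (frames x bins)."""
    win = np.hanning(frame)
    n_frames = 1 + (len(x) - frame) // hop
    frames = np.stack([x[i * hop:i * hop + frame] * win
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def binary_mask(Y_target, Y_interf, tau=1.0):
    """Mask is 1 where the target estimate dominates the interferer, else 0."""
    return (np.abs(Y_target) > tau * np.abs(Y_interf)).astype(float)

def apply_mask(X_mix, mask):
    """Weight (filter) the mixture spectrogram with the binary mask."""
    return mask * X_mix

# Toy example: a 500 Hz target and a 2000 Hz interferer sampled at 8 kHz.
fs = 8000
t = np.arange(fs) / fs
y1 = np.sin(2 * np.pi * 500 * t)      # stands in for the target estimate
y2 = np.sin(2 * np.pi * 2000 * t)     # stands in for the interferer estimate
X = stft(y1 + y2)
mask = binary_mask(stft(y1), stft(y2))
masked = apply_mask(X, mask)
```

With a 256-point frame at 8 kHz, the 500 Hz tone falls in bin 16 and the 2000 Hz tone in bin 64, so the mask keeps bin 16 and zeroes bin 64. A time-domain signal would be recovered by an inverse STFT with overlap-add, which is omitted here.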
Flow: Signals → Preprocessing → Processing
Steps: remove mean, whitening, PCA, FPICA
In FastICA, the separation matrix can be obtained by the following formula:
Pros and cons: perceptually poor, good coherence, not yet implemented in binaural hearing
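The steps on this slide can be sketched with the standard fixed-point update from Hyvarinen (1999), w+ = E{z g(w^T z)} - E{g'(w^T z)} w, followed by renormalization, where z is the whitened mixture. Whether this is exactly the formula shown on the slide is an assumption. The choice g = tanh, the symmetric decorrelation step, and the names `whiten`, `sym_decorrelate`, and `fastica` are likewise illustrative.

```python
import numpy as np

def whiten(x):
    """Remove the mean and whiten x (channels x samples) via PCA."""
    xc = x - x.mean(axis=1, keepdims=True)
    d, E = np.linalg.eigh(np.cov(xc))
    V = E @ np.diag(d ** -0.5) @ E.T          # whitening matrix
    return V @ xc, V

def sym_decorrelate(W):
    """Symmetric decorrelation: W <- (W W^T)^(-1/2) W."""
    d, E = np.linalg.eigh(W @ W.T)
    return E @ np.diag(d ** -0.5) @ E.T @ W

def fastica(x, n_iter=200, tol=1e-8, seed=0):
    """Symmetric fixed-point ICA with g = tanh. Returns (y, W_total)."""
    z, V = whiten(x)
    n, n_samples = z.shape
    rng = np.random.default_rng(seed)
    W = sym_decorrelate(rng.standard_normal((n, n)))
    for _ in range(n_iter):
        g = np.tanh(W @ z)
        g_prime = 1.0 - g ** 2
        # Row i: E{z g(w_i^T z)} - E{g'(w_i^T z)} w_i
        W_new = (g @ z.T) / n_samples - np.diag(g_prime.mean(axis=1)) @ W
        W_new = sym_decorrelate(W_new)
        converged = np.max(np.abs(np.abs(np.diag(W_new @ W.T)) - 1.0)) < tol
        W = W_new
        if converged:
            break
    return W @ z, W @ V                       # W @ V acts on the centered mixture

# Toy example with two hypothetical sources and an assumed mixing matrix.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 5000)
s = np.vstack([np.sin(2 * np.pi * 7 * t), rng.uniform(-1.0, 1.0, t.size)])
A = np.array([[1.0, 0.6], [0.4, 1.0]])
y, W_total = fastica(A @ s)
corr = np.abs(np.corrcoef(np.vstack([y, s]))[:2, 2:])
```

Each row of `corr` should contain one entry close to 1, meaning each output matches one source. As with any ICA method, the order and sign of the recovered components are arbitrary.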
Criterion → how well one signal correlates with another at each frequency. Value/score: 0 ~ 1
• PESQ → Perceptual Evaluation of Speech Quality. Value/score: -0.5 ~ 4.5
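One common way to compute such a coherence value is the magnitude-squared coherence, C_xy(f) = |P_xy(f)|^2 / (P_xx(f) P_yy(f)), with the spectra averaged over overlapping frames. The sketch below assumes this definition. The function name `msc`, the frame sizes, and the idea of averaging over frequency to obtain a single score are illustrative assumptions rather than the exact procedure behind the reported results.

```python
import numpy as np

def msc(x, y, frame=256, hop=128):
    """Magnitude-squared coherence per frequency bin (values in 0..1)."""
    win = np.hanning(frame)
    n_frames = 1 + (min(len(x), len(y)) - frame) // hop

    def spectra(sig):
        frames = np.stack([sig[i * hop:i * hop + frame] * win
                           for i in range(n_frames)])
        return np.fft.rfft(frames, axis=1)

    X, Y = spectra(x), spectra(y)
    Pxx = np.mean(np.abs(X) ** 2, axis=0)     # auto-spectrum of x
    Pyy = np.mean(np.abs(Y) ** 2, axis=0)     # auto-spectrum of y
    Pxy = np.mean(X * np.conj(Y), axis=0)     # cross-spectrum
    return np.abs(Pxy) ** 2 / (Pxx * Pyy + 1e-12)

# Toy example: a signal against itself and against independent noise.
rng = np.random.default_rng(0)
a = rng.standard_normal(16000)
b = rng.standard_normal(16000)
c_same = msc(a, a)     # close to 1 in every bin
c_indep = msc(a, b)    # small on average
```

A single score could then be taken as, for example, `msc(estimate, reference).mean()`. SciPy provides a library version of this estimate as `scipy.signal.coherence`.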
Various SIR
Result based on Coherence Criterion

Method       Signal-to-Interference Ratio (SIR)
             -20 dB   -10 dB   0 dB     10 dB    20 dB
ICA          0.598    0.598    0.597    0.633    0.633
ICABM        0.603    0.608    0.394    0.325    0.325
PDCW         0.513    0.500    0.409    0.213    0.213
FastICA      0.598    0.598    0.597    0.632    0.632
FastICABM    0.631    0.609    0.315    0.418    0.471
Various SIR
Result based on PESQ Score

Method       Signal-to-Interference Ratio (SIR)
             -20 dB   -10 dB   0 dB     10 dB    20 dB
ICA          1.180    1.180    1.184    1.378    1.378
ICABM        1.185    2.077    1.548    0.692    0.692
PDCW         1.169    1.167    1.190    0.991    0.991
FastICA      1.180    1.180    1.184    1.379    1.379
FastICABM    1.268    2.112    1.282    0.935    1.268
can be separated by using several methods; in this research we use ICA, ICABM, PDCW, and FastICA. We propose FastICA with a binary mask to address the shortcomings of ICABM and FastICA. This method performs best at SIRs of -20 dB and -10 dB. The data included noise.
• The coherence criterion and the PESQ score were used to evaluate the separation results. Coherence is good for extracting the characteristics of the estimated signal, while PESQ is suitable for perceptual applications.
P.862, “Perceptual evaluation of speech quality (PESQ): An objective method for end-to-end speech quality assessment of narrow-band telephone networks and speech codecs,” 2001.
• D. Wang and G. J. Brown, eds., Computational Auditory Scene Analysis: Principles, Algorithms, and Applications. John Wiley and Sons.
• C. Kim, K. Kumar, B. Raj, and R. M. Stern, “Signal separation for robust speech recognition based on phase difference information obtained in the frequency domain,” INTERSPEECH, pp. 2495–2498, 2009.
• A. Hyvarinen and E. Oja, “Independent component analysis: Algorithms and applications,” Neural Networks, vol. 13(4-5), pp. 411–430, 2000.
• A. Hyvarinen, “Independent component analysis,” vol. 2, pp. 94–128, 2001.
• M. S. Pedersen, D. Wang, J. Larsen, and U. Kjems, “Two-microphone separation of speech mixtures,” IEEE Trans. on Neural Networks, vol. 19(3), pp. 475–492, 2008.
• B. T. Atmaja, T. Usagawa, Y. Chisaki, and D. Arifianto, “On performance of sound separation methods including binaural processors,” in Student Meeting of Acoustic Society of Japan, Kyushu-Chapter, 2011.
• A. Hyvarinen, “Fast and robust fixed-point algorithms for independent component analysis,” IEEE Trans. on Neural Networks, vol. 10(3), pp. 626–634, 1999.
• Etc.