Slide 1

Slide 1 text

Sound sources separation based on fundamental frequency estimation for speech and musical signal
B.T. Atmaja, D. Arifianto

Slide 2

Slide 2 text

Getaran & Akustik 2014
Today's talk...
● Cocktail Party Problem
● (Blind) Source Separation
● Fundamental Frequency Estimation
● Pitch & Common Amplitude Modulation
● Harmonic Selection
● Speech and Musical Signal
● Result & Future Research

Slide 3

Slide 3 text

Motivation: Cocktail Party
Same as x_L, with a different level (interaural level difference, ILD) and arrival time (interaural time difference, ITD)

Slide 4

Slide 4 text

Cocktail Party Problem
● One of our most important faculties is our ability to listen to, and follow, one speaker in the presence of others. . . we may call it "the cocktail party problem." No machine has yet been constructed to do just that.
– Cherry, E. Colin (1953). "Some Experiments on the Recognition of Speech, with One and with Two Ears". The Journal of the Acoustical Society of America 25 (5): 975–79.

Slide 5

Slide 5 text

(Blind) Source Separation
● Known: X (signals from m sensors) and S (n sources)
● Blind source separation is the task of recovering S from X alone (blindly); v(n) denotes noise
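One common way to write this problem, assumed here because the slide text does not spell out the model, is the linear instantaneous mixture. The m-by-n mixing matrix A is a hypothetical symbol introduced for illustration; x(n), s(n) and v(n) are the sensor, source and noise vectors at sample n.

```latex
\mathbf{x}(n) = \mathbf{A}\,\mathbf{s}(n) + \mathbf{v}(n),
\qquad \mathbf{x}(n) \in \mathbb{R}^{m},\quad
\mathbf{s}(n) \in \mathbb{R}^{n},\quad
\mathbf{A} \in \mathbb{R}^{m \times n}
```

Under this assumption, "blind" means that both A and s(n) are unknown and only x(n) is observed.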

Slide 6

Slide 6 text

Fundamental Frequency
Speech signal in the time domain and its harmonics (Dhany Arifianto, 2005)
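A minimal sketch of fundamental frequency estimation on a synthetic harmonic signal. It assumes a plain autocorrelation peak picker, which is not necessarily the method behind the figure; the function name, search range and test tone are illustrative choices.

```python
import math


def estimate_f0(signal, sample_rate, f_min=60.0, f_max=500.0):
    """Return the F0 (Hz) whose period maximizes the raw autocorrelation."""
    lag_min = int(sample_rate / f_max)
    lag_max = int(sample_rate / f_min)
    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Unnormalized autocorrelation at this lag.
        score = sum(signal[i] * signal[i + lag]
                    for i in range(len(signal) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


sample_rate = 8000
true_f0 = 200.0
# Synthetic "voiced" signal: fundamental plus two weaker harmonics.
signal = [
    sum(math.sin(2 * math.pi * h * true_f0 * n / sample_rate) / h
        for h in (1, 2, 3))
    for n in range(800)
]
estimated_f0 = estimate_f0(signal, sample_rate)
print(estimated_f0)
```

The raw autocorrelation favours shorter lags, which helps avoid picking a multiple of the true period in this clean example; real speech usually needs normalization and voicing detection on top.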

Slide 7

Slide 7 text

Multipitch Estimation
Block diagram of iterative multipitch estimation based on harmonicity and spectral smoothness (Klapuri, 2003)

Slide 8

Slide 8 text

Source Separation Process
● Pitch & Common Amplitude Modulation
● Harmonic Selection
Diagram labels: Mixture sounds, Multipitch Estimation, ?, S1, S2, S3

Slide 9

Slide 9 text

Pitch & Common Amplitude Modulation
System diagram of PCAM (Y. Li et al., 2009)
● Harmonics of the same source have correlated amplitude envelopes, and the change in phase of a harmonic is related to the instrument's pitch
● The amplitude envelopes of different harmonics of the same source tend to be similar; this is known as common amplitude modulation
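The common amplitude modulation observation can be illustrated with a small sketch. It is not the PCAM system of Li et al.; it only checks, on synthetic tones, that two harmonics sharing one envelope correlate more strongly than harmonics from different sources. The envelopes, frequencies and the frame-wise RMS envelope estimate are all assumptions made for illustration.

```python
import math


def frame_rms(signal, frame_len):
    """Crude amplitude envelope: RMS over non-overlapping frames."""
    return [
        math.sqrt(sum(x * x for x in signal[s:s + frame_len]) / frame_len)
        for s in range(0, len(signal) - frame_len + 1, frame_len)
    ]


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


sample_rate = 8000
t = [n / sample_rate for n in range(sample_rate)]  # one second

# Source A: decaying envelope (plucked-note-like), F0 = 200 Hz.
env_a = [math.exp(-3.0 * x) for x in t]
# Source B: slow tremolo envelope, F0 = 310 Hz.
env_b = [0.5 + 0.4 * math.sin(2 * math.pi * 4.0 * x) for x in t]

a_h1 = [e * math.sin(2 * math.pi * 200 * x) for e, x in zip(env_a, t)]
a_h2 = [0.5 * e * math.sin(2 * math.pi * 400 * x) for e, x in zip(env_a, t)]
b_h1 = [e * math.sin(2 * math.pi * 310 * x) for e, x in zip(env_b, t)]

frame_len = 200  # 25 ms frames
same_source = pearson(frame_rms(a_h1, frame_len), frame_rms(a_h2, frame_len))
different_source = pearson(frame_rms(a_h1, frame_len), frame_rms(b_h1, frame_len))
print(same_source, different_source)
```

A separation system can exploit this by borrowing the envelope of a clean harmonic to estimate the amplitude of a harmonic that overlaps with another source.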

Slide 10

Slide 10 text

Harmonic Selection
Block diagram of the separation process (T.W. Parsons, 1976)
Mixed speech can be separated by selecting the harmonics of the desired voice in the Fourier transform of the input
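The idea of selecting harmonics in the Fourier domain can be sketched as follows. This is a simplified illustration, not Parsons's full method: it assumes the target F0 is already known, that it falls exactly on a DFT bin, and that the two voices share no harmonics. All function names and signal parameters are hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (fine for short signals)."""
    n_pts = len(x)
    tw = [cmath.exp(-2j * math.pi * m / n_pts) for m in range(n_pts)]
    return [sum(x[n] * tw[(k * n) % n_pts] for n in range(n_pts))
            for k in range(n_pts)]


def idft_real(spectrum):
    """Inverse DFT, returning the real part of each sample."""
    n_pts = len(spectrum)
    tw = [cmath.exp(2j * math.pi * m / n_pts) for m in range(n_pts)]
    return [sum(spectrum[k] * tw[(k * n) % n_pts]
                for k in range(n_pts)).real / n_pts
            for n in range(n_pts)]


def harmonic_tone(f0, sample_rate, n_pts, n_harmonics=3):
    """Sum of the first harmonics of f0 with 1/h amplitudes."""
    return [sum(math.sin(2 * math.pi * h * f0 * n / sample_rate) / h
                for h in range(1, n_harmonics + 1))
            for n in range(n_pts)]


def select_harmonics(spectrum, f0_bin):
    """Keep only bins at integer multiples of f0_bin; zero the rest."""
    n_pts = len(spectrum)
    kept = []
    for k, value in enumerate(spectrum):
        folded = min(k, n_pts - k)  # fold negative-frequency bins
        kept.append(value if folded > 0 and folded % f0_bin == 0 else 0j)
    return kept


sample_rate, n_pts = 8000, 400           # 20 Hz per DFT bin
voice_a = harmonic_tone(200.0, sample_rate, n_pts)  # desired voice
voice_b = harmonic_tone(320.0, sample_rate, n_pts)  # interfering voice
mixture = [a + b for a, b in zip(voice_a, voice_b)]

f0_bin = round(200.0 * n_pts / sample_rate)          # bin of the target F0
estimate = idft_real(select_harmonics(dft(mixture), f0_bin))
separation_mse = sum((a - e) ** 2 for a, e in zip(voice_a, estimate)) / n_pts
print(separation_mse)
```

With real speech the F0 varies over time and harmonics of the two voices can overlap, so the selection has to be done frame by frame and overlapping peaks need separate handling.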

Slide 11

Slide 11 text

Result: Speech signal

Slide 12

Slide 12 text

Result: Music Signal
Figure: amplitude versus time (s). (a) Original signal (b) Estimated signal

Slide 13

Slide 13 text

Result: MSE
Type of signal    PCAM      HS
Speech signal     0.0002    0.0042
Musical signal    0.0039    0.0057
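The slides do not state how the MSE was computed. A minimal sketch, assuming the usual definition (mean of the squared sample-wise difference between the original and the estimated signal), with made-up sample values:

```python
def mean_squared_error(reference, estimate):
    """Mean of squared sample-wise differences between two signals."""
    if len(reference) != len(estimate):
        raise ValueError("signals must have the same length")
    if not reference:
        raise ValueError("signals must not be empty")
    return sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)


# Illustrative values only, not data from the talk.
original = [0.0, 0.5, -0.5, 0.25]
estimated = [0.0, 0.4, -0.6, 0.25]
error = mean_squared_error(original, estimated)
print(error)
```

A lower value means the estimated signal is closer to the original, so the score depends on the amplitude scale of the signals being compared.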

Slide 14

Slide 14 text

Conclusion
● Mixed sounds can be separated into their source components. In this paper we used pitch and common amplitude modulation (PCAM) and harmonic selection (HS); PCAM gave better results than harmonic selection.
● Musical sounds (single tone or multitone) are easier to separate than speech. The best result in this paper was achieved with the PCAM method, with an MSE of 0.0002.

Slide 15

Slide 15 text

Next... Multipitch Estimation Demodulation

Slide 16

Slide 16 text

Matur nuwun (thank you)