Super-Dirichlet Mixture Models using Differential Line Spectral Frequencies for Text-Independent Speaker Identification
2011 (English) In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2011, 2360-2363 p. Conference paper (Refereed)
A new text-independent speaker identification (SI) system is proposed. The system uses line spectral frequencies (LSFs) as an alternative feature set for capturing speaker characteristics. The boundary and ordering properties of the LSFs are taken into account, and the LSFs are transformed to the differential LSF (DLSF) space. Since dynamic information is useful for speaker recognition, we represent the dynamic information of the DLSFs by considering two neighbors of the current frame, one from the past frames and the other from the following frames. The current frame and its two neighbor frames are cascaded into a supervector. The statistical distribution of this supervector is modelled by the so-called super-Dirichlet mixture model, which is an extension of the Dirichlet mixture model. Compared to the conventional SI system, which uses mel-frequency cepstral coefficients and is based on the Gaussian mixture model, the proposed SI system shows a promising improvement.
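The feature pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so the details here are assumptions: the DLSFs are taken as successive differences of the ordered LSFs scaled by 1/π (so every element is positive and the elements sum to less than one), and the neighbor frames are taken at a fixed offset of one frame. The function names `lsf_to_dlsf` and `build_supervectors` and the toy LSF values are hypothetical, and the mixture model fitting itself is not shown.

```python
import math


def lsf_to_dlsf(lsf):
    """Map ordered LSFs in (0, pi) to normalized successive differences.

    Assumed definition (not stated in the abstract):
    d_1 = w_1 / pi, d_i = (w_i - w_{i-1}) / pi for i > 1.
    """
    if any(b <= a for a, b in zip(lsf, lsf[1:])):
        raise ValueError("LSFs must be strictly increasing")
    if lsf[0] <= 0.0 or lsf[-1] >= math.pi:
        raise ValueError("LSFs must lie strictly inside (0, pi)")
    previous = [0.0] + list(lsf[:-1])
    return [(cur - prev) / math.pi for cur, prev in zip(lsf, previous)]


def build_supervectors(dlsf_frames, offset=1):
    """Cascade frame t-offset, frame t, and frame t+offset into one vector.

    Frames without both neighbors (the first and last `offset` frames)
    are skipped. The offset value is an assumption for illustration.
    """
    return [
        dlsf_frames[t - offset] + dlsf_frames[t] + dlsf_frames[t + offset]
        for t in range(offset, len(dlsf_frames) - offset)
    ]


# Toy input: four frames of four made-up LSFs each (radians).
lsf_frames = [
    [0.3, 0.9, 1.6, 2.4],
    [0.4, 1.0, 1.7, 2.5],
    [0.5, 1.1, 1.8, 2.6],
    [0.6, 1.2, 1.9, 2.7],
]
dlsf_frames = [lsf_to_dlsf(frame) for frame in lsf_frames]
supervectors = build_supervectors(dlsf_frames, offset=1)
```

Under these assumptions each supervector consists of three sub-vectors, each positive with a sum below one, which is the kind of bounded support a Dirichlet-type density is defined on.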
Speaker recognition, differential line spectral frequencies, super-Dirichlet variable, mixture models
Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:kth:diva-138449
ISI: 000316502201079
ScopusID: 2-s2.0-84865735959
ISBN: 978-1-61839-270-1
OAI: oai:DiVA.org:kth-138449
DiVA: diva2:687824
12th Annual Conference of the International Speech Communication Association, INTERSPEECH 2011; Florence; Italy; 27 August 2011 through 31 August 2011
QC 20140115
2014-01-15 2013-12-19 2014-01-15 Bibliographically approved