Modelling Speech Line Spectral Frequencies with Dirichlet Mixture Models
2010 (English) In: 11th Annual Conference of the International Speech Communication Association: Spoken Language Processing for All, INTERSPEECH 2010, 2010, 2370-2373 p. Conference paper (Refereed)
In this paper, we model the underlying probability density function (PDF) of the speech line spectral frequencies (LSF) parameters with a Dirichlet mixture model (DMM). The LSF parameters have two special features: 1) the LSF parameters have a bounded range; 2) the LSF parameters are in an increasing order. By transforming the LSF parameters to the ΔLSF parameters, the DMM can be used to model the ΔLSF parameters and take advantage of the features mentioned above. The distortion-rate (D-R) relation is derived for the Dirichlet distribution with the high rate assumption. A bit allocation strategy for DMM is also proposed. In modelling the LSF parameters extracted from the TIMIT database, the DMM shows a better performance compared to the Gaussian mixture model, in terms of D-R relation, likelihood and model complexity. Since modelling is the essential and prerequisite step in the PDF-optimized vector quantizer design, better modelling results indicate a superior quantization performance.
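The transformation mentioned in the abstract can be illustrated with a minimal sketch. This record does not state the exact mapping, so the sketch assumes that the ΔLSF vector consists of the gaps between consecutive LSFs on (0, π), normalized by π, plus the remaining gap up to π. Under that assumption, K ordered and bounded LSFs become K+1 positive values that sum to one, which is the support of a Dirichlet distribution. The function names `lsf_to_delta_lsf`, `dirichlet_log_pdf`, and `dmm_log_pdf` and all numeric values are hypothetical and are not taken from the paper.

```python
import math


def lsf_to_delta_lsf(lsf):
    """Map ordered LSFs in (0, pi) to a point on the probability simplex.

    Assumed mapping (not given in this record): normalized gaps between
    consecutive LSFs, with 0 and pi added as outer edges.
    """
    edges = [0.0] + list(lsf) + [math.pi]
    # The bounded range and the increasing order are exactly what make
    # every gap strictly positive.
    if any(b <= a for a, b in zip(edges, edges[1:])):
        raise ValueError("LSFs must be strictly increasing and lie in (0, pi)")
    return [(b - a) / math.pi for a, b in zip(edges, edges[1:])]


def dirichlet_log_pdf(x, alpha):
    """Log-density of a Dirichlet distribution with parameters alpha at x."""
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    return log_norm + sum((a - 1.0) * math.log(xi) for a, xi in zip(alpha, x))


def dmm_log_pdf(x, weights, alphas):
    """Log-density of a Dirichlet mixture, using log-sum-exp for stability."""
    terms = [math.log(w) + dirichlet_log_pdf(x, a)
             for w, a in zip(weights, alphas)]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


# Hypothetical 3-dimensional LSF vector (radians) and a 2-component mixture.
lsf = [0.3, 0.8, 1.5]
delta = lsf_to_delta_lsf(lsf)
weights = [0.6, 0.4]
alphas = [[2.0, 3.0, 4.0, 9.0], [1.5, 2.5, 3.5, 8.0]]
log_density = dmm_log_pdf(delta, weights, alphas)
```

As a sanity check on the sketch, setting every Dirichlet parameter to 1 gives the uniform density on the simplex, so for four components `dirichlet_log_pdf` returns log Γ(4) = log 6 regardless of the input point.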
speech coding, line spectral frequencies, mixture models, Dirichlet distribution
Other Electrical Engineering, Electronic Engineering, Information Engineering Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-33679, ISI: 000313086500205, ScopusID: 2-s2.0-79959816308, ISBN: 978-1-61782-123-3, OAI: oai:DiVA.org:kth-33679, DiVA: diva2:416997
11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010. Makuhari, Chiba. 26 September 2010 - 30 September 2010
QC 20111118. 2011-05-13, 2011-05-13, 2014-01-09. Bibliographically approved