Modelling Speech Line Spectral Frequencies with Dirichlet Mixture Models
KTH, School of Electrical Engineering (EES), Sound and Image Processing (Closed 130101).
KTH, School of Electrical Engineering (EES), Sound and Image Processing (Closed 130101).
2010 (English). In: 11th Annual Conference of the International Speech Communication Association: Spoken Language Processing for All, INTERSPEECH 2010, 2010, 2370-2373 p. Conference paper (Refereed)
Abstract [en]

In this paper, we model the underlying probability density function (PDF) of the speech line spectral frequencies (LSF) parameters with a Dirichlet mixture model (DMM). The LSF parameters have two special features: 1) the LSF parameters have a bounded range; 2) the LSF parameters are in an increasing order. By transforming the LSF parameters to the ΔLSF parameters, the DMM can be used to model the ΔLSF parameters and take advantage of the features mentioned above. The distortion-rate (D-R) relation is derived for the Dirichlet distribution with the high rate assumption. A bit allocation strategy for DMM is also proposed. In modelling the LSF parameters extracted from the TIMIT database, the DMM shows a better performance compared to the Gaussian mixture model, in terms of D-R relation, likelihood and model complexity. Since modelling is the essential and prerequisite step in the PDF-optimized vector quantizer design, better modelling results indicate a superior quantization performance.
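The transformation described in the abstract can be illustrated with a small sketch. The abstract does not spell out the ΔLSF definition, so the code below makes assumptions: LSFs are given in radians in (0, π), ΔLSF values are successive differences scaled by π, and the remaining gap up to π is appended as a final component so that the vector sums to one. The function names (`lsf_to_delta_lsf`, `dirichlet_log_pdf`, `dmm_log_pdf`) are hypothetical and not taken from the paper.

```python
import math


def lsf_to_delta_lsf(lsf):
    """Map ordered LSFs in (0, pi) to a point on the probability simplex.

    Assumed definition: normalized gaps between consecutive LSFs, including
    the gap from 0 to the first LSF and from the last LSF to pi.
    """
    if any(b <= a for a, b in zip(lsf, lsf[1:])):
        raise ValueError("LSFs must be strictly increasing")
    if lsf[0] <= 0.0 or lsf[-1] >= math.pi:
        raise ValueError("LSFs must lie in (0, pi)")
    edges = [0.0] + list(lsf) + [math.pi]
    return [(b - a) / math.pi for a, b in zip(edges, edges[1:])]


def dirichlet_log_pdf(x, alpha):
    """Log density of a Dirichlet distribution at a simplex point x."""
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    return log_norm + sum((a - 1.0) * math.log(v) for a, v in zip(alpha, x))


def dmm_log_pdf(x, weights, alphas):
    """Log density of a Dirichlet mixture, computed with log-sum-exp."""
    terms = [math.log(w) + dirichlet_log_pdf(x, a)
             for w, a in zip(weights, alphas)]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


# Illustrative values only, not data from the paper.
delta = lsf_to_delta_lsf([0.3, 1.0, 2.5])
log_density = dmm_log_pdf(
    delta,
    weights=[0.6, 0.4],
    alphas=[[2.0, 3.0, 4.0, 2.0], [5.0, 1.5, 2.0, 3.0]],
)
```

Under these assumptions the bounded range and the ordering of the LSFs become positivity and a sum-to-one constraint on the ΔLSF vector, which is exactly the support of a Dirichlet distribution.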

Place, publisher, year, edition, pages
2010. 2370-2373 p.
Keyword [en]
speech coding, line spectral frequencies, mixture models, Dirichlet distribution
National Category
Other Electrical Engineering, Electronic Engineering, Information Engineering; Computer Science
URN: urn:nbn:se:kth:diva-33679; ISI: 000313086500205; ScopusID: 2-s2.0-79959816308; ISBN: 978-1-61782-123-3; OAI: diva2:416997
11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010. Makuhari, Chiba. 26 September 2010 - 30 September 2010

QC 20111118

Available from: 2011-05-13 Created: 2011-05-13 Last updated: 2014-01-09 Bibliographically approved
In thesis
1. Non-Gaussian Statistical Models and Their Applications
2011 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Statistical modeling plays an important role in various research areas. It provides a way to connect the data with the statistics. Based on the statistical properties of the observed data, an appropriate model can be chosen that leads to a promising practical performance. The Gaussian distribution is the most popular and dominant probability distribution used in statistics, since it has an analytically tractable Probability Density Function (PDF) and analysis based on it can be derived in an explicit form. However, various data in real applications have bounded support or semi-bounded support. As the support of the Gaussian distribution is unbounded, such type of data is obviously not Gaussian distributed. Thus we can apply some non-Gaussian distributions, e.g., the beta distribution, the Dirichlet distribution, to model the distribution of this type of data. The choice of a suitable distribution is favorable for modeling efficiency. Furthermore, the practical performance based on the statistical model can also be improved by a better modeling.

An essential part in statistical modeling is to estimate the values of the parameters in the distribution or to estimate the distribution of the parameters, if we consider them as random variables. Unlike the Gaussian distribution or the corresponding Gaussian Mixture Model (GMM), a non-Gaussian distribution or a mixture of non-Gaussian distributions does not have an analytically tractable solution, in general. In this dissertation, we study several estimation methods for the non-Gaussian distributions. For the Maximum Likelihood (ML) estimation, a numerical method is utilized to search for the optimal solution in the estimation of Dirichlet Mixture Model (DMM). For the Bayesian analysis, we utilize some approximations to derive an analytically tractable solution to approximate the distribution of the parameters. The Variational Inference (VI) framework based method has been shown to be efficient for approximating the parameter distribution by several researchers. Under this framework, we adapt the conventional Factorized Approximation (FA) method to the Extended Factorized Approximation (EFA) method and use it to approximate the parameter distribution in the beta distribution. Also, the Local Variational Inference (LVI) method is applied to approximate the predictive distribution of the beta distribution. Finally, by assigning a beta distribution to each element in the matrix, we proposed a variational Bayesian Nonnegative Matrix Factorization (NMF) for bounded support data.

The performances of the proposed non-Gaussian model based methods are evaluated by several experiments. The beta distribution and the Dirichlet distribution are applied to model the Line Spectral Frequency (LSF) representation of the Linear Prediction (LP) model for statistical model based speech coding. For some image processing applications, the beta distribution is also applied. The proposed beta distribution based variational Bayesian NMF is applied for image restoration and collaborative filtering. Compared to some conventional statistical model based methods, the non-Gaussian model based methods show a promising improvement.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2011. xii, 49 p.
Trita-EE, ISSN 1653-5146
National Category
Telecommunications; Computer and Information Science
urn:nbn:se:kth:diva-47408 (URN); 978-91-7501-158-5 (ISBN)
Public defence
2011-12-05, E1, Lindstedsvägen 3, KTH, Stockholm, 09:00 (English)
QC 20111115
Available from: 2011-11-15 Created: 2011-11-08 Last updated: 2011-11-15 Bibliographically approved

By author/editor
Ma, Zhanyu; Leijon, Arne