Creating unseen triphones by phone concatenation in the spectral, cepstral and formant domains
KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
1997 (English) Conference paper (Refereed)
Abstract [en]

A technique for predicting triphones by concatenation of diphone or monophone models is studied. The models are connected using linear interpolation between endpoints of piecewise linear parameter trajectories. Three types of spectral representation are compared: formants, filter amplitudes and cepstrum coefficients. The proposed technique lowers the spectral distortion of the phones for all three representations when different speakers are used for training and evaluation. The average error of the created triphones is lower in the filter and cepstrum domains than for formants. This is attributed to limitations in the Analysis-by-Synthesis formant tracking algorithm. A small improvement with the proposed technique is achieved for all representations in the task of reordering N-best sentence recognition candidate lists.
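
The connection step mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation, which is not available in this record. It assumes each phone model is a single parameter track stored as a list of (time, value) breakpoints. The function names `concatenate` and `evaluate`, the `gap` parameter and the example values in Hz are all hypothetical.

```python
from bisect import bisect_right

def concatenate(left, right, gap):
    """Join two piecewise linear trajectories given as [(time, value), ...].

    Assumption for illustration: the right trajectory is shifted so that it
    starts `gap` seconds after the left one ends. The straight line between
    the two facing endpoints then forms the transition.
    """
    if gap <= 0:
        raise ValueError("gap must be positive")
    offset = left[-1][0] + gap - right[0][0]
    return list(left) + [(t + offset, v) for t, v in right]

def evaluate(traj, t):
    """Linearly interpolate the trajectory at time t, clamped at both ends."""
    times = [point[0] for point in traj]
    if t <= times[0]:
        return traj[0][1]
    if t >= times[-1]:
        return traj[-1][1]
    i = bisect_right(times, t)  # index of the first breakpoint after t
    (t0, v0), (t1, v1) = traj[i - 1], traj[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)

# Made-up second-formant tracks (seconds, Hz) for two adjacent phone models.
left_model = [(0.0, 1200.0), (0.05, 1400.0)]
right_model = [(0.0, 1800.0), (0.04, 1700.0)]
joined = concatenate(left_model, right_model, gap=0.02)
midpoint_value = evaluate(joined, 0.06)  # halfway through the transition
```

Under these assumptions the same two functions would apply unchanged to filter amplitudes or cepstrum coefficients, one track per coefficient.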

Place, publisher, year, edition, pages
1997. 41-44 p.
National Category
Computer and Information Science
URN: urn:nbn:se:kth:diva-91231
OAI: diva2:508924
Proc of Fonetik -97, Dept of Phonetics, Umeå Univ
NR 20140805
Available from: 2012-03-11 Created: 2012-03-11 Bibliographically approved

Open Access in DiVA

No full text

Author
Blomberg, Mats