Nonlinear Frequency Warp for Speech Recognition
1986 (English) Conference paper (Refereed)
A technique of nonlinear frequency warping has been investigated for recognition of Swedish vowels. A frequency warp between two spectra is computed using a standard dynamic programming algorithm. The frequency distance, defined as the area between the obtained warping function and the diagonal, contributes to the spectral distance. The distance between two spectra is a weighted sum of the warped amplitude distance and the frequency distance. By changing the two weights, we get a gradual shift between non-warped amplitude distance, warped amplitude distance, and frequency distance. In recognition experiments on natural and synthetic vowel spectra, a metric combining the frequency and amplitude distances gave better results than using only amplitude or frequency deviation. Analysis of the results for the synthetic vowels shows a reduced sensitivity to voice source and pitch variation. For the natural vowels, the recognition improvement is larger for the male and female speakers separately than for the combined groups.
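The distance described in the abstract can be illustrated with a minimal sketch. It is not the paper's formulation: the function name `warp_distance`, the weights `w_amp` and `w_freq`, the absolute-difference local cost, the three-step DP pattern, and the choice to fold the frequency penalty into the DP cost are all assumptions made for illustration. The area between the warping function and the diagonal is approximated by summing |i - j| along the warping path.

```python
def warp_distance(a, b, w_amp=1.0, w_freq=0.0):
    """Sketch of a warped spectral distance between two amplitude spectra.

    a, b   : sequences of amplitudes (e.g. dB) over frequency bins
    w_amp  : hypothetical weight on the amplitude distance
    w_freq : hypothetical weight on the frequency distance
    Returns (total, amp, freq, path).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * m for _ in range(n)]   # accumulated cost up to (i, j)
    back = [[None] * m for _ in range(n)]  # predecessor of (i, j)

    for i in range(n):
        for j in range(m):
            # Assumed local cost: amplitude mismatch plus deviation
            # from the diagonal (the frequency penalty).
            local = w_amp * abs(a[i] - b[j]) + w_freq * abs(i - j)
            if i == 0 and j == 0:
                cost[i][j] = local
                continue
            best, prev = inf, None
            # Diagonal step is tried first, so it wins ties.
            for di, dj in ((1, 1), (1, 0), (0, 1)):
                pi, pj = i - di, j - dj
                if pi >= 0 and pj >= 0 and cost[pi][pj] < best:
                    best, prev = cost[pi][pj], (pi, pj)
            cost[i][j] = best + local
            back[i][j] = prev

    # Backtrack from the end point to recover the warping function.
    path = []
    node = (n - 1, m - 1)
    while node is not None:
        path.append(node)
        node = back[node[0]][node[1]]
    path.reverse()

    amp = sum(abs(a[i] - b[j]) for i, j in path)  # warped amplitude distance
    freq = sum(abs(i - j) for i, j in path)       # approx. area off the diagonal
    return w_amp * amp + w_freq * freq, amp, freq, path


# Two toy spectra with the same peak shifted by one bin.
a = [0, 0, 10, 0, 0, 0]
b = [0, 0, 0, 10, 0, 0]
free = warp_distance(a, b, w_amp=1.0, w_freq=0.0)     # warp is unconstrained
rigid = warp_distance(a, b, w_amp=1.0, w_freq=1000.0) # warp pinned to diagonal
```

Under these assumptions, a zero frequency weight lets the warp absorb the peak shift, so the mismatch shows up only as frequency distance, while a very large frequency weight pins the path to the diagonal and recovers the non-warped amplitude distance. Intermediate weights trade one against the other.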
Place, publisher, year, edition, pages
1986. 2631-2634 p.
Computer and Information Science
Identifiers
URN: urn:nbn:se:kth:diva-93606
OAI: oai:DiVA.org:kth-93606
DiVA: diva2:517149
Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '86.
NR 20140805 2012-04-21 2012-04-21 Bibliographically approved