Comparing phoneme and feature based speech recognition using artificial neural networks
1992 (English). Conference paper (Refereed)
An artificial neural network has been trained by the error backpropagation technique to recognise phonemes and words. The speech material was recorded by a male Swedish talker and was labelled by a phonetician. There were 38 output nodes corresponding to Swedish phonemes. The training algorithm was somewhat modified to increase the training speed. Introducing coarticulation information by adding simple recurrency to the net is shown to be more effective than expanding the size of the input spectral window. The phoneme recognition network was used with dynamic programming for time alignment to recognise connected digits. It was compared to a similar recogniser based on nine quasi-phonetic features instead of 38 phonemes. The phoneme-based system performed better than the feature-based one.
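The abstract does not give the details of the time-alignment step, so the following is only a minimal sketch of one common way to do it: a Viterbi-style dynamic programming pass that aligns per-frame phoneme log-probabilities (as a network with phoneme output nodes might produce) to the left-to-right phoneme sequence of one word. The function name `align`, the dictionary-per-frame input format, and the two-phoneme toy word are assumptions made for illustration, not the paper's implementation.

```python
import math


def align(frame_log_probs, phoneme_seq):
    """Best monotonic alignment of frames to a left-to-right phoneme sequence.

    frame_log_probs: one dict per frame, mapping phoneme label -> log probability
                     (hypothetical network output format).
    phoneme_seq:     phoneme labels of one word (hypothetical lexicon entry).
    Returns (total_log_score, state_index_per_frame).
    """
    n_frames, n_states = len(frame_log_probs), len(phoneme_seq)
    if n_states == 0 or n_frames < n_states:
        raise ValueError("need at least one frame per phoneme")

    neg_inf = -math.inf
    score = [[neg_inf] * n_states for _ in range(n_frames)]
    back = [[0] * n_states for _ in range(n_frames)]

    # The alignment must start in the first phoneme of the word.
    score[0][0] = frame_log_probs[0][phoneme_seq[0]]

    for t in range(1, n_frames):
        for s in range(n_states):
            stay = score[t - 1][s]                          # remain in the same phoneme
            advance = score[t - 1][s - 1] if s > 0 else neg_inf  # move to the next phoneme
            if advance > stay:
                best, prev = advance, s - 1
            else:
                best, prev = stay, s
            if best > neg_inf:
                score[t][s] = best + frame_log_probs[t][phoneme_seq[s]]
            back[t][s] = prev

    # Trace back from the last phoneme at the last frame.
    path = [n_states - 1]
    for t in range(n_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return score[-1][-1], path


if __name__ == "__main__":
    # Toy example: three frames, a word modelled as the phonemes "e" then "t".
    frames = [
        {"e": math.log(0.9), "t": math.log(0.1)},
        {"e": math.log(0.8), "t": math.log(0.2)},
        {"e": math.log(0.1), "t": math.log(0.9)},
    ]
    total, path = align(frames, ["e", "t"])
    print(total, path)
```

In a connected-digit recogniser, a score of this kind would be computed for each candidate digit sequence and the highest-scoring one chosen; how the paper handles word boundaries and duration is not stated in the abstract.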
Place, publisher, year, edition, pages
1992. 1279-1282 p.
Computer and Information Science
Identifiers
URN: urn:nbn:se:kth:diva-91464
OAI: oai:DiVA.org:kth-91464
DiVA: diva2:510363
Proceedings ICSLP 92
NR 20140805; 2012-03-15; 2012-03-15; Bibliographically approved