Dynamic behaviour of connectionist speech recognition with strong latency constraints
2006 (English). In: Speech Communication, ISSN 0167-6393, Vol. 48, no 7, 802-818 p. Article in journal (Refereed), Published
This paper describes the use of connectionist techniques in phonetic speech recognition with strong latency constraints. The constraints are imposed by the task of deriving the lip movements of a synthetic face in real time from the speech signal, by feeding the phonetic string into an articulatory synthesiser. Particular attention has been paid to analysing the interaction between the time evolution model learnt by the multi-layer perceptrons and the transition model imposed by the Viterbi decoder, in different latency conditions. Two experiments were conducted in which the time dependencies in the language model (LM) were controlled by a parameter. The results show a strong interaction between the three factors involved, namely the neural network topology, the length of time dependencies in the LM and the decoder latency.
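The latency constraint on the decoder mentioned in the abstract can be illustrated with a generic fixed-lag Viterbi decoder, which commits to the state of frame t - lag as soon as frame t has been observed. This is a minimal sketch under stated assumptions, not the implementation used in the paper: the function name `fixed_lag_viterbi`, its parameters, and the two-state toy model are all hypothetical, and the per-frame scores stand in for whatever a trained network would output.

```python
import math


def fixed_lag_viterbi(log_obs, log_trans, log_init, lag):
    """Generic fixed-lag Viterbi sketch (hypothetical helper, not from the paper).

    log_obs[t][s]   : log score of frame t under state s
    log_trans[i][j] : log probability of moving from state i to state j
    log_init[s]     : log probability of starting in state s
    lag             : number of look-ahead frames (0 = decide immediately)
    Returns one decoded state index per frame.
    """
    n_frames = len(log_obs)
    n_states = len(log_init)
    if n_frames == 0:
        return []

    def traceback(delta, backptr, steps):
        # Start from the best state at the newest frame, then follow
        # back pointers 'steps' frames into the past.
        state = max(range(n_states), key=delta.__getitem__)
        for k in range(steps):
            state = backptr[len(backptr) - 1 - k][state]
        return state

    delta = [log_init[s] + log_obs[0][s] for s in range(n_states)]
    backptr = []  # backptr[t - 1][j] = best predecessor of state j at frame t
    decoded = []  # a real-time system would keep only the last 'lag' pointers

    for t in range(n_frames):
        if t > 0:
            ptrs, new_delta = [], []
            for j in range(n_states):
                scores = [delta[i] + log_trans[i][j] for i in range(n_states)]
                best_i = max(range(n_states), key=scores.__getitem__)
                ptrs.append(best_i)
                new_delta.append(scores[best_i] + log_obs[t][j])
            delta = new_delta
            backptr.append(ptrs)
        if t >= lag:
            # Commit to the state of frame t - lag using evidence up to frame t.
            decoded.append(traceback(delta, backptr, lag))

    # End of input: flush the frames still waiting for look-ahead.
    for steps in range(min(lag, n_frames) - 1, -1, -1):
        decoded.append(traceback(delta, backptr, steps))
    return decoded


def logs(row):
    return [math.log(p) for p in row]


# Toy two-state model (assumed values): "sticky" transitions and a single
# frame in the middle whose scores briefly favour state 1.
log_init = logs([0.6, 0.4])
log_trans = [logs([0.9, 0.1]), logs([0.1, 0.9])]
log_obs = [logs([0.9, 0.1]), logs([0.05, 0.95]), logs([0.9, 0.1])]

print(fixed_lag_viterbi(log_obs, log_trans, log_init, lag=0))  # [0, 1, 0]
print(fixed_lag_viterbi(log_obs, log_trans, log_init, lag=1))  # [0, 0, 0]
```

In this toy case, zero look-ahead follows the momentary spike in the frame scores, while a single frame of look-ahead lets the transition model override it. The example only shows the mechanism by which latency and the transition model interact; it does not reproduce the paper's experiments or results.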
Place, publisher, year, edition, pages
Elsevier, 2006. Vol. 48, no 7, 802-818 p.
Keywords: speech recognition; neural network; low latency; non-linear dynamics
Fluid Mechanics and Acoustics; Computer Science; Specific Languages
Identifiers
URN: urn:nbn:se:kth:diva-6151
DOI: 10.1016/j.specom.2005.05.005
ISI: 000239178600004
ScopusID: 2-s2.0-33745001617
OAI: oai:DiVA.org:kth-6151
DiVA: diva2:10780
Research Workshop on Non-Linear Speech Processing (NOLISP), Le Croisic, France, May 20-23, 2003
QC 20100630
2006-09-21, 2006-09-21, 2013-09-12
Bibliographically approved