Voice Transformations For Improving Children's Speech Recognition In A Publicly Available Dialogue System
2002 (English). In: Proceedings of ICSLP 02, 2002. Conference paper (Other academic)
To build acoustic models for children that can be used in spoken dialogue systems, speech data has to be collected. Commercial recognizers available for Swedish are trained on adult speech, which makes them less suitable for children's computer-directed speech. This paper describes some experiments with on-the-fly voice transformation of children's speech. Two transformation methods were tested, one inspired by the Phase Vocoder algorithm and another by the Time-Domain Pitch-Synchronous Overlap-Add (TD-PSOLA) algorithm. The speech signal is transformed before being sent to a speech recognizer trained on adult speech. Our results show that this method reduces error rates by on the order of thirty to forty-five percent for child users.
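The idea of transforming a child's voice before recognition can be illustrated with a much simpler stand-in than either method named in the abstract. The sketch below is neither the Phase Vocoder nor TD-PSOLA: it stretches the waveform by linear interpolation, which lowers every frequency by a fixed factor but also lengthens the signal, a side effect that methods such as TD-PSOLA are designed to avoid. The function names `lower_voice` and `dominant_frequency`, the 16 kHz sample rate, and the factor 0.8 are assumptions chosen for illustration and do not come from the paper.

```python
import numpy as np


def lower_voice(signal, factor):
    """Scale all frequencies by `factor` (0 < factor <= 1) via resampling.

    Illustrative stand-in only: the output is longer than the input by
    1/factor, so duration is not preserved.
    """
    if not 0.0 < factor <= 1.0:
        raise ValueError("factor must be in (0, 1]")
    signal = np.asarray(signal, dtype=float)
    n_out = int(round(len(signal) / factor))
    # Fractional read positions in the input for each output sample.
    read_positions = np.arange(n_out) * factor
    return np.interp(read_positions, np.arange(len(signal)), signal)


def dominant_frequency(signal, sample_rate):
    """Return the frequency (Hz) of the strongest spectral peak."""
    windowed = signal * np.hanning(len(signal))
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return float(freqs[np.argmax(spectrum)])


if __name__ == "__main__":
    sample_rate = 16000  # assumed sample rate, in Hz
    t = np.arange(sample_rate) / sample_rate
    tone = np.sin(2 * np.pi * 400.0 * t)  # synthetic test tone, not speech
    lowered = lower_voice(tone, 0.8)
    print(dominant_frequency(tone, sample_rate))
    print(dominant_frequency(lowered, sample_rate))
```

In a pipeline like the one the abstract describes, a transformation step of this kind would sit between audio capture and the recognizer, so that the recognizer receives the lowered signal rather than the original.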
Engineering and Technology
Identifiers
URN: urn:nbn:se:kth:diva-13339
OAI: oai:DiVA.org:kth-13339
DiVA: diva2:323753
International Conference on Spoken Language Processing
QC 20100611
2010-06-11, 2010-06-11, 2010-06-11
Bibliographically approved