Effect of MPEG audio compression on HMM-based speech synthesis
2013 (English). In: Proceedings of the 14th Annual Conference of the International Speech Communication Association: Interspeech 2013. International Speech Communication Association (ISCA), 2013, 1062-1066 p. Conference paper (Refereed)
In this paper, the effect of MPEG audio compression on HMM-based speech synthesis is studied. Speech signals are encoded with various compression rates and analyzed using the GlottHMM vocoder. Objective evaluation results show that the vocoder parameters start to degrade when encoding with bit rates of 32 kbit/s or less, which is also confirmed by the subjective evaluation of the vocoder analysis-synthesis quality. Experiments with HMM-based speech synthesis show that the subjective quality of a synthetic voice trained with 32 kbit/s speech is comparable to that of a voice trained with uncompressed speech, but lower bit rates induce clear degradation in quality.
Place, publisher, year, edition, pages
2013. 1062-1066 p.
Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, ISSN 2308-457X
GlottHMM, HMM, MP3, Speech synthesis, Audio signal processing, Motion Picture Experts Group standards, Vocoders, Analysis-synthesis, HMM-based speech synthesis, Objective evaluation, Subjective evaluations, Subjective quality, Quality control
Language Technology (Computational Linguistics)
Identifiers: URN: urn:nbn:se:kth:diva-150864; ScopusID: 2-s2.0-84906262154; OAI: oai:DiVA.org:kth-150864; DiVA: diva2:745851
14th Annual Conference of the International Speech Communication Association, INTERSPEECH 2013, 25 August 2013 through 29 August 2013, Lyon, France
QC 20140911. 2014-09-11. Bibliographically approved