A Dual Channel Coupled Decoder for Fillers and Feedback
2011 (English) In: INTERSPEECH 2011, 12th Annual Conference of the International Speech Communication Association, 2011, 3097-3100 p. Conference paper (Refereed)
This study presents a dual channel decoder capable of modeling cross-speaker dependencies for the segmentation and classification of fillers and feedbacks in conversational speech from the DEAL corpus. For the same number of Gaussians per state, we show improvements in average F-score from the successive addition of 1) an increased frame rate from 10 ms to 50 ms; 2) Joint Maximum Cross-Correlation (JMXC) features in a single channel decoder; 3) a joint transition matrix that captures dependencies symmetrically across the two channels; and 4) coupled acoustic model retraining applied symmetrically across the two channels. The final step gives a relative improvement of over 100% for fillers and feedbacks compared to our previously published results. The F-scores are in a range that makes it possible to use the decoder as both a voice activity detector and an illocutionary act decoder for semi-automatic annotation.
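As an illustration of step 3, the sketch below shows one way a joint transition matrix over two channels could be estimated from frame-aligned label sequences. It is not the decoder described in the paper: the label set, the function name `joint_transition_matrix`, and the reading of "symmetrically" as counting each transition a second time with the channels swapped are all assumptions made for this example.

```python
from collections import Counter
from itertools import product

# Hypothetical label set; the paper's actual state inventory is not given here.
LABELS = ["silence", "speech", "filler", "feedback"]


def joint_transition_matrix(chan_a, chan_b, labels=LABELS):
    """Estimate P(next joint state | current joint state).

    A joint state is a pair (label on channel A, label on channel B).
    Assumption: symmetry is enforced by also counting every transition
    with the two channels swapped.
    """
    if len(chan_a) != len(chan_b):
        raise ValueError("channels must be frame-aligned")

    pairs = list(zip(chan_a, chan_b))
    counts = Counter()
    for (a0, b0), (a1, b1) in zip(pairs, pairs[1:]):
        counts[((a0, b0), (a1, b1))] += 1
        counts[((b0, a0), (b1, a1))] += 1  # channel-swapped copy

    states = list(product(labels, repeat=2))
    matrix = {}
    for src in states:
        total = sum(counts[(src, dst)] for dst in states)
        if total:  # skip joint states never observed as a source
            matrix[src] = {
                dst: counts[(src, dst)] / total
                for dst in states
                if counts[(src, dst)]
            }
    return matrix


if __name__ == "__main__":
    a = ["speech", "speech", "filler", "silence"]
    b = ["silence", "feedback", "silence", "speech"]
    for src, row in joint_transition_matrix(a, b).items():
        print(src, "->", row)
```

With this construction, the probability of moving from (a, b) to (a', b') equals that of moving from (b, a) to (b', a'), so neither speaker is treated as the primary channel.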
Conversation, Coupled hidden Markov models, Cross-speaker modeling, Feedback, Filler
Computer Science; Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:kth:diva-52193; ISI: 000316502201265; ScopusID: 2-s2.0-8486579156; ISBN: 978-1-61839-270-1; OAI: oai:DiVA.org:kth-52193; DiVA: diva2:465491
INTERSPEECH 2011, 12th Annual Conference of the International Speech Communication Association. Florence, Italy. 28-31 August 2011