Auditory model based optimization of MFCCs improves automatic speech recognition performance
2009 (English) In: INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, 2009, 2943-2946 p. Conference paper (Refereed)
Using a spectral auditory model along with perturbation based analysis, we develop a new framework to optimize a set of features such that it emulates the behavior of the human auditory system. The optimization is carried out in an off-line manner based on the conjecture that the local geometries of the feature domain and the perceptual auditory domain should be similar. Using this principle, we modify and optimize the static mel frequency cepstral coefficients (MFCCs) without considering any feedback from the speech recognition system. We show that improved recognition performance is obtained for any environmental condition, clean as well as noisy.
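The abstract's conjecture, that small perturbations of a signal should produce similarly shaped distance patterns in the feature domain and in the perceptual domain, can be illustrated with a toy sketch. The abstract does not specify the auditory model or the optimization objective, so everything below is an assumption made for illustration: `stand_in_perceptual` is a hypothetical placeholder (cube-root compressed mel energies), not the spectral auditory model used in the paper, and `local_geometry_mismatch` is one possible way to score the disagreement between the two domains, not the authors' criterion.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_bins, sample_rate):
    """Triangular mel filters over n_bins frequencies spaced linearly in [0, sample_rate/2]."""
    freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2))
    fb = np.zeros((n_filters, n_bins))
    for i in range(n_filters):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def static_mfcc(power, fb, n_ceps=13):
    """Static MFCCs of one frame: log mel energies followed by an unnormalized DCT-II."""
    log_e = np.log(fb @ power + 1e-10)
    n = len(log_e)
    k = np.arange(n_ceps)[:, None]
    j = np.arange(n)[None, :]
    return np.cos(np.pi * k * (j + 0.5) / n) @ log_e

def stand_in_perceptual(power, fb):
    """Hypothetical placeholder for an auditory model (assumption, not from the paper)."""
    return np.cbrt(fb @ power)

def local_geometry_mismatch(power, feature_fn, perceptual_fn,
                            n_perturb=50, eps=1e-3, seed=0):
    """Compare normalized distance profiles of the two domains under small perturbations."""
    rng = np.random.default_rng(seed)
    f0, p0 = feature_fn(power), perceptual_fn(power)
    d_feat, d_perc = [], []
    for _ in range(n_perturb):
        # Small multiplicative perturbation of the power spectrum, kept positive.
        x = np.maximum(power + eps * power * rng.standard_normal(power.shape), 1e-12)
        d_feat.append(np.linalg.norm(feature_fn(x) - f0))
        d_perc.append(np.linalg.norm(perceptual_fn(x) - p0))
    d_feat = np.array(d_feat) / np.sum(d_feat)
    d_perc = np.array(d_perc) / np.sum(d_perc)
    # Zero when both domains rank and scale the perturbations identically.
    return float(np.mean((d_feat - d_perc) ** 2))

# Usage on a synthetic power spectrum (129 bins, 16 kHz sampling assumed).
power = np.random.default_rng(1).random(129) + 0.1
fb = mel_filterbank(20, 129, 16000)
score = local_geometry_mismatch(
    power,
    lambda p: static_mfcc(p, fb),
    lambda p: stand_in_perceptual(p, fb),
)
```

In an offline optimization of the kind the abstract describes, a score like this would be minimized over the parameters of the feature computation, without any feedback from the recognizer; which parameters are adjusted is not stated in the abstract.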
Place, publisher, year, edition, pages
2009. 2943-2946 p.
ASR, Auditory model, MFCC
Identifiers
URN: urn:nbn:se:kth:diva-11468; ISI: 000276842801277; ScopusID: 2-s2.0-70450221097; ISBN: 978-1-61567-692-7; OAI: oai:DiVA.org:kth-11468; DiVA: diva2:276951
10th Annual Conference of the International Speech Communication Association, INTERSPEECH 2009; Brighton; 6 September 2009 - 10 September 2009
QC 20101015
2009-11-13; 2009-11-13; 2012-09-14; Bibliographically approved