An uncertainty decoding approach to noise- and reverberation-robust speech recognition
KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
ORCID iD: 0000-0003-0448-3786
2013 (English)
In: ICASSP IEEE Int Conf Acoust Speech Signal Process Proc, 2013, 7388-7392 p.
Conference paper, Published paper (Refereed)
Abstract [en]

The generic REMOS (REverberation MOdeling for robust Speech recognition) concept is extended in this contribution to cope with additional noise components. REMOS originally embeds an explicit reverberation model into a hidden Markov model (HMM), leading to a relaxed conditional independence assumption for the observed feature vectors. During recognition, a nonlinear optimization problem has to be solved in order to adapt the HMMs' output probability density functions to the current reverberation conditions. The extension for additional noise components necessitates a modified numerical solver for the nonlinear optimization problem. We propose an approximation scheme based on continuous piecewise linear regression. Connected-digit recognition experiments demonstrate the potential of REMOS in reverberant and noisy environments. They furthermore reveal that the benefit of an explicit reverberation model, overcoming the conditional independence assumption, increases with increasing signal-to-noise ratios.
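The abstract names continuous piecewise linear regression as the basis of the approximation scheme but gives no details. The following is only a generic sketch of that regression technique, not the REMOS solver: it fits a continuous piecewise linear function by least squares over a hinge basis with fixed knots. The function names, the knot positions, and the toy target (a softplus curve, chosen arbitrarily as a smooth nonlinearity) are all assumptions made for illustration.

```python
import math


def hinge_features(x, knots):
    """Basis [1, x, max(0, x - k1), ...]; any linear combination of these
    is continuous and piecewise linear with breakpoints at the knots."""
    return [1.0, x] + [max(0.0, x - k) for k in knots]


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_cpwl(xs, ys, knots):
    """Least-squares fit of a continuous piecewise linear function
    (hypothetical helper; solves the normal equations of the hinge basis)."""
    rows = [hinge_features(x, knots) for x in xs]
    p = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    return solve_linear_system(ata, aty)


def predict_cpwl(coeffs, x, knots):
    """Evaluate the fitted function at x."""
    return sum(c * f for c, f in zip(coeffs, hinge_features(x, knots)))


# Toy target: an arbitrary smooth nonlinearity, NOT taken from the paper.
xs = [-6.0 + 0.1 * i for i in range(121)]
ys = [math.log1p(math.exp(x)) for x in xs]
knots = [-2.0, 0.0, 2.0]  # assumed, hand-picked breakpoints

coeffs = fit_cpwl(xs, ys, knots)
max_err = max(abs(predict_cpwl(coeffs, x, knots) - y) for x, y in zip(xs, ys))
print("coefficients:", coeffs)
print("max abs error on samples:", max_err)
```

With fixed knots the fit reduces to ordinary linear least squares, which is why the hinge basis is convenient; choosing the knots themselves is a separate problem that this sketch does not address.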

Place, publisher, year, edition, pages
2013. 7388-7392 p.
Series
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, ISSN 1520-6149
Keyword [en]
automatic speech recognition, noise robustness, piecewise linear regression, reverberation robustness, uncertainty decoding
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
URN: urn:nbn:se:kth:diva-140040
DOI: 10.1109/ICASSP.2013.6639098
Scopus ID: 2-s2.0-84890473474
ISBN: 9781479903566 (print)
OAI: oai:DiVA.org:kth-140040
DiVA: diva2:689556
Conference
2013 38th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013, 26 May 2013 through 31 May 2013, Vancouver, BC
Note

QC 20140121

Available from: 2014-01-21 Created: 2014-01-16 Last updated: 2014-01-21
Bibliographically approved

Open Access in DiVA

No full text

By author/editor
Thippur, Akshaya