Spectral Dynamics Recovery for Enhanced Speech Intelligibility in Noise
2015 (English). In: IEEE/ACM Transactions on Speech and Language Processing, Vol. 23, no 2, 327-338 p. Article in journal (Refereed), Published
Speech intelligibility in noisy environments decreases with an increase in the noise power. We hypothesize that the differences of subsequent short-term spectra of speech, which we collectively refer to as the speech spectral dynamics, can be used to characterize speech intelligibility. We propose a distortion measure to characterize the deviation of the dynamics of the noisy modified speech from the dynamics of natural speech. Optimizing this distortion measure, we derive a parametric relationship between the signal band-power before and after modification. The parametric nature of the solution ensures adaptation to the noise level, the speech statistics and a penalty on the power gain. A multi-band speech modification system based on the single-band optimal solution is designed under a total signal power constraint and evaluated in selected noise conditions. The results indicate that the proposed approach compares favorably to a reference method based on optimizing a measure of the speech intelligibility index. Very low computational complexity and high intelligibility gain make this an attractive approach for speech modification in a wide range of application scenarios.
Keywords: Environment adaptation, speech intelligibility enhancement, speech modification
Identifiers: URN: urn:nbn:se:kth:diva-145641; DOI: 10.1109/TASLP.2014.2384271; ISI: 000348210300009; ScopusID: 2-s2.0-84921651956; OAI: oai:DiVA.org:kth-145641; DiVA: diva2:719374
Updated from "Pre-print" to "Article". QC 20150227. 2014-05-23; 2014-05-23; 2015-02-27. Bibliographically approved