Speech denoising through source separation and min-max tracking
KTH, School of Electrical Engineering (EES), Sound and Image Processing.
KTH, School of Electrical Engineering (EES), Sound and Image Processing.
(English) In: IEEE Signal Processing Letters, ISSN 1070-9908, E-ISSN 1558-2361
Article in journal (Refereed) Submitted
National Category
Telecommunications
Identifiers
URN: urn:nbn:se:kth:diva-7737
OAI: oai:DiVA.org:kth-7737
DiVA: diva2:12852
Note
QC 20100929
Available from: 2005-10-20 Created: 2005-10-20 Last updated: 2010-09-29 Bibliographically approved
In thesis
1. Knowledge-based speech enhancement
2005 (English) Doctoral thesis, comprehensive summary (Other scientific)
Abstract [en]

Speech is a fundamental means of human communication. In the last several decades, much effort has been devoted to the efficient transmission and storage of speech signals. With advances in technology making mobile communication ubiquitous, communication anywhere has become a reality. The freedom and flexibility offered by mobile technology bring with them new challenges, one of which is robustness to acoustic background noise. Speech enhancement systems form a vital front-end for mobile telephony in noisy environments such as cars, cafeterias, and subway stations, for hearing aids, and for improving the performance of speech recognition systems.

In this thesis, which consists of four research articles, we discuss both single and multi-microphone approaches to speech enhancement. The main contribution of this thesis is a framework to exploit available prior knowledge about both speech and noise. The physiology of speech production places a constraint on the possible shapes of the speech spectral envelope, and this information is captured using codebooks of speech linear predictive (LP) coefficients obtained from a large training database. Similarly, information about commonly occurring noise types is captured using a set of noise codebooks, which can be combined with sound environment classification to treat different environments differently. In paper A, we introduce maximum-likelihood estimation of the speech and noise LP parameters using the codebooks. The codebooks capture only the spectral shape. The speech and noise gain factors are obtained through a frame-by-frame optimization, providing good performance in practical nonstationary noise environments. The estimated parameters are subsequently used in a Wiener filter. Paper B describes Bayesian minimum mean squared error estimation of the speech and noise LP parameters and functions thereof, while retaining the instantaneous gain computation. Both memoryless and memory-based estimators are derived.
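
The codebook-driven pipeline outlined for paper A (shape from codebooks, per-frame gains, then a Wiener filter) can be illustrated with a rough sketch. This is not the thesis's estimator: the abstract gives no formulas, so the sketch assumes a nonnegative least-squares fit for the two gains and an Itakura-Saito distance for choosing the codebook pair, both as stand-ins for the maximum-likelihood criterion. All function names (`ar_envelope`, `fit_gains`, `codebook_wiener`) and the toy codebooks in the usage note are hypothetical.

```python
import numpy as np

def ar_envelope(a, n_freq):
    """Unit-gain AR envelope 1/|A(e^{jw})|^2 for LP coefficients a = [1, a1, ..., ap]."""
    A = np.fft.rfft(a, n=2 * (n_freq - 1))  # n_freq bins from 0 to pi
    return 1.0 / np.abs(A) ** 2

def fit_gains(noisy_psd, sx, sw):
    """Nonnegative least-squares fit of noisy_psd ~ gx*sx + gw*sw.

    Assumption: least squares replaces the ML gain optimization in the thesis.
    """
    basis = np.stack([sx, sw], axis=1)
    g, *_ = np.linalg.lstsq(basis, noisy_psd, rcond=None)
    if np.all(g >= 0):
        return g[0], g[1]
    # Unconstrained solution has a negative gain: try each envelope alone.
    gx_only = max(sx @ noisy_psd / (sx @ sx), 0.0)
    gw_only = max(sw @ noisy_psd / (sw @ sw), 0.0)
    err_x = np.sum((noisy_psd - gx_only * sx) ** 2)
    err_w = np.sum((noisy_psd - gw_only * sw) ** 2)
    return (gx_only, 0.0) if err_x <= err_w else (0.0, gw_only)

def codebook_wiener(noisy_psd, speech_cb, noise_cb, eps=1e-12):
    """Search all codebook pairs for one frame and return the Wiener gain per bin."""
    n_freq = noisy_psd.shape[0]
    best = None
    for a_x in speech_cb:
        sx = ar_envelope(a_x, n_freq)
        for a_w in noise_cb:
            sw = ar_envelope(a_w, n_freq)
            gx, gw = fit_gains(noisy_psd, sx, sw)
            model = gx * sx + gw * sw + eps
            ratio = (noisy_psd + eps) / model
            # Itakura-Saito distance (assumed selection criterion).
            dist = np.mean(ratio - np.log(ratio) - 1.0)
            if best is None or dist < best[0]:
                best = (dist, gx * sx, gw * sw)
    _, px, pw = best
    return px / (px + pw + eps)  # Wiener filter: speech PSD over noisy PSD
```

As a usage example, with toy codebooks `speech_cb = np.array([[1.0, -0.9], [1.0, 0.5]])` and `noise_cb = np.array([[1.0, 0.0], [1.0, -0.3]])`, calling `codebook_wiener(noisy_psd, speech_cb, noise_cb)` on a 65-bin noisy power spectrum returns a 65-element gain vector between 0 and 1 that would be multiplied with the noisy spectrum of that frame. Because the gains are refitted for every frame, the sketch mirrors the frame-by-frame gain computation that the abstract credits for handling nonstationary noise.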

While papers A and B describe single-channel techniques, paper C describes a multi-channel Bayesian speech enhancement approach, where, in addition to temporal processing, the spatial diversity provided by multiple microphones is also exploited. In paper D, we introduce a multi-channel noise reduction technique motivated by blind source separation (BSS) concepts. In contrast to standard BSS approaches, we use the knowledge that one of the signals is speech and that the other is noise, and exploit their different characteristics.

Place, publisher, year, edition, pages
Stockholm: KTH, 2005. xii, 61 p.
Series
Trita-S3-SIP, ISSN 1652-4500 ; 2005:1
Keyword
speech enhancement, noise reduction, linear predictive coefficients, autoregressive, codebooks, maximum-likelihood, Bayesian, nonstationary noise, blind source separation
National Category
Telecommunications
Identifiers
urn:nbn:se:kth:diva-456 (URN)
91-628-6643-5 (ISBN)
Public defence
2005-10-28, Sal B2, Brinellvägen 23, Stockholm, 09:00
Opponent
Supervisors
Note
QC 20100929
Available from: 2005-10-20 Created: 2005-10-20 Last updated: 2010-09-29 Bibliographically approved

Open Access in DiVA

No full text

By author/editor
Srinivasan, Sriram; Kleijn, Bastiaan