On causal algorithms for speech enhancement
KTH, School of Electrical Engineering (EES), Sound and Image Processing.
2006 (English). In: IEEE Transactions on Speech and Audio Processing, ISSN 1558-7916, Vol. 14, 764-773 p. Article in journal (Refereed). Published.
Abstract [en]

Kalman filtering is a powerful technique for estimating a signal observed in noise, and it can be used to enhance speech observed in the presence of acoustic background noise. In a speech communication system, the speech signal is typically buffered for a period of 10-40 ms, so either a causal or a noncausal filter can be used. We show that the causal Kalman algorithm is in conflict with the basic properties of human perception, and we address the problem of improving its perceptual quality. We discuss two approaches to improving perceptual performance. The first is based on a new method that combines the causal Kalman algorithm with pre- and postfiltering to introduce perceptual shaping of the residual noise. The second is based on the conventional Kalman smoother. We show that a short lag removes the conflict resulting from the causality constraint, and we quantify the minimum lag required for this purpose. The results of our objective and subjective evaluations confirm that both approaches significantly outperform the conventional causal implementation. Of the two approaches, the Kalman smoother performs better if the signal statistics are precisely known; if they are not, the perceptually weighted Kalman filter performs better.
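
The contrast between the causal filter and a short-lag smoother can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a synthetic autoregressive (AR) signal in white noise with known statistics, and it obtains the fixed-lag estimate by the standard device of keeping delayed samples in the state vector. The helper names (`kalman_ar`, `simulate`, `mse`) and all parameter values (AR coefficients, noise variances, a lag of 5 samples) are hypothetical and chosen only for illustration; no perceptual weighting is included.

```python
import random

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def kalman_ar(y, ar, q, r, lag):
    """Kalman filter for an AR signal observed in white noise.

    State is [s(n), s(n-1), ..., s(n-d+1)], so after each update
    x[0] is the causal estimate of s(n) and x[lag] is the
    fixed-lag estimate of s(n-lag), both given y[0..n].
    """
    d = max(len(ar), lag + 1)
    F = [[0.0] * d for _ in range(d)]
    F[0][:len(ar)] = ar              # first row holds the AR coefficients
    for i in range(1, d):
        F[i][i - 1] = 1.0            # remaining rows shift the delay line
    x = [0.0] * d
    P = [[10.0 * q if i == j else 0.0 for j in range(d)] for i in range(d)]
    filtered, smoothed = [], []
    for obs in y:
        # Prediction step.
        x = [sum(f * v for f, v in zip(row, x)) for row in F]
        P = matmul(matmul(F, P), transpose(F))
        P[0][0] += q
        # Update step; the observation is x[0] plus noise of variance r.
        innov_var = P[0][0] + r
        gain = [P[i][0] / innov_var for i in range(d)]
        innov = obs - x[0]
        x = [xi + g * innov for xi, g in zip(x, gain)]
        row0 = P[0][:]
        P = [[P[i][j] - gain[i] * row0[j] for j in range(d)] for i in range(d)]
        filtered.append(x[0])
        smoothed.append(x[lag])
    return filtered, smoothed

def simulate(n, ar, q, r, seed=0):
    """Generate a clean AR signal and its noisy observation."""
    rng = random.Random(seed)
    clean, hist = [], [0.0] * len(ar)
    for _ in range(n):
        val = sum(a * h for a, h in zip(ar, hist)) + rng.gauss(0.0, q ** 0.5)
        clean.append(val)
        hist = [val] + hist[:-1]
    noisy = [v + rng.gauss(0.0, r ** 0.5) for v in clean]
    return clean, noisy

def mse(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)

# Hypothetical parameters: a resonant AR(2) signal at low SNR.
AR, Q, R, LAG = [1.6, -0.8], 1.0, 10.0, 5
clean, noisy = simulate(3000, AR, Q, R)
filt, smooth = kalman_ar(noisy, AR, Q, R, LAG)

# smooth[n] estimates clean[n - LAG], so align both on the same samples.
mse_noisy = mse(clean, noisy)
mse_filt = mse(clean[:-LAG], filt[:-LAG])
mse_smooth = mse(clean[:-LAG], smooth[LAG:])
print(f"noisy {mse_noisy:.2f}  causal {mse_filt:.2f}  lag-{LAG} {mse_smooth:.2f}")
```

The sketch only compares mean squared error under exactly known signal statistics, which is the setting where the abstract reports the smoother to perform best; it says nothing about perceptual quality or about robustness to model mismatch.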

Place, publisher, year, edition, pages
2006. Vol. 14, 764-773 p.
Keyword [en]
autoregressive (AR) model, causal filter, Kalman filter, Kalman smoother, optimal lag, speech enhancement
National Category
Telecommunications
Identifiers
URN: urn:nbn:se:kth:diva-5945
DOI: 10.1109/TSA.2005.857802
ISI: 000237140500004
Scopus ID: 2-s2.0-34047265321
OAI: oai:DiVA.org:kth-5945
DiVA: diva2:10487
Note
QC 20100824
Available from: 2006-06-02. Created: 2006-06-02. Last updated: 2011-08-25. Bibliographically approved.
In thesis
1. Human perception in speech processing
2006 (English). Doctoral thesis, comprehensive summary (Other scientific)
Abstract [en]

The emergence of heterogeneous networks and the rapid increase of Voice over IP (VoIP) applications provide important opportunities for the telecommunications market. These opportunities come at the price of increased complexity in the monitoring of the quality of service (QoS) and the need for adaptation of transmission systems to the changing environmental conditions. This thesis contains three papers concerned with quality assessment and enhancement of speech communication systems in adverse environments.

In paper A, we introduce a low-complexity, non-intrusive algorithm for monitoring speech quality over the network. In the proposed algorithm, speech quality is predicted from a set of features that capture important structural information from the speech signal.

Papers B and C describe improvements to conventional pre- and post-processing speech enhancement techniques. In paper B, we demonstrate that the causal Kalman filter implementation is in conflict with key properties of human perception and propose solutions to the problem. In paper C, we propose adapting the conventional postfilter parameters to changes in the noise conditions. A perceptually motivated distortion measure is used in the optimization of the postfilter parameters. A significant improvement over the nonadaptive system is obtained.

Place, publisher, year, edition, pages
Stockholm: KTH, 2006
Series
Trita-EE, ISSN 1653-5146 ; 2006:016
Keyword
quality assessment, speech enhancement, postfilter
National Category
Telecommunications
Identifiers
URN: urn:nbn:se:kth:diva-4032
ISBN: 91-628-6864-0
Public defence
2006-06-15, E2, Lindstedtsvägen 3, 09:00
Note
QC 20100824
Available from: 2006-06-02. Created: 2006-06-02. Last updated: 2010-08-24. Bibliographically approved.
By author/editor
Grancharov, Volodya; Samuelsson, Jonas; Kleijn, Bastiaan