Human perception in speech processing
KTH, School of Electrical Engineering (EES).
2006 (English). Doctoral thesis, comprehensive summary (Other scientific)
Abstract [en]

The emergence of heterogeneous networks and the rapid increase of Voice over IP (VoIP) applications provide important opportunities for the telecommunications market. These opportunities come at the price of increased complexity in the monitoring of the quality of service (QoS) and the need for adaptation of transmission systems to the changing environmental conditions. This thesis contains three papers concerned with quality assessment and enhancement of speech communication systems in adverse environments.

In paper A, we introduce a low-complexity, non-intrusive algorithm for monitoring speech quality over the network. In the proposed algorithm, speech quality is predicted from a set of features that capture important structural information from the speech signal.

Papers B and C describe improvements to conventional pre- and post-processing speech enhancement techniques. In paper B, we demonstrate that the causal Kalman filter implementation is in conflict with key properties of human perception and propose solutions to the problem. In paper C, we propose adapting the conventional postfilter parameters to changes in the noise conditions. A perceptually motivated distortion measure is used in the optimization of the postfilter parameters. A significant improvement over the nonadaptive system is obtained.

Place, publisher, year, edition, pages
Stockholm: KTH, 2006.
Series
Trita-EE, ISSN 1653-5146 ; 2006:016
Keyword [en]
quality assessment, speech enhancement, postfilter
National Category
Telecommunications
Identifiers
URN: urn:nbn:se:kth:diva-4032
ISBN: 91-628-6864-0 (print)
OAI: oai:DiVA.org:kth-4032
DiVA: diva2:10490
Public defence
2006-06-15, E2, Lindstedtsvägen 3, 09:00
Opponent
Supervisors
Note
QC 20100824
Available from: 2006-06-02. Created: 2006-06-02. Last updated: 2010-08-24. Bibliographically approved
List of papers
1. On causal algorithms for speech enhancement
2006 (English). In: IEEE Transactions on Speech and Audio Processing, ISSN 1558-7916, Vol. 14, 764-773 p. Article in journal (Refereed), Published
Abstract [en]

Kalman filtering is a powerful technique for estimating a signal observed in noise, and it can be used to enhance speech observed in the presence of acoustic background noise. In a speech communication system, the speech signal is typically buffered for a period of 10-40 ms and, therefore, the use of either a causal or a noncausal filter is possible. We show that the causal Kalman algorithm is in conflict with the basic properties of human perception and address the problem of improving its perceptual quality. We discuss two approaches to improve perceptual performance. The first is based on a new method that combines the causal Kalman algorithm with pre- and postfiltering to introduce perceptual shaping of the residual noise. The second is based on the conventional Kalman smoother. We show that a short lag removes the conflict resulting from the causality constraint and we quantify the minimum lag required for this purpose. The results of our objective and subjective evaluations confirm that both approaches significantly outperform the conventional causal implementation. Of the two approaches, the Kalman smoother performs better if the signal statistics are precisely known; if this is not the case, the perceptually weighted Kalman filter performs better.
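
The difference between a causal Kalman filter and a smoother with a short lag can be illustrated on a toy problem. The sketch below is not the algorithm from the paper: it assumes a scalar first-order autoregressive signal in white noise, and the function names and the values of `a`, `q`, `r`, and `lag` are illustrative assumptions. The smoothed estimate at time t is obtained by running the standard Rauch-Tung-Striebel backward recursion over only the next `lag` samples.

```python
import random


def simulate(n, a, q, r, seed=0):
    """Toy data (assumption): x[t] = a*x[t-1] + w, y[t] = x[t] + v."""
    rng = random.Random(seed)
    state = rng.gauss(0.0, (q / (1.0 - a * a)) ** 0.5)  # stationary start
    x, y = [], []
    for _ in range(n):
        state = a * state + rng.gauss(0.0, q ** 0.5)
        x.append(state)
        y.append(state + rng.gauss(0.0, r ** 0.5))
    return x, y


def kalman_filter(y, a, q, r):
    """Causal scalar Kalman filter; returns filtered and predicted moments."""
    xf, pf, xp, pp = [], [], [], []
    x_prev, p_prev = 0.0, q / (1.0 - a * a)  # stationary prior, needs |a| < 1
    for obs in y:
        x_pred = a * x_prev                  # time update
        p_pred = a * a * p_prev + q
        gain = p_pred / (p_pred + r)         # measurement update
        x_prev = x_pred + gain * (obs - x_pred)
        p_prev = (1.0 - gain) * p_pred
        xp.append(x_pred)
        pp.append(p_pred)
        xf.append(x_prev)
        pf.append(p_prev)
    return xf, pf, xp, pp


def fixed_lag_smooth(y, a, q, r, lag):
    """Estimate x[t] from y[0..t+lag] via a truncated backward recursion."""
    xf, pf, xp, pp = kalman_filter(y, a, q, r)
    n = len(y)
    out = []
    for t in range(n):
        end = min(t + lag, n - 1)
        s = xf[end]                          # start from the filtered value
        for k in range(end - 1, t - 1, -1):  # step back from end-1 to t
            c = pf[k] * a / pp[k + 1]        # smoother gain
            s = xf[k] + c * (s - xp[k + 1])
        out.append(s)
    return out


def mse(u, v):
    return sum((p - q) ** 2 for p, q in zip(u, v)) / len(u)


if __name__ == "__main__":
    a, q, r = 0.9, 1.0, 1.0                  # illustrative values only
    x, y = simulate(2000, a, q, r, seed=1)
    for lag in (0, 1, 5):
        print(lag, mse(fixed_lag_smooth(y, a, q, r, lag), x))
```

With `lag=0` the function reduces to the causal filter. The sketch measures plain mean squared error only; the perceptual criteria and the minimum-lag analysis discussed in the abstract are outside its scope.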

Keyword
autoregressive (AR) model, causal filter, Kalman filter, Kalman smoother, optimal lag, speech enhancement
National Category
Telecommunications
Identifiers
urn:nbn:se:kth:diva-5945 (URN); 10.1109/TSA.2005.857802 (DOI); 000237140500004 (); 2-s2.0-34047265321 (Scopus ID)
Note
QC 20100824
Available from: 2006-06-02. Created: 2006-06-02. Last updated: 2011-08-25. Bibliographically approved
2. Low-complexity, non-intrusive speech quality assessment
2006 (English). In: IEEE Transactions on Speech and Audio Processing, ISSN 1558-7916, Vol. 14, no 6, 1948-1956 p. Article in journal (Refereed), Published
Abstract [en]

Monitoring of speech quality in emerging heterogeneous networks is of great interest to network operators. The most efficient way to satisfy such a need is through nonintrusive, objective speech quality assessment. In this paper, we describe a low-complexity algorithm for monitoring the speech quality over a network. The features used in the proposed algorithm can be computed from commonly used speech-coding parameters. Reconstruction and perceptual transformation of the signal are not performed. The critical advantage of the approach lies in generating quality assessment ratings without explicit distortion modeling. The experimental results indicate that the proposed nonintrusive objective quality measure performs better than the ITU-T P.563 standard.

Keyword
nonintrusive; quality assessment; quality of service (QoS)
National Category
Telecommunications
Identifiers
urn:nbn:se:kth:diva-5946 (URN); 10.1109/TASL.2006.883250 (DOI); 000241567200007 (); 2-s2.0-37549009647 (Scopus ID)
Note
QC 20100824
Available from: 2006-06-02. Created: 2006-06-02. Last updated: 2010-12-06. Bibliographically approved
3. Generalized postfilter for speech quality enhancement
2008 (English). In: IEEE Transactions on Audio, Speech and Language Processing, ISSN 1558-7916, Vol. 16, no 1, 57-64 p. Article in journal (Refereed), Published
Abstract [en]

Postfilters are commonly used in speech coding for the attenuation of quantization noise. In the presence of acoustic background noise or distortion due to tandeming operations, the postfilter parameters are not adjusted and the performance is, therefore, not optimal. We propose a modification that consists of replacing the nonadaptive postfilter parameters with parameters that adapt to variations in spectral flatness, obtained from the noisy speech. This generalization of the postfiltering concept can handle a larger range of noise conditions, but has the same computational complexity and memory requirements as the conventional postfilter. Test results indicate that the presented algorithm improves on the standard postfilter, as well as on the combination of a noise attenuation preprocessor and the conventional postfilter.
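
Spectral flatness, the quantity the adaptive parameters are said to track, is commonly defined as the ratio of the geometric mean to the arithmetic mean of the power spectrum. The sketch below computes that standard measure for a single frame using a direct DFT. The frame length, the small `floor` constant, and the function names are assumptions made for illustration. The rule that maps flatness to postfilter parameters is not given in the abstract and is deliberately left out.

```python
import cmath
import math
import random


def power_spectrum(frame):
    """One-sided power spectrum via a direct DFT (fine for short frames)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_flatness(frame, floor=1e-12):
    """Geometric mean over arithmetic mean of the power spectrum.

    Values near 1 indicate a flat, noise-like spectrum; values near 0
    indicate a peaky, tonal spectrum. `floor` (an assumption) avoids log(0).
    """
    p = [v + floor for v in power_spectrum(frame)]
    log_mean = sum(math.log(v) for v in p) / len(p)
    return math.exp(log_mean) / (sum(p) / len(p))


if __name__ == "__main__":
    n = 256                                   # illustrative frame length
    rng = random.Random(0)
    noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
    tone = [math.sin(2 * math.pi * 8 * t / n) for t in range(n)]
    print("noise:", spectral_flatness(noise))
    print("tone: ", spectral_flatness(tone))
```

A noise-like frame yields a much higher flatness than a pure tone, which is the kind of variation an adaptive scheme could respond to.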

Keyword
Additive noise; Distortion measure; Multiplicative noise; Noise reduction; Perceptually optimal processing; Postfilter; Speech coding; Speech enhancement; Tandeming
National Category
Telecommunications
Identifiers
urn:nbn:se:kth:diva-5947 (URN); 10.1109/TASL.2007.909327 (DOI); 000251947000006 (); 2-s2.0-64849092071 (Scopus ID)
Note
QC 20100824
Available from: 2006-06-02. Created: 2006-06-02. Last updated: 2011-08-25. Bibliographically approved

Open Access in DiVA

fulltext (777 kB), 2338 downloads
File information
File name: FULLTEXT01.pdf
File size: 777 kB
Checksum MD5:
370eea2823fd4007cc914571a2b2f5bdd01bc77188f770b008c99d508b07923297d300b7
Type: fulltext
Mimetype: application/pdf

Author
Grancharov, Volodya
