1 - 11 of 11
  • 1.
    Grancharov, Volodya
    et al.
    Zhao, David
    KTH, Skolan för elektro- och systemteknik (EES), Ljud- och bildbehandling.
    Lindblom, Jonas
    KTH, Skolan för elektro- och systemteknik (EES), Ljud- och bildbehandling.
    Kleijn, Bastiaan
    KTH, Skolan för elektro- och systemteknik (EES).
    Low-complexity, non-intrusive speech quality assessment
    2006. In: IEEE Transactions on Speech and Audio Processing, ISSN 1558-7916, Vol. 14, no. 6, pp. 1948-1956. Journal article (Peer-reviewed)
    Abstract [en]

    Monitoring of speech quality in emerging heterogeneous networks is of great interest to network operators. The most efficient way to satisfy such a need is through nonintrusive, objective speech quality assessment. In this paper, we describe a low-complexity algorithm for monitoring the speech quality over a network. The features used in the proposed algorithm can be computed from commonly used speech-coding parameters. Reconstruction and perceptual transformation of the signal are not performed. The critical advantage of the approach lies in generating quality assessment ratings without explicit distortion modeling. The results from the performed experiments indicate that the proposed nonintrusive objective quality measure performs better than the ITU-T P.563 standard.

  • 2.
    Grancharov, Volodya
    et al.
    KTH, Skolan för elektro- och systemteknik (EES), Ljud- och bildbehandling.
    Zhao, David Yuheng
    KTH, Skolan för elektro- och systemteknik (EES), Ljud- och bildbehandling.
    Lindblom, Jonas
    KTH, Skolan för elektro- och systemteknik (EES), Ljud- och bildbehandling.
    Kleijn, W. Bastiaan
    KTH, Skolan för elektro- och systemteknik (EES), Ljud- och bildbehandling.
    Non-Intrusive Speech Quality Assessment with Low Computational Complexity
    2006. In: INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, BAIXAS: ISCA-INST SPEECH COMMUNICATION ASSOC, 2006, pp. 189-192. Conference paper (Peer-reviewed)
    Abstract [en]

    We describe an algorithm with very low computational and memory requirements for monitoring subjective speech quality without access to the original signal. The features used in the proposed algorithm can be computed from commonly used speech-coding parameters. Reconstruction and perceptual transformation of the signal are not performed. The algorithm generates quality assessment ratings without explicit distortion modeling. The simulation results indicate that the proposed non-intrusive objective quality measure performs better than the ITU-T P.563 standard despite its very low computational complexity.

  • 3.
    Plasberg, Jan H.
    et al.
    KTH, Skolan för elektro- och systemteknik (EES), Signalbehandling.
    Zhao, D. Y.
    KTH, Skolan för elektro- och systemteknik (EES), Signalbehandling.
    Kleijn, W. B.
    KTH, Skolan för elektro- och systemteknik (EES), Signalbehandling.
    The sensitivity matrix for a spectro-temporal auditory model
    2015. In: European Signal Processing Conference, European Signal Processing Conference, EUSIPCO, 2015, pp. 1673-1676. Conference paper (Peer-reviewed)
    Abstract [en]

    Perceptually optimal processing of speech and audio signals demands distortion measures that are based on sophisticated auditory models. High-rate theory can simplify these models by means of a sensitivity matrix. We present a method to derive the sensitivity matrix for distortion measures based on spectro-temporal auditory models under the assumption of small errors. The method is applied to an example auditory model, and we discuss the region of validity of the approximation as well as a way to analyze the characteristics of the model with subspace methods.

  • 4.
    Zhao, David Y
    et al.
    KTH, Tidigare Institutioner (före 2005), Signaler, sensorer och system.
    Kleijn, W Bastiaan
    KTH, Tidigare Institutioner (före 2005), Signaler, sensorer och system.
    Multiple-description vector quantization using translated lattices with local optimization
    2004. In: GLOBECOM: IEEE Global Telecommunications Conference, NEW YORK: IEEE, 2004, pp. 41-45. Conference paper (Peer-reviewed)
    Abstract [en]

    Multiple-description coding is a joint source- and channel coding technique suitable for real-time multimedia transmission over erasure channels. This work improves the previous methods of multiple-description vector quantization using lattice structured codebooks by introducing translated lattices in the single-description codebooks. The quantizer can easily adapt to the current channel condition, using the locally optimized combined-description codebooks, assuming that channel statistics are available at the encoder. Compared to previous methods, the central distortion is greatly reduced for noisy channels, without a significant effect on complexity.

  • 5.
    Zhao, David Y.
    et al.
    KTH, Skolan för elektro- och systemteknik (EES), Ljud- och bildbehandling.
    Samuelsson, Jonas
    Nilsson, Mattias
    KTH, Skolan för elektro- och systemteknik (EES), Ljud- och bildbehandling.
    GMM-based entropy-constrained vector quantization
    2007. In: 2007 IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol IV, Pts 1-3, 2007, pp. 1097-1100. Conference paper (Peer-reviewed)
    Abstract [en]

    In this paper, we present a scalable entropy-constrained vector quantizer based on Gaussian mixture models (GMMs), lattice quantization, and arithmetic coding. We assume that the source has a probability density function of a GMM. The scheme is based on a mixture component classifier and the Karhunen-Loeve transform of the component, followed by lattice quantization. The scalar elements of the quantized vector are entropy coded using a specially designed arithmetic coder. The proposed scheme has a computational complexity that is independent of rate, and quadratic with respect to vector dimension. The design is flexible and allows for adjusting the desired target rate on-the-fly. We evaluated the performance of the proposed scheme on speech-derived source vectors. It was demonstrated that the proposed scheme outperforms a fixed-rate GMM-based vector quantizer, and performs close to the theoretical optimum.

  • 6.
    Zhao, David Yuheng
    KTH, Skolan för elektro- och systemteknik (EES), Ljud- och bildbehandling.
    Model Based Speech Enhancement and Coding
    2007. Doctoral thesis, comprising articles (Other academic)
    Abstract [en]

    In mobile speech communication, adverse conditions, such as noisy acoustic environments and unreliable network connections, may severely degrade the intelligibility and naturalness of the received speech quality, and increase the listening effort. This thesis focuses on countermeasures based on statistical signal processing techniques. The main body of the thesis consists of three research articles, targeting two specific problems: speech enhancement for noise reduction and flexible source coder design for unreliable networks.

    Papers A and B consider speech enhancement for noise reduction. New schemes based on an extension to the auto-regressive (AR) hidden Markov model (HMM) for speech and noise are proposed. Stochastic models for speech and noise gains (excitation variance from an AR model) are integrated into the HMM framework in order to improve the modeling of energy variation. The extended model is referred to as a stochastic-gain hidden Markov model (SG-HMM). The speech gain describes the energy variations of the speech phones, typically due to differences in pronunciation and/or different vocalizations of individual speakers. The noise gain improves the tracking of the time-varying energy of non-stationary noise, e.g., due to movement of the noise source. In Paper A, it is assumed that prior knowledge on the noise environment is available, so that a pre-trained noise model is used. In Paper B, the noise model is adaptive and the model parameters are estimated on-line from the noisy observations using a recursive estimation algorithm. Based on the speech and noise models, a novel Bayesian estimator of the clean speech is developed in Paper A, and an estimator of the noise power spectral density (PSD) in Paper B. It is demonstrated that the proposed schemes achieve more accurate models of speech and noise than traditional techniques, and as part of a speech enhancement system provide improved speech quality, particularly for non-stationary noise sources.

    In Paper C, a flexible entropy-constrained vector quantization scheme based on a Gaussian mixture model (GMM), lattice quantization, and arithmetic coding is proposed. The method allows for changing the average rate in real-time, and facilitates adaptation to the currently available bandwidth of the network. A practical solution to the classical issue of indexing and entropy-coding the quantized code vectors is given. The proposed scheme has a computational complexity that is independent of rate, and quadratic with respect to vector dimension. Hence, the scheme can be applied to the quantization of source vectors in a high dimensional space. The theoretical performance of the scheme is analyzed under a high-rate assumption. It is shown that, at high rate, the scheme approaches the theoretically optimal performance, if the mixture components are located far apart. The practical performance of the scheme is confirmed through simulations on both synthetic and speech-derived source vectors.

  • 7.
    Zhao, David Yuheng
    et al.
    KTH, Skolan för elektro- och systemteknik (EES), Ljud- och bildbehandling.
    Kleijn, Bastiaan
    KTH, Skolan för elektro- och systemteknik (EES), Ljud- och bildbehandling.
    HMM-based gain-modeling for enhancement of speech in noise
    2007. In: IEEE transactions on speech and audio processing, ISSN 1063-6676, E-ISSN 1558-2353, Vol. 15, no. 3, pp. 882-892. Journal article (Peer-reviewed)
    Abstract [en]

    Accurate modeling and estimation of speech and noise gains facilitate good performance of speech enhancement methods using data-driven prior models. In this paper, we propose a hidden Markov model (HMM)-based speech enhancement method using explicit gain modeling. Through the introduction of stochastic gain variables, energy variation in both speech and noise is explicitly modeled in a unified framework. The speech gain models the energy variations of the speech phones, typically due to differences in pronunciation and/or different vocalizations of individual speakers. The noise gain helps to improve the tracking of the time-varying energy of nonstationary noise. The expectation-maximization (EM) algorithm is used to perform offline estimation of the time-invariant model parameters. The time-varying model parameters are estimated online using the recursive EM algorithm. The proposed gain modeling techniques are applied to a novel Bayesian speech estimator, and the performance of the proposed enhancement method is evaluated through objective and subjective tests. The experimental results confirm the advantage of explicit gain modeling, particularly for nonstationary noise sources.

  • 8.
    Zhao, David Yuheng
    et al.
    KTH, Skolan för elektro- och systemteknik (EES), Ljud- och bildbehandling.
    Kleijn, Bastiaan
    KTH, Skolan för elektro- och systemteknik (EES), Ljud- och bildbehandling.
    Ypma, Alexander
    GN ReSound AS, Algorithm R&D, Eindhoven.
    de Vries, Bert
    GN ReSound AS, Algorithm R&D, Eindhoven.
    Online noise estimation using stochastic-gain HMM for speech enhancement
    2008. In: IEEE transactions on speech and audio processing, ISSN 1063-6676, E-ISSN 1558-2353, Vol. 16, no. 4, pp. 835-846. Journal article (Peer-reviewed)
    Abstract [en]

    We propose a noise estimation algorithm for single-channel noise suppression in dynamic noisy environments. A stochastic-gain hidden Markov model (SG-HMM) is used to model the statistics of nonstationary noise with time-varying energy. The noise model is adaptive and the model parameters are estimated online from noisy observations using a recursive estimation algorithm. The parameter estimation is derived for the maximum-likelihood criterion and the algorithm is based on the recursive expectation maximization (EM) framework. The proposed method facilitates continuous adaptation to changes of both noise spectral shapes and noise energy levels, e.g., due to movement of the noise source. Using the estimated noise model, we also develop an estimator of the noise power spectral density (PSD) based on recursive averaging of estimated noise sample spectra. We demonstrate that the proposed scheme achieves more accurate estimates of the noise model and noise PSD, and as part of a speech enhancement system facilitates a lower level of residual noise.
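
The recursive averaging of noise sample spectra mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' estimator: the first-order smoothing form, the function name `recursive_psd`, and the smoothing factor `alpha` are assumptions made here for illustration, and the per-frame noise spectra are taken as given rather than estimated from an SG-HMM.

```python
import numpy as np

def recursive_psd(noise_spectra, alpha=0.9):
    """First-order recursive average of per-frame noise periodograms.

    noise_spectra: iterable of complex spectra, one array per frame
                   (assumed to be already-estimated noise sample spectra).
    alpha:         hypothetical smoothing factor in [0, 1); larger values
                   give slower, smoother tracking.
    Returns an array with one smoothed PSD estimate per frame.
    """
    psd = None
    history = []
    for spectrum in noise_spectra:
        periodogram = np.abs(spectrum) ** 2
        if psd is None:
            psd = periodogram  # initialise with the first frame
        else:
            psd = alpha * psd + (1.0 - alpha) * periodogram
        history.append(psd)
    return np.array(history)
```

A smaller `alpha` follows changes in noise energy faster at the cost of a noisier estimate, which is the usual trade-off for this kind of smoother.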

  • 9.
    Zhao, David Yuheng
    et al.
    KTH, Skolan för elektro- och systemteknik (EES), Ljud- och bildbehandling.
    Kleijn, W. Bastiaan
    KTH, Skolan för elektro- och systemteknik (EES), Ljud- och bildbehandling.
    HMM-based speech enhancement using explicit gain modeling
    2006. In: 2006 IEEE International Conference on Acoustics, Speech and Signal Processing, 2006, pp. 161-164. Conference paper (Peer-reviewed)
    Abstract [en]

    We propose a hidden Markov model (HMM) based speech enhancement method using explicit modeling of speech and noise gains. The gains are considered to be stochastic variables in an HMM framework. The speech gain models the energy variations of speech phones, typically due to differences in pronunciation and/or different vocalizations of individual speakers. The noise gain helps to improve the tracking of the time-varying energy of non-stationary noise. The time-varying parameters of the gain models are estimated on-line using the recursive expectation maximization (EM) algorithm. The performance of the proposed enhancement system is evaluated through both objective and subjective tests. The experimental results confirm the advantage of explicit gain modeling, particularly for non-stationary noise sources.

  • 10.
    Zhao, David Yuheng
    et al.
    KTH, Skolan för elektro- och systemteknik (EES), Ljud- och bildbehandling.
    Kleijn, W. Bastiaan
    KTH, Skolan för elektro- och systemteknik (EES), Ljud- och bildbehandling.
    On noise gain estimation for HMM-based speech enhancement
    2005. In: 9th European Conference on Speech Communication and Technology, 2005, pp. 2113-2116. Conference paper (Peer-reviewed)
    Abstract [en]

    To address the variation of noise level in non-stationary noise signals, we study the noise gain estimation for speech enhancement using hidden Markov models (HMM). We consider the noise gain as a stochastic process and we approximate the probability density function (PDF) to be log-normal distributed. The PDF parameters are estimated for every signal block using the past noisy signal blocks. The approximated PDF is then used in a Bayesian speech estimator minimizing the Bayes risk for a novel cost function that allows for an adjustable level of residual noise. As a more computationally efficient alternative, we also derive the maximum likelihood (ML) estimator, assuming the noise gain to be a deterministic parameter. The performance of the proposed gain-adaptive methods is evaluated and compared to two reference methods. The experimental results show significant improvement under noise conditions with time-varying noise energy.

  • 11.
    Zhao, David Yuheng
    et al.
    KTH, Skolan för elektro- och systemteknik (EES), Ljud- och bildbehandling.
    Samuelsson, Jonas
    Dolby Labs, Stockholm.
    Nilsson, Mattias
    KTH, Skolan för elektro- och systemteknik (EES), Ljud- och bildbehandling.
    On entropy-constrained vector quantization using Gaussian mixture models
    2008. In: IEEE Transactions on Communications, ISSN 0090-6778, E-ISSN 1558-0857, Vol. 56, no. 12, pp. 2094-2104. Journal article (Peer-reviewed)
    Abstract [en]

    A flexible and low-complexity entropy-constrained vector quantizer (ECVQ) scheme based on Gaussian mixture models (GMMs), lattice quantization, and arithmetic coding is presented. The source is assumed to have a probability density function of a GMM. An input vector is first classified to one of the mixture components, and the Karhunen-Loeve transform of the selected mixture component is applied to the vector, followed by quantization using a lattice structured codebook. Finally, the scalar elements of the quantized vector are entropy coded sequentially using a specially designed arithmetic coder. The computational complexity of the proposed scheme is low, and independent of the coding rate in both the encoder and the decoder. Therefore, the proposed scheme serves as a lower complexity alternative to the GMM based ECVQ proposed by Gardner, Subramaniam and Rao [1]. The performance of the proposed scheme is analyzed under a high-rate assumption, and quantified for a given GMM. The practical performance of the scheme was evaluated through simulations on both synthetic and speech line spectral frequency (LSF) vectors. For LSF quantization, the proposed scheme has a comparable performance to [1] at rates relevant for speech coding (20-28 bits per vector) with lower computational complexity.
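
The encoding chain described in this abstract (classify, transform, lattice-quantize) can be sketched roughly as follows. This is an illustration under stated assumptions, not the published scheme: the classifier here picks the component with the largest weighted Gaussian log-likelihood, the lattice is a scaled integer lattice with a hypothetical step size `step`, and the arithmetic coding stage is omitted. All function names are invented for this example.

```python
import numpy as np

def klt_basis(cov):
    """KLT basis of a component: eigenvectors of its covariance (as columns)."""
    _, vecs = np.linalg.eigh(cov)
    return vecs

def classify(x, weights, means, covs):
    """Index of the component with the largest weighted Gaussian log-likelihood."""
    scores = []
    for w, m, c in zip(weights, means, covs):
        d = x - m
        _, logdet = np.linalg.slogdet(c)
        scores.append(np.log(w) - 0.5 * (logdet + d @ np.linalg.solve(c, d)))
    return int(np.argmax(scores))

def encode(x, weights, means, covs, step):
    """Classify, apply the component's KLT, then round onto a scaled integer lattice."""
    k = classify(x, weights, means, covs)
    y = klt_basis(covs[k]).T @ (x - means[k])
    idx = np.rint(y / step).astype(int)  # integer lattice indices (entropy coding omitted)
    return k, idx

def decode(k, idx, means, covs, step):
    """Invert the lattice scaling and the KLT for the signalled component."""
    return means[k] + klt_basis(covs[k]) @ (idx * step)
```

Because the KLT basis is orthonormal, the reconstruction error in this sketch is bounded by half a step per transformed coefficient, and the per-vector work does not depend on the rate implied by `step`.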
