Towards flexible audio coding
KTH, Superseded Departments, Signals, Sensors and Systems.
2004 (English) Doctoral thesis, comprehensive summary (Other scientific)
Abstract [en]

This thesis is about audio coding and improving its flexibility. Audio coding is used to reduce the bit rate needed to represent audio signals in digital format. When the available bit rate is low, parametric coding methods that use models to describe perceptually important features of audio signals have shown their efficiency. We introduce several improvements to sinusoidal coding, a parametric coding method that is based on sinusoidal modeling of audio. Flexibility is important since new applications bring a need for coders that can operate over a large range of possible bit rates and are able to represent different types of audio material. Flexibility can be obtained by combining into one coder a set of subcoders that can efficiently represent different types or features of audio signals, and properly complement each other at different rates. For true flexibility, methods that allow fast coder adaptation should be deployed. We develop methods for rate-distortion optimal real-time design of quantizers in audio coding.

This thesis consists of seven research papers. In paper A, we introduce a signal pre-processing method that facilitates removal of pre-echo artifacts when coding signals containing transients (sharp attacks). Papers B-F are devoted to sinusoidal audio coding. In papers B and C, we present improvements to the matching-pursuit sinusoidal estimation method. In papers D, E, and F, we consider quantization of sinusoidal parameters. We apply high-rate quantization theory to find the asymptotically optimal rate distribution between sinusoids and the corresponding asymptotically optimal quantizers for sinusoidal parameters, such that a perceptual distortion measure is minimized under a given rate constraint. The quantizers are derived analytically, which allows a coder to adapt quickly to changing bit-rate requirements. Paper G is devoted to multistage audio coding, a coding method where subcoders are combined in a cascaded way. We use high-rate theory to develop a flexible analytical framework for the asymptotically optimal rate distribution between subcoders and the design of the corresponding asymptotically optimal quantizers.
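
As a rough illustration of what an analytically derived rate distribution looks like, the sketch below applies the textbook high-rate bit-allocation rule for independent components under a mean-squared-error criterion. This is not the method of the thesis, which minimizes a perceptual distortion measure; the function name `high_rate_allocation` and the example variances are assumptions made purely for illustration.

```python
import math


def high_rate_allocation(variances, total_bits):
    """Textbook high-rate bit allocation (illustrative, MSE-based).

    Each component i receives
        R_i = R/N + 0.5 * log2(var_i / geometric_mean(variances)),
    which minimizes total MSE under the high-rate approximation.
    The result is asymptotic: at low total rates some R_i can turn
    negative, and a practical coder would need to handle that case.
    """
    n = len(variances)
    # log2 of the geometric mean of the component variances
    log_gm = sum(math.log2(v) for v in variances) / n
    mean_rate = total_bits / n
    return [mean_rate + 0.5 * (math.log2(v) - log_gm) for v in variances]


# Hypothetical example: three components with variances 4, 1, and 0.25
rates = high_rate_allocation([4.0, 1.0, 0.25], total_bits=9.0)
```

Because the allocation is a closed-form expression, it can be recomputed immediately when the rate budget changes, which is the kind of fast adaptation the abstract refers to.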

Place, publisher, year, edition, pages
Stockholm: Signaler, sensorer och system, 2004. xiv, 45 p.
Trita-S3-SIP, ISSN 1652-4500 ; 2004:4
Keyword [en]
Electronics, Audio coding, multistage coding, waveform coding, parametric coding, sinusoidal coding
National Category
Other Electrical Engineering, Electronic Engineering, Information Engineering
URN: urn:nbn:se:kth:diva-71 ISBN: 91-628-6161-1 OAI: diva2:14692
Public defence
2004-12-09, Kollegiesalen, Valhallavägen 79, Stockholm, 09:00
Available from: 2004-12-20 Created: 2004-12-20 Last updated: 2012-03-21

By author/editor
Vafin, Renat