In the lossy coding of perceptually relevant signals, such as sound and images, the ultimate goal is to achieve good perceived quality of the reconstructed signal under a constraint on the bit-rate. Conventional methodologies focus either on rate-distortion optimization or on the preservation of signal features. Technologies resulting from these two perspectives are efficient only in high-rate or low-rate scenarios, respectively. In this dissertation, a new objective is proposed: to seek the optimal rate-distortion trade-off under the constraint that the statistical properties of the reconstruction are similar to those of the source.
The new objective leads to a new quantization concept: distribution preserving quantization (DPQ). DPQ preserves the probability distribution of the source by stochastically switching among an ensemble of quantizers. At low rates, DPQ exhibits a synthesis nature, resembling existing coding methods that preserve signal features. Compared with rate-distortion optimized quantization, DPQ trades some rate-distortion performance for perceptual benefits.
The rate-distortion optimization for DPQ facilitates mathematical analysis. The dissertation defines a distribution preserving rate-distortion function (DP-RDF), which serves as a lower bound on the rate of any DPQ method for a given distortion. For a large range of sources and distortion measures, the DP-RDF approaches the classic rate-distortion function with increasing rate. This suggests that, at high rates, an optimal DPQ can approach conventional quantization in terms of rate-distortion characteristics.
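The high-rate convergence can be illustrated numerically for a scalar Gaussian source under the MSE. The sketch below is not taken from the dissertation: it assumes the closed form obtained when source and reconstruction are jointly Gaussian with equal variances, so that the distortion D fixes the correlation coefficient rho = 1 - D/(2*variance) and the rate is the mutual information -0.5*log2(1 - rho^2). The function names are hypothetical.

```python
import math


def classic_rdf(distortion, variance):
    """Classic Gaussian rate-distortion function in bits per sample (MSE)."""
    if distortion >= variance:
        return 0.0
    return 0.5 * math.log2(variance / distortion)


def dp_rdf_gaussian(distortion, variance):
    """Assumed distribution preserving RDF for a Gaussian source (MSE).

    Assumption: source and reconstruction are jointly Gaussian with the
    same variance, so D = 2 * variance * (1 - rho). Requires distortion > 0.
    """
    rho = 1.0 - distortion / (2.0 * variance)
    if rho <= 0.0:
        # An independent draw from the source distribution already meets D.
        return 0.0
    return -0.5 * math.log2(1.0 - rho * rho)


if __name__ == "__main__":
    variance = 1.0
    for distortion in (1.0, 0.1, 0.01, 0.001):
        gap = dp_rdf_gaussian(distortion, variance) - classic_rdf(distortion, variance)
        print(f"D={distortion:<6} rate gap = {gap:.5f} bits")
```

Under this assumption the gap between the two functions equals -0.5*log2(1 - D/(4*variance)), which shrinks toward zero as the distortion decreases, in line with the behavior described above.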
After verifying the perceptual advantages of DPQ with a relatively simple realization, this dissertation focuses on a method called transformation-based DPQ, which is based on dithered quantization and a non-linear transformation. Asymptotically, with increasing dimensionality, a transformation-based DPQ achieves the DP-RDF for i.i.d. Gaussian sources and the mean squared error (MSE).
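A minimal scalar sketch may help make the combination of dithered quantization and a non-linear transformation concrete. It is an illustration under stated assumptions, not the dissertation's scheme: the source is taken to be zero-mean Gaussian with known standard deviation, the quantizer is a uniform scalar quantizer with subtractive dither shared by encoder and decoder, and the transformation maps the dithered reconstruction through its own CDF and then through the inverse CDF of the source. The names `noisy_cdf` and `dpq_scalar` are hypothetical.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def _cdf_antiderivative(t):
    # Antiderivative of the standard normal CDF: t * Phi(t) + phi(t).
    return t * STD_NORMAL.cdf(t) + STD_NORMAL.pdf(t)


def noisy_cdf(z, sigma, step):
    """CDF of X + U with X ~ N(0, sigma^2) and U ~ Uniform(-step/2, step/2)."""
    upper = _cdf_antiderivative((z + step / 2.0) / sigma)
    lower = _cdf_antiderivative((z - step / 2.0) / sigma)
    return sigma / step * (upper - lower)


def dpq_scalar(x, sigma, step, rng):
    """Quantize x with subtractive dither, then restore the source distribution."""
    dither = rng.uniform(-step / 2.0, step / 2.0)
    index = round((x + dither) / step)      # integer index to be entropy coded
    z = index * step - dither               # distributed as x plus uniform noise
    p = noisy_cdf(z, sigma, step)
    p = min(max(p, 1e-12), 1.0 - 1e-12)     # guard against rounding at the tails
    return sigma * STD_NORMAL.inv_cdf(p)    # map back to the N(0, sigma^2) law


if __name__ == "__main__":
    rng = random.Random(0)
    sigma, step = 1.0, 1.0
    xs = [rng.gauss(0.0, sigma) for _ in range(20000)]
    ys = [dpq_scalar(x, sigma, step, rng) for x in xs]
    mean_y = sum(ys) / len(ys)
    var_y = sum((y - mean_y) ** 2 for y in ys) / len(ys)
    mse = sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
    print(f"output variance ~ {var_y:.3f}, MSE ~ {mse:.3f}")
```

Because the dithered reconstruction has a continuous distribution, passing it through its own CDF gives a uniform variable, and the inverse source CDF then gives an output with the source distribution. Each dither value selects one shifted quantizer, which is one way to read the stochastic switching among an ensemble of quantizers.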
This dissertation further proposes a DPQ scheme that asymptotically achieves the DP-RDF for stationary Gaussian processes and the MSE. For practical applications, this scheme can be reduced to dithered quantization with pre- and post-filtering. The simplified scheme preserves the power spectral density (PSD) of the source.
The use of dithered quantization and non-linear transformations to construct DPQ is extended to multiple description coding, which leads to a multiple description DPQ (MD-DPQ) scheme. MD-DPQ preserves the source probability distribution for any packet loss scenario.
The proposed schemes generally require efficient entropy coding. The dissertation also includes an entropy coding algorithm for lossy coding systems, which is referred to as sequential entropy coding of quantization indices with update recursion on probability (SECURE).
The proposed lossy coding methods were evaluated in the context of audio coding. The experimental results confirm the benefits of the methods and thereby the effectiveness of the proposed lossy coding objective.
Stockholm: KTH Royal Institute of Technology, 2011. xiii, 69 p.
Kleijn, W. Bastiaan, Professor