Robust Multimedia Communications over Packet Networks
KTH, School of Electrical Engineering (EES), Sound and Image Processing.
2010 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Multimedia communications over packet networks, and in particular the voice over IP (VoIP) application, have become an integral part of society. However, the unreliable and heterogeneous nature of packet networks has led to a best-effort delivery of services. Delay, limitation of bandwidth, and packet-loss rate all affect the quality of service (QoS). In this thesis, we address two important network impairments in the design of robust multimedia communication systems: packet delay-variation and packet-loss.

Paper A considers the mitigation of the effect of packet delay-variation for audio communications by introducing a buffer at the receiver side. A new adaptive playout scheduling approach is proposed to control the buffering length, or, equivalently, the packet playout deadlines, in response to varying network conditions. A Wiener process is used to model the fluctuation of the buffering length without any playout adjustment. The playout scheduling problem is then reformulated as a stochastic impulse control problem by taking the playout adjustment as the control signal. The proposed approach is shown to be the optimal solution to the new control problem. It is demonstrated experimentally that the proposed approach provides improved perceived conversational quality.

Papers B, C and D address the packet-loss issue. Paper B focuses on the design of a low-complexity packet-loss concealment (PLC) method that is compatible with existing speech codecs for VoIP applications. The new method is rigorously motivated based on the autoregressive (AR) speech model and the minimum mean squared error (MMSE) criterion. The effect of model estimation error on the prediction of the missing speech segment is also considered and an upper bound for the prediction error is derived. Both the theoretical and experimental results provide insight into the performance of the heuristically designed PLC methods. Papers C and D, on the other hand, consider an active packet-loss-resilient coding scheme, namely multiple description coding (MDC). In general, MDC can be used for the transmission of any media data. Paper C derives a simple and accurate approximation of the rate-distortion lower bound of a particular multiple-description scenario and then demonstrates that the performance loss of some practical MD systems can be evaluated easily with the new approximation. Paper D studies the performance limit of a vector Gaussian multiple description scenario. An outer bound to the rate-distortion region is derived, and the outer bound is tight when the problem specializes to the scalar Gaussian case.

 

Place, publisher, year, edition, pages
Stockholm: KTH, 2010, p. xii, 37
Series
Trita-EE, ISSN 1653-5146 ; 2010:036
National Category
Telecommunications
Identifiers
URN: urn:nbn:se:kth:diva-24223
OAI: oai:DiVA.org:kth-24223
DiVA, id: diva2:345510
Public defence
2010-08-30, Salongen, KTHB, Osquars backe 25, Stockholm, 10:00 (English)
Note
QC20100830
Available from: 2010-08-30 Created: 2010-08-25 Last updated: 2022-06-25 Bibliographically approved
List of papers
1. Adaptive Playout Scheduling for Voice over IP: Event-Triggered Control Policy
(English) In: IEEE Multimedia, ISSN 1070-986X, E-ISSN 1941-0166. Article in journal (Other academic), Submitted
Abstract [en]

We study adaptive-playout scheduling for Voice over IP using the framework of stochastic impulse control theory. We use the Wiener process to model the fluctuation of the buffer length in the absence of control. In this context, the control signal consists of length units that correspond to inserting or dropping a pitch cycle. We define an optimality criterion that has an adjustable trade-off between average buffering delay and average control signal (the length of the pitch cycles added plus the length of the pitch cycles dropped), and show that a band control policy is optimal for this criterion. The band control policy maintains the buffer length within a band region by imposing impulse control (inserted or dropped pitch cycles) whenever the bounds of the band are reached. One important property of the band control policy is that it incurs no packet-loss through buffering if there are no out-of-order packet-arrivals. Experiments performed on both synthetic and real network-delay traces show that the proposed playout scheduling algorithm outperforms two recent algorithms in most cases.
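The band control idea described in this abstract can be illustrated with a small simulation. The following is a minimal sketch and not the paper's algorithm: the function name, the discretised Gaussian increments, the parameter values, and the choice of exactly one pitch cycle per impulse are all assumptions made for illustration, and lengths are taken to be in milliseconds.

```python
import random

def simulate_band_control(lower, upper, pitch_cycle, sigma, steps, start, seed=0):
    """Simulate a buffer length that drifts like a discretised Wiener process
    and is kept strictly inside (lower, upper) by impulse control.

    All names and parameters are hypothetical; one impulse is assumed to
    insert or drop exactly one pitch cycle of fixed length.
    """
    if upper - lower <= pitch_cycle:
        raise ValueError("band must be wider than one pitch cycle")
    rng = random.Random(seed)
    length = start
    total_control = 0.0  # length of cycles inserted plus length of cycles dropped
    trace = []
    for _ in range(steps):
        # Uncontrolled fluctuation: Gaussian increment of the Wiener process.
        length += rng.gauss(0.0, sigma)
        # Lower bound reached: insert pitch cycles (stretch the playout).
        while length <= lower:
            length += pitch_cycle
            total_control += pitch_cycle
        # Upper bound reached: drop pitch cycles (compress the playout).
        while length >= upper:
            length -= pitch_cycle
            total_control += pitch_cycle
        trace.append(length)
    return trace, total_control

trace, control = simulate_band_control(
    lower=20.0, upper=60.0, pitch_cycle=5.0, sigma=3.0, steps=5000, start=40.0
)
print(sum(trace) / len(trace), control)
```

Widening the band in this sketch lowers the accumulated control signal but raises the average buffering delay, which is the trade-off the optimality criterion adjusts.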

Keywords
Playout scheduling, jitter buffer, impulse control
National Category
Telecommunications
Identifiers
urn:nbn:se:kth:diva-24218 (URN)
Note
QS 2011 QS 20120326
Available from: 2010-08-25 Created: 2010-08-25 Last updated: 2024-01-18 Bibliographically approved
2. Autoregressive Model-based Speech Packet-Loss Concealment
2008 (English) In: 2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, 2008, p. 4797-4800. Conference paper, Published paper (Refereed)
Abstract [en]

We study packet-loss concealment for speech based on autoregressive modelling using a rigorous minimum mean square error (MMSE) approach. The effect of the model estimation error on predicting the missing segment is studied and an upper bound on the mean square error is derived. Our experiments show that the upper bound is tight when the estimation error is less than the signal variance. We also consider the usage of perceptual weighting on prediction to improve speech quality. A rigorous argument is presented to show that perceptual weighting is not useful in this context. We create simple and practical MMSE-based systems using two signal models: a basic model capturing the short-term correlation and a more sophisticated model that also captures the long-term correlation. Subjective quality comparison tests show that the proposed MMSE-based system provides state-of-the-art performance.
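A short-term AR predictor of the kind mentioned in this abstract can be sketched as follows. This is an illustrative sketch under stated assumptions, not the paper's system: the function names `estimate_ar` and `conceal` are hypothetical, the coefficients are estimated with the standard autocorrelation (Yule-Walker) method, and only the short-term model is covered. For an AR model with known coefficients, the MMSE prediction of the missing samples given the past is obtained by running the AR recursion with the excitation set to zero.

```python
import numpy as np

def estimate_ar(signal, order):
    """Estimate AR coefficients a[1..p] of x[n] = sum_k a[k] x[n-k] + e[n]
    with the autocorrelation (Yule-Walker) method. Hypothetical helper."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(x[: len(x) - k], x[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1..p].
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1 : order + 1])

def conceal(history, coeffs, n_missing):
    """Predict n_missing samples that follow `history` by running the AR
    recursion with zero excitation (the conditional mean under the model)."""
    p = len(coeffs)
    buf = list(history[-p:])
    out = []
    for _ in range(n_missing):
        past = buf[-p:][::-1]  # most recent sample first, matching a[1], a[2], ...
        nxt = float(np.dot(coeffs, past))
        out.append(nxt)
        buf.append(nxt)
    return np.array(out)

# Usage on a synthetic tone standing in for received speech samples.
n = np.arange(20000)
received = np.cos(1.0 * n)
a = estimate_ar(received, order=2)
substitute = conceal(received, a, n_missing=160)
print(a, substitute[:3])
```

In practice the coefficients are estimated from a short window of recently decoded speech, so they carry estimation error; the abstract's upper bound concerns the effect of that error on the prediction.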

Series
International Conference on Acoustics Speech and Signal Processing (ICASSP), ISSN 1520-6149
National Category
Telecommunications
Identifiers
urn:nbn:se:kth:diva-24219 (URN)
10.1109/ICASSP.2008.4518730 (DOI)
000257456703183 ()
2-s2.0-51449119893 (Scopus ID)
Conference
33rd IEEE International Conference on Acoustics, Speech and Signal Processing Las Vegas, NV, MAR 30-APR 04, 2008
Note
QC20100830
Available from: 2010-08-25 Created: 2010-08-25 Last updated: 2024-01-18 Bibliographically approved
3. High-Rate Analysis of Symmetric L-Channel Multiple Description Coding
2011 (English) In: IEEE Transactions on Communications, ISSN 0090-6778, E-ISSN 1558-0857, Vol. 59, no 7, p. 1846-1856. Article in journal (Refereed), Published
Abstract [en]

This paper studies the tight rate-distortion bound for L-channel symmetric multiple-description coding of a vector Gaussian source with two levels of receivers. Each of the first-level receivers obtains κ (κ < L) of the L descriptions. The second-level receiver obtains all L descriptions. We find that when the theory is applied to the scalar Gaussian source, the product of a function of the side distortions (corresponding to the first-level receivers) and the central distortion (corresponding to the second-level receiver) is asymptotically independent of the redundancy among the descriptions. Using this property, we analyze the asymptotic behavior of a practical multiple-description lattice vector quantizer (MDLVQ). Our analysis includes the treatment of the MDLVQ system from a new geometric viewpoint, which results in an expression for the side distortions using the normalized second moment of a sphere of higher dimensionality than the quantization space. The expression of the distortion product derived from the lower bound is then applied as a criterion to assess the performance loss of the considered MDLVQ system. In principle, the efficiency of other practical MD systems can also be evaluated using the derived distortion product.

Keywords
Multiple description coding, high-rate quantization, lattice quantizer
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:kth:diva-24220 (URN)
10.1109/TCOMM.2011.051711.100254 (DOI)
000293659600013 ()
2-s2.0-79960559600 (Scopus ID)
Available from: 2010-08-25 Created: 2010-08-25 Last updated: 2024-03-15 Bibliographically approved
4. Bounding the Rate Region of Vector Gaussian Multiple Descriptions with Individual and Central Receivers
2010 (English) In: Data Compression Conference Proceedings, 2010, p. 13-19. Conference paper, Published paper (Refereed)
Abstract [en]

In this work, we study the rate region of the vector Gaussian multiple description problem with individual and central quadratic distortion constraints. In particular, we derive an outer bound to the rate region of the L-description problem. The bound is obtained by lower bounding a weighted sum rate for each supporting hyperplane of the rate region. The key idea is to introduce at most L-1 auxiliary random variables and further impose upon the variables a Markov structure according to the ordering of the description weights. This makes it possible to greatly simplify the derivation of the outer bound. In the scalar Gaussian case, the complete rate region is fully characterized by showing that the outer bound is tight. In this case, the optimal weighted sum rate for each supporting hyperplane is obtained by solving a single maximization problem. This contrasts with existing results, which require solving a min-max optimization problem.
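The supporting-hyperplane argument in this abstract can be written compactly. The notation below is an assumption introduced for illustration and is not taken from the paper: $\mathcal{R}$ denotes the rate region, $w = (w_1, \dots, w_L)$ a nonnegative weight vector, and $\underline{R}(w)$ any valid lower bound on the weighted sum rate. If such a bound is available for every $w$, the intersection of the resulting half-spaces is an outer bound on the region:

```latex
\mathcal{R} \;\subseteq\; \mathcal{R}_{\mathrm{out}}
  = \bigcap_{w \in \mathbb{R}_{+}^{L}}
    \Bigl\{ (R_1, \dots, R_L) : \sum_{i=1}^{L} w_i R_i \ge \underline{R}(w) \Bigr\},
\qquad
\underline{R}(w) \;\le\; \min_{(R_1, \dots, R_L) \in \mathcal{R}} \sum_{i=1}^{L} w_i R_i .
```

Tightness in this form corresponds to the inequality on the right holding with equality for every weight vector.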

National Category
Telecommunications
Identifiers
urn:nbn:se:kth:diva-24221 (URN)
10.1109/DCC.2010.9 (DOI)
000397228500002 ()
2-s2.0-77952716951 (Scopus ID)
Conference
Data Compression Conference, DCC 2010; Snowbird, UT; 24 March 2010 through 26 March 2010
Note

QC20100830

Available from: 2010-08-25 Created: 2010-08-25 Last updated: 2024-03-15 Bibliographically approved

Open Access in DiVA

fulltext (520 kB), 649 downloads
File information
File name: FULLTEXT01.pdf
File size: 520 kB
Checksum SHA-512: b1a6d193a8a4f262e9121b282ff090a155edfc01fc61f26cc19cc6c61e1419845f8ee2b078fae05e1e8a6fc8130f087419be911c95da160e061b9310584fe989
Type: fulltext
Mimetype: application/pdf

Search in DiVA

By author/editor
Zhang, Guoqiang
By organisation
Sound and Image Processing
Telecommunications
