Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
File Updates Under Random/Arbitrary Insertions and Deletions
KTH, School of Electrical Engineering (EES), Information Science and Engineering.
Show others and affiliations
2017 (English)In: IEEE Transactions on Information Theory, ISSN 0018-9448, E-ISSN 1557-9654, Vol. 63, no 10, 6487-6513 p.Article in journal (Refereed) Published
Abstract [en]

The problem of one-way file synchronization, henceforth called "file updates", is studied in this paper. Specifically, a client edits a file, where the edits are modeled by insertions and deletions (InDels). An old copy of the file is stored remotely at a data-centre, and is also available to the client. We consider the problem of throughput-and computationally-efficient communication from the client to the data-centre, to enable the data-centre to update its old copy to the newly edited file. Two models for the source files and edit patterns are studied: the random pre-edit sequence left-to-right random InDel (RPES-LtRRID) process, and the arbitrary pre-edit sequence arbitrary InDel (APES-AID) process. In both models, we consider the regime, in which the number of insertions and deletions is a small (but constant) fraction of the length of the original file. For both models, information-theoretic lower bounds on the best possible compression rates that enable file updates are derived (up to first order terms). Conversely, a simple compression algorithm using dynamic programming (DP) and entropy coding (EC), henceforth called DP-EC algorithm, achieves rates that are within constant additive gap (which diminishes as the alphabet size increases) to information-theoretic lower bounds for both models. For the RPES-LtRRID model, a dynamic-programming-run-length-compression (DP-RLC) algorithm is proposed, which achieves a compression rate matching the information-theoretic lower bound up to first order terms. Therefore, when the insertion and deletion probabilities are small (such that first order terms dominate), the achievable rate by DP-RLC is nearly optimal for the RPES-LtRRID model.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2017. Vol. 63, no 10, 6487-6513 p.
Keyword [en]
Synchronization, insertions, deletions
National Category
Computer Sciences Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
URN: urn:nbn:se:kth:diva-215355DOI: 10.1109/TIT.2017.2705100ISI: 000411021000026Scopus ID: 2-s2.0-85029944177OAI: oai:DiVA.org:kth-215355DiVA: diva2:1148117
Note

QC 20171010

Available from: 2017-10-10 Created: 2017-10-10 Last updated: 2018-01-13Bibliographically approved

Open Access in DiVA

No full text

Other links

Publisher's full textScopus

Search in DiVA

By author/editor
Wang, Qiwen
By organisation
Information Science and Engineering
In the same journal
IEEE Transactions on Information Theory
Computer SciencesElectrical Engineering, Electronic Engineering, Information Engineering

Search outside of DiVA

GoogleGoogle Scholar

doi
urn-nbn

Altmetric score

doi
urn-nbn
Total: 1 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf