Efficient whole genome haplotyping and single molecule phasing with barcode-linked reads
KTH, School of Engineering Sciences in Chemistry, Biotechnology and Health (CBH), Gene Technology. (Experimental Genomics)
(English) Manuscript (preprint) (Other academic)
Abstract [en]

The future of human genomics is one that seeks to resolve the entirety of genetic variation through sequencing. The prospect of utilizing genomics for medical purposes requires cost-efficient and accurate base calling, long-range haplotyping capability, and reliable calling of structural variants. Short-read sequencing has led the development towards such a future but has struggled to meet the latter two of these needs. To address this limitation, we developed a technology that preserves the molecular origin of short sequencing reads, with an insignificant increase in sequencing costs. We demonstrate a library preparation method that enables whole genome haplotyping, long-range phasing of single DNA molecules, and de novo genome assembly through barcode-linked reads (BLR). Millions of random barcodes are used to reconstruct megabase-scale phase blocks and call structural variants. We also highlight the versatility of our technology by generating libraries from different organisms using picograms to nanograms of input material.

Keywords [en]
Whole genome haplotyping, single molecule phasing, de novo assembly, barcode-linked reads, DNA phasing, BLR, droplet barcoding, whole genome sequencing, linked-read sequencing.
National Category
Medical Biotechnology (with a focus on Cell Biology (including Stem Cell Biology), Molecular Biology, Microbiology, Biochemistry or Biopharmacy)
Research subject
Biotechnology
Identifiers
URN: urn:nbn:se:kth:diva-235227
OAI: oai:DiVA.org:kth-235227
DiVA, id: diva2:1249220
Funder
Stiftelsen Olle Engkvist Byggmästare, 2015/347
Note

QC 20180919

Available from: 2018-09-18 Created: 2018-09-18 Last updated: 2018-09-19 Bibliographically approved
In thesis
1. Phasing single DNA molecules with barcode linked sequencing
2018 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Elucidation of our genetic constituents has in the past decade predominantly taken the form of short-read DNA sequencing. Revolutionary technology developments have enabled vast amounts of biological information to be obtained, but from a medical standpoint it has yet to live up to the promise of associating individual genotypes with phenotypic states of widespread clinical relevance. The mechanisms by which complex phenotypes arise have been difficult to ascertain, and the value of short-read sequencing platforms has been limited in this regard. It has become evident that resolving the full spectrum of genetic heterogeneity requires accurate long-range information that allows individual haplotypes to be distinguished. Long-range haplotyping information can be obtained experimentally by long-read sequencing platforms or through linkage of short sequencing reads by means of a common barcode. This thesis explores these solutions, primarily through the development of novel technologies to phase short sequences of single molecules using DNA barcoding. A new method for high-throughput phasing of single DNA molecules, achieved by the production and utilization of uniquely barcoded beads in emulsion droplets, is described in Paper I. The results confirm that complex libraries of beads featuring mutually exclusive barcodes can be generated through clonal PCR amplification, and that these beads can be used to phase variations of the 16S rRNA gene, which reduces the ambiguity of classifying bacterial species for metagenomics. Paper II describes a second methodology (‘Droplet Barcode Sequencing’) which simplifies the concept of barcoding DNA fragments by omitting the need for beads and instead relying on clonal amplification of single barcoding oligonucleotides. This study also increases the amount of information that can be linked, which is showcased by phasing all exons of the HLA-A gene and successfully resolving all the alleles present in a sample pool of eight individuals.
Paper III expands on this work and explores the use of a single molecule sequencing platform to provide full-length sequencing coverage of six genes of the HLA family. The results show that while genes shorter than 10 kb can be resolved with a high degree of accuracy, compensating for a relatively high error rate by means of increased coverage can be challenging for larger genomic loci. Finally, Paper IV introduces the use of barcode-linked reads on an unprecedented scale, with a new assay that enables low-cost haplotyping of whole genomes without the need for predetermined capture sequences. This technology is utilized to generate a haplotype-resolved human genome, call large-scale structural variants and perform reference-free assembly of bacterial and human genomes. At a cost of only 19 USD per sample, this technology makes the benefits of long-range haplotyping available to the vast majority of laboratories which currently rely solely on short-read sequencing platforms.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2018. p. 45
Series
TRITA-CBH-FOU ; 2018:41
Keywords
Single molecule sequencing, DNA barcoding, whole genome haplotyping, linked-read sequencing, phasing, de novo genome assembly.
National Category
Medical Biotechnology (with a focus on Cell Biology (including Stem Cell Biology), Molecular Biology, Microbiology, Biochemistry or Biopharmacy)
Research subject
Biotechnology
Identifiers
urn:nbn:se:kth:diva-235187 (URN)
978-91-7729-939-4 (ISBN)
Public defence
2018-10-19, Air & Fire Auditorium, Tomtebodavägen 23, Solna, 10:00 (English)
Note

QC 20180919

Available from: 2018-09-19 Created: 2018-09-19 Last updated: 2018-09-19 Bibliographically approved

Open Access in DiVA

BLR.incl.SI (9651 kB), 43 downloads
File information
File name FULLTEXT01.pdf
File size 9651 kB
Checksum SHA-512
f07bdf37c9cdee2a4c0ed3f60594b641051baee09cee2ed4b25d107c50ff6713b285d75686758c107b4a05be537ab09c98310b340fd6ca57dbcc0c6953724707
Type fulltext
Mimetype application/pdf

Search in DiVA

By author/editor
Redin, David; Frick, Tobias; Aghelpasand, Hooman; Theland, Jennifer; Käller, Max; Borgström, Erik; Ahmadian, Afshin