Droplet Barcode Sequencing for targeted linked-read haplotyping of single DNA molecules
KTH, School of Biotechnology (BIO), Gene Technology. KTH, Centres, Science for Life Laboratory, SciLifeLab.
KTH, School of Biotechnology (BIO), Gene Technology. KTH, Centres, Science for Life Laboratory, SciLifeLab. Karolinska Institute (KI), Sweden.
KTH, School of Biotechnology (BIO), Gene Technology. KTH, Centres, Science for Life Laboratory, SciLifeLab.
KTH, School of Biotechnology (BIO), Gene Technology. KTH, Centres, Science for Life Laboratory, SciLifeLab.
2017 (English) In: Nucleic Acids Research, ISSN 0305-1048, E-ISSN 1362-4962, Vol. 45, no 13, article id e125. Article in journal (Refereed) Published
Abstract [en]

Data produced with short-read sequencing technologies result in ambiguous haplotyping and a limited capacity to investigate the full repertoire of biologically relevant forms of genetic variation. Haplotype-resolved sequencing has recently gained traction as a way to reduce this ambiguity and to enable exploration of forms of genetic variation beyond nucleotide polymorphisms, such as compound heterozygosity and structural variations. Here we describe Droplet Barcode Sequencing, a novel approach for creating linked-read sequencing libraries by uniquely barcoding the information within single DNA molecules in emulsion droplets, without the aid of specialty reagents or microfluidic devices. Barcode generation and template amplification are performed simultaneously in a single enzymatic reaction, greatly simplifying the workflow and minimizing assay costs compared to alternative approaches. The method has been applied to phase multiple loci targeting all exons of the highly variable Human Leukocyte Antigen A (HLA-A) gene, with DNA from eight individuals present in the same assay. Barcode-based clustering of sequencing reads confirmed analysis of over 2000 independently assayed template molecules, with an average of 753 reads in support of called polymorphisms. Our results show unequivocal characterization of all alleles present, validated by correspondence against confirmed HLA database entries and haplotyping results from previous studies.

Place, publisher, year, edition, pages
Oxford University Press, 2017. Vol. 45, no 13, article id e125
National Category
Biochemistry and Molecular Biology
Identifiers
URN: urn:nbn:se:kth:diva-212628
DOI: 10.1093/nar/gkx436
ISI: 000406776400008
Scopus ID: 2-s2.0-85026371846
OAI: oai:DiVA.org:kth-212628
DiVA, id: diva2:1135724
Funder
Stiftelsen Olle Engkvist Byggmästare, 2015/347
Knut and Alice Wallenberg Foundation, 2011.0113
Science for Life Laboratory - a national resource center for high-throughput molecular bioscience
Note

QC 20170824

Available from: 2017-08-24 Created: 2017-08-24 Last updated: 2018-09-19 Bibliographically approved
In thesis
1. Phasing single DNA molecules with barcode linked sequencing
2018 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Elucidation of our genetic constituents has in the past decade predominantly taken the form of short-read DNA sequencing. Revolutionary technology developments have enabled vast amounts of biological information to be obtained, but from a medical standpoint it has yet to live up to the promise of associating individual genotypes with phenotypic states of widespread clinical relevance. The mechanisms by which complex phenotypes arise have been difficult to ascertain, and the value of short-read sequencing platforms has been limited in this regard. It has become evident that resolving the full spectrum of genetic heterogeneity requires accurate long-range information that allows individual haplotypes to be distinguished. Long-range haplotyping information can be obtained experimentally by long-read sequencing platforms or through linkage of short sequencing reads by means of a common barcode. This thesis explores these solutions, primarily through the development of novel technologies to phase short sequences of single molecules using DNA barcoding. A new method for high-throughput phasing of single DNA molecules, achieved by the production and utilization of uniquely barcoded beads in emulsion droplets, is described in Paper I. The results confirm that complex libraries of beads featuring mutually exclusive barcodes can be generated through clonal PCR amplification, and that these beads can be used to phase variations of the 16S rRNA gene, which reduces the ambiguity of classifying bacterial species for metagenomics. Paper II describes a second methodology (‘Droplet Barcode Sequencing’) which simplifies the concept of barcoding DNA fragments by omitting the need for beads and instead relying on clonal amplification of single barcoding oligonucleotides. This study also increases the amount of information that can be linked, which is showcased by phasing all exons of the HLA-A gene and successfully resolving all the alleles present in a sample pool of eight individuals.
Paper III expands on this work and explores the use of a single molecule sequencing platform to provide full-length sequencing coverage of six genes of the HLA family. The results show that while genes shorter than 10 kb can be resolved with a high degree of accuracy, compensating for a relatively high error rate by means of increased coverage can be challenging for larger genomic loci. Finally, Paper IV introduces the use of barcode-linked reads on an unprecedented scale, with a new assay that enables low-cost haplotyping of whole genomes without the need for predetermined capture sequences. This technology is utilized to generate a haplotype-resolved human genome, call large-scale structural variants and perform reference-free assembly of bacterial and human genomes. At a cost of only 19 USD per sample, this technology makes the benefits of long-range haplotyping available to the vast majority of laboratories which currently rely solely on short-read sequencing platforms.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2018. p. 45
Series
TRITA-CBH-FOU ; 2018:41
Keywords
Single molecule sequencing, DNA barcoding, whole genome haplotyping, linked-read sequencing, phasing, de novo genome assembly.
National Category
Medical Biotechnology (with a focus on Cell Biology (including Stem Cell Biology), Molecular Biology, Microbiology, Biochemistry or Biopharmacy)
Research subject
Biotechnology
Identifiers
urn:nbn:se:kth:diva-235187 (URN)
978-91-7729-939-4 (ISBN)
Public defence
2018-10-19, Air & Fire Auditorium, Tomtebodavägen 23, Solna, 10:00 (English)
Opponent
Supervisors
Note

QC 20180919

Available from: 2018-09-19 Created: 2018-09-19 Last updated: 2018-09-19 Bibliographically approved

By author/editor
Redin, David; Borgström, Erik; He, Mengxiao; Aghelpasand, Hooman; Käller, Max; Ahmadian, Afshin