Phasing single DNA molecules with barcode linked sequencing
KTH, School of Engineering Sciences in Chemistry, Biotechnology and Health (CBH), Gene Technology. (Experimental Genomics)
2018 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Elucidation of our genetic constituents has in the past decade predominantly taken the form of short-read DNA sequencing. Revolutionary technology developments have enabled vast amounts of biological information to be obtained, but from a medical standpoint it has yet to live up to the promise of associating individual genotypes with phenotypic states of widespread clinical relevance. The mechanisms by which complex phenotypes arise have been difficult to ascertain, and the value of short-read sequencing platforms has been limited in this regard. It has become evident that resolving the full spectrum of genetic heterogeneity requires accurate long-range information that distinguishes individual haplotypes. Long-range haplotyping information can be obtained experimentally by long-read sequencing platforms or through linkage of short sequencing reads by means of a common barcode. This thesis explores these solutions, primarily through the development of novel technologies to phase short sequences of single molecules using DNA barcoding. A new method for high-throughput phasing of single DNA molecules, achieved by the production and utilization of uniquely barcoded beads in emulsion droplets, is described in Paper I. The results confirm that complex libraries of beads featuring mutually exclusive barcodes can be generated through clonal PCR amplification, and that these beads can be used to phase variations of the 16S rRNA gene, which reduces the ambiguity of classifying bacterial species for metagenomics. Paper II describes a second methodology (‘Droplet Barcode Sequencing’) which simplifies the concept of barcoding DNA fragments by omitting the need for beads and instead relying on clonal amplification of single barcoding oligonucleotides. This study also increases the amount of information that can be linked, which is showcased by phasing all exons of the HLA-A gene and successfully resolving all the alleles present in a sample pool of eight individuals.
Paper III expands on this work and explores the use of a single molecule sequencing platform to provide full-length sequencing coverage of six genes of the HLA family. The results show that while genes shorter than 10 kb can be resolved with a high degree of accuracy, compensating for a relatively high error rate by means of increased coverage can be challenging for larger genomic loci. Finally, Paper IV introduces the use of barcode-linked reads on an unprecedented scale, with a new assay that enables low-cost haplotyping of whole genomes without the need for predetermined capture sequences. This technology is utilized to generate a haplotype-resolved human genome, call large-scale structural variants and perform reference-free assembly of bacterial and human genomes. At a cost of only USD 19 per sample, this technology makes the benefits of long-range haplotyping available to the vast majority of laboratories that currently rely solely on short-read sequencing platforms.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2018, p. 45
Series
TRITA-CBH-FOU ; 2018:41
Keywords [en]
Single molecule sequencing, DNA barcoding, whole genome haplotyping, linked-read sequencing, phasing, de novo genome assembly.
National Category
Medical Biotechnology (with a focus on Cell Biology (including Stem Cell Biology), Molecular Biology, Microbiology, Biochemistry or Biopharmacy)
Research subject
Biotechnology
Identifiers
URN: urn:nbn:se:kth:diva-235187
ISBN: 978-91-7729-939-4 (print)
OAI: oai:DiVA.org:kth-235187
DiVA, id: diva2:1249410
Public defence
2018-10-19, Air & Fire Auditorium, Tomtebodavägen 23, Solna, 10:00 (English)
Opponent
Supervisors
Note

QC 20180919

Available from: 2018-09-19 Created: 2018-09-19 Last updated: 2018-09-19 Bibliographically approved
List of papers
1. Phasing of single DNA molecules by massively parallel barcoding
2015 (English). In: Nature Communications, ISSN 2041-1723, E-ISSN 2041-1723, Vol. 6, article id 7173. Article in journal (Refereed) Published
Abstract [en]

High-throughput sequencing platforms mainly produce short-read data, resulting in a loss of phasing information for many of the genetic variants analysed. For certain applications, it is vital to know which variant alleles are connected to each individual DNA molecule. Here we demonstrate a method for massively parallel barcoding and phasing of single DNA molecules. First, a primer library with millions of uniquely barcoded beads is generated. When compartmentalized with single DNA molecules, the beads can be used to amplify and tag any target sequences of interest, enabling coupling of the biological information from multiple loci. We apply the assay to bacterial 16S sequencing and up to 94% of the hypothesized phasing events are shown to originate from single molecules. The method enables use of widely available short-read-sequencing platforms to study long single molecules within a complex sample, without losing phase information.

National Category
Biological Sciences
Identifiers
urn:nbn:se:kth:diva-171312 (URN)
10.1038/ncomms8173 (DOI)
000357166400001 ()
26055759 (PubMedID)
2-s2.0-84931275307 (Scopus ID)
Funder
Science for Life Laboratory - a national resource center for high-throughput molecular bioscience
Note

QC 20150727

Available from: 2015-07-27 Created: 2015-07-27 Last updated: 2018-10-02 Bibliographically approved
2. Droplet Barcode Sequencing for targeted linked-read haplotyping of single DNA molecules
2017 (English). In: Nucleic Acids Research, ISSN 0305-1048, E-ISSN 1362-4962, Vol. 45, no 13, article id e125. Article in journal (Refereed) Published
Abstract [en]

Data produced with short-read sequencing technologies result in ambiguous haplotyping and a limited capacity to investigate the full repertoire of biologically relevant forms of genetic variation. The notion of haplotype-resolved sequencing data has recently gained traction as a way to reduce this unwanted ambiguity and to enable exploration of forms of genetic variation beyond nucleotide polymorphisms, such as compound heterozygosity and structural variations. Here we describe Droplet Barcode Sequencing, a novel approach for creating linked-read sequencing libraries by uniquely barcoding the information within single DNA molecules in emulsion droplets, without the aid of specialty reagents or microfluidic devices. Barcode generation and template amplification are performed simultaneously in a single enzymatic reaction, greatly simplifying the workflow and minimizing assay costs compared to alternative approaches. The method has been applied to phase multiple loci targeting all exons of the highly variable Human Leukocyte Antigen A (HLA-A) gene, with DNA from eight individuals present in the same assay. Barcode-based clustering of sequencing reads confirmed analysis of over 2000 independently assayed template molecules, with an average of 753 reads in support of called polymorphisms. Our results show unequivocal characterization of all alleles present, validated by correspondence against confirmed HLA database entries and haplotyping results from previous studies.

Place, publisher, year, edition, pages
Oxford University Press, 2017
National Category
Biochemistry and Molecular Biology
Identifiers
urn:nbn:se:kth:diva-212628 (URN)
10.1093/nar/gkx436 (DOI)
000406776400008 ()
2-s2.0-85026371846 (Scopus ID)
Funder
Stiftelsen Olle Engkvist Byggmästare, 2015/347
Knut and Alice Wallenberg Foundation, 2011.0113
Science for Life Laboratory - a national resource center for high-throughput molecular bioscience
Note

QC 20170824

Available from: 2017-08-24 Created: 2017-08-24 Last updated: 2018-09-19 Bibliographically approved
3. Comprehensive haplotyping of the HLA gene family using nanopore sequencing
(English). Manuscript (preprint) (Other academic)
Abstract [en]

The HLA gene family comprises the most polymorphic loci in the human genome; it encodes the major histocompatibility complexes (MHC), which mediate the immune response through cellular interactions with antigens. Compatibility between HLA alleles is thus of great medical interest for recipients of allogeneic transplantations. Traditional serological techniques to evaluate compatibility are now being replaced by more accurate DNA sequencing-based methods. However, short-read sequencing data typically result in collapsed sequences representing a mixture of variants from native haplotypes. In addition, most previous studies have been limited to a few highly polymorphic exons of various HLA genes. Here we present haplotype-resolved full-length sequencing of the six most clinically relevant MHC Class I and Class II genes, to characterize the haplotypes of eight reference individuals, using a single MinION flow cell. The results show that full-length sequencing of single molecules enables haplotypes to be resolved to the highest degree of accuracy (four-field resolution). In this study, a majority of the alleles were classified with four-field resolution and could be verified through previously published genotyping studies. These results support the notion that nanopore sequencing could be a viable solution for highly accurate clinical evaluation of histocompatibility.

National Category
Biomedical Laboratory Science/Technology
Research subject
Biotechnology
Identifiers
urn:nbn:se:kth:diva-235225 (URN)
Funder
Stiftelsen Olle Engkvist Byggmästare, 2015/347
Stockholm County Council, LS2016-0764
Note

QC 20180919

Available from: 2018-09-18 Created: 2018-09-18 Last updated: 2018-09-19 Bibliographically approved
4. Efficient whole genome haplotyping and single molecule phasing with barcode-linked reads
(English). Manuscript (preprint) (Other academic)
Abstract [en]

The future of human genomics is one that seeks to resolve the entirety of genetic variation through sequencing. The prospect of utilizing genomics for medical purposes requires cost-efficient and accurate base calling, long-range haplotyping capability, and reliable calling of structural variants. Short-read sequencing has led the development towards such a future but has struggled to meet the latter two of these needs. To address this limitation, we developed a technology that preserves the molecular origin of short sequencing reads, with an insignificant increase in sequencing costs. We demonstrate a library preparation method which enables whole genome haplotyping, long-range phasing of single DNA molecules, and de novo genome assembly through barcode-linked reads (BLR). Millions of random barcodes are used to reconstruct megabase-scale phase blocks and call structural variants. We also highlight the versatility of our technology by generating libraries from different organisms using picograms to nanograms of input material.

Keywords
Whole genome haplotyping, single molecule phasing, de novo assembly, barcode-linked reads, DNA phasing, BLR, droplet barcoding, whole genome sequencing, linked-read sequencing.
National Category
Medical Biotechnology (with a focus on Cell Biology (including Stem Cell Biology), Molecular Biology, Microbiology, Biochemistry or Biopharmacy)
Research subject
Biotechnology
Identifiers
urn:nbn:se:kth:diva-235227 (URN)
Funder
Stiftelsen Olle Engkvist Byggmästare, 2015/347
Note

QC 20180919

Available from: 2018-09-18 Created: 2018-09-18 Last updated: 2018-09-19 Bibliographically approved

Open Access in DiVA

Redin_Thesis (1152 kB), 70 downloads
File information
File name: FULLTEXT01.pdf
File size: 1152 kB
Checksum (SHA-512):
47f684f229bf631f1d6ee9b977ec3ebb34d4df65e629c70c97a78c2b1c9bec4490fcbe44b8f3813ec70595e4b5d0c71aabc0a166a0d5460fc1b6df8343e353bc
Type: fulltext
Mimetype: application/pdf

Search in DiVA

By author/editor
Redin, David
By organisation
Gene Technology
Medical Biotechnology (with a focus on Cell Biology (including Stem Cell Biology), Molecular Biology, Microbiology, Biochemistry or Biopharmacy)

Total: 70 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 1738 hits