Discovery of Novel Sequences in 1,000 Swedish Genomes
Karolinska Inst, Ctr Mol Med, Dept Mol Med & Surg, Stockholm, Sweden; Karolinska Inst, Sci Life Lab, Sci Pk, Solna, Sweden; Karolinska Univ Hosp, Dept Clin Genet, Stockholm, Sweden.
KTH, School of Engineering Sciences in Chemistry, Biotechnology and Health (CBH).
Uppsala Univ, Sci Life Lab, Dept Immunol Genet & Pathol, Uppsala, Sweden. ORCID iD: 0000-0001-6085-6749
Karolinska Inst, Ctr Mol Med, Dept Mol Med & Surg, Stockholm, Sweden; Karolinska Inst, Sci Life Lab, Sci Pk, Solna, Sweden; Karolinska Univ Hosp, Dept Clin Genet, Stockholm, Sweden. ORCID iD: 0000-0001-5831-385X
2020 (English). In: Molecular biology and evolution, ISSN 0737-4038, E-ISSN 1537-1719, Vol. 37, no 1, p. 18-30. Article in journal (Refereed). Published
Abstract [en]

Novel sequences (NSs), not present in the human reference genome, are abundant and remain largely unexplored. Here, we use de novo assembly to study NSs in 1,000 Swedish individuals first sequenced as part of the SweGen project, revealing a total of 46 Mb in 61,044 distinct contigs of sequence not present in GRCh38. The contigs were aligned to recently published catalogs of Icelandic and Pan-African NSs, as well as to the chimpanzee genome, revealing a great diversity of shared sequences. Analyzing the positioning of NSs across the chimpanzee genome, we find that 2,807 NSs align confidently within 143 chimpanzee orthologs of human genes. Aligning the whole-genome sequencing data to the chimpanzee genome, we discover ancestral NSs common throughout the Swedish population. The NSs were searched for repeats and repeat elements, revealing a majority of repetitive sequence (56%) and enrichment of simple repeats (28%) and satellites (15%). Lastly, we align the unmappable reads of a subset of the thousand genomes data to our collection of NSs, as well as to the previously published Pan-African NSs, revealing that both the Swedish and Pan-African NSs are widespread and that the Swedish NSs are largely a subset of the Pan-African NSs. Overall, these results highlight the importance of creating a more diverse reference genome and illustrate that significant amounts of the NSs may be of ancestral origin.

Place, publisher, year, edition, pages
Oxford University Press, 2020. Vol. 37, no 1, p. 18-30
Keywords [en]
population genomics, novel sequences, de novo assembly, ancestral deletion
National Category
Genetics
Identifiers
URN: urn:nbn:se:kth:diva-270910; DOI: 10.1093/molbev/msz176; ISI: 000515121200004; PubMedID: 31560401; Scopus ID: 2-s2.0-85077539319; OAI: oai:DiVA.org:kth-270910; DiVA, id: diva2:1416605
Note

QC 20200324

Available from: 2020-03-24. Created: 2020-03-24. Last updated: 2020-03-24. Bibliographically approved

By author/editor
Mårtensson, Gustaf; Ameur, Adam; Nilsson, Daniel