HAPP: High-accuracy pipeline for processing deep metabarcoding data
Stockholm Univ, Dept Biochem & Biophys, Sci Life Lab, Natl Bioinformat Infrastruct Sweden, Solna, Sweden.
Swedish Museum Nat Hist, Dept Bioinformat & Genet, Stockholm, Sweden.
Swedish Museum Nat Hist, Dept Bioinformat & Genet, Stockholm, Sweden.
Lund Univ, Dept Lab Med, Sci Life Lab, Natl Bioinformat Infrastruct Sweden, Lund, Sweden.
2025 (English). In: PloS Computational Biology, ISSN 1553-734X, E-ISSN 1553-7358, Vol. 21, no 11, article id e1013558. Article in journal (Refereed), Published
Abstract [en]

Deep metabarcoding offers an efficient and reproducible approach to biodiversity monitoring, but noisy data and incomplete reference databases challenge accurate diversity estimation and taxonomic annotation. Here, we introduce a novel algorithm, NEEAT, for removing spurious operational taxonomic units (OTUs) originating from nuclear-embedded mitochondrial DNA sequences (NUMTs) or sequencing errors. It integrates 'echo' signals across samples with the identification of unusual evolutionary patterns among similar DNA sequences. We also extensively benchmark current tools for chimera removal, taxonomic annotation and OTU clustering of deep metabarcoding data. The best performing tools/parameter settings are integrated into HAPP, a high-accuracy pipeline for processing deep metabarcoding data. Tests using CO1 data from BOLD and large-scale metabarcoding data on insects demonstrate that HAPP significantly outperforms existing methods, while enabling efficient analysis of extensive datasets by parallelizing computations across taxonomic groups.

Place, publisher, year, edition, pages
Public Library of Science (PLoS), 2025. Vol. 21, no 11, article id e1013558
National Category
Bioinformatics and Computational Biology
Identifiers
URN: urn:nbn:se:kth:diva-375535
DOI: 10.1371/journal.pcbi.1013558
ISI: 001609505600001
PubMedID: 41202092
Scopus ID: 2-s2.0-105022268948
OAI: oai:DiVA.org:kth-375535
DiVA, id: diva2:2031277
Note

QC 20260122

Available from: 2026-01-22. Created: 2026-01-22. Last updated: 2026-01-22. Bibliographically approved

Authority records

Andersson, Anders F.
