Inferring species membership using DNA sequences with back-propagation neural networks
2008 (English) In: Systematic Biology, ISSN 1063-5157, E-ISSN 1076-836X, Vol. 57, no 2, p. 202-215. Article in journal (Refereed). Published
Abstract [en]
DNA barcoding as a method for species identification is rapidly increasing in popularity. However, there are still relatively few rigorous methodological tests of DNA barcoding. Current distance-based methods are frequently criticized for treating the nearest neighbor as the closest relative via a raw similarity score, lacking an objective set of criteria to delineate taxa, or for being incongruent with classical character-based taxonomy. Here, we propose an artificial intelligence-based approach, inferring species membership via DNA barcoding with back-propagation neural networks (named BP-based species identification), as a new advance to the spectrum of available methods. We demonstrate the value of this approach with simulated data sets representing different levels of sequence variation under coalescent simulations with various evolutionary models, as well as with two empirical data sets of COI sequences from East Asian ground beetles (Carabidae) and Costa Rican skipper butterflies. With a 630- to 690-bp fragment of the COI gene, we identified 97.50% of 80 unknown sequences of ground beetles, and 95.63%, 96.10%, and 100% of 275, 205, and 9 unknown sequences of the neotropical skipper butterfly, to their correct species, respectively. Our simulation studies indicate that the success rates of species identification depend on the divergence of sequences, the length of sequences, and the number of reference sequences. Particularly in cases involving incomplete lineage sorting, this new BP-based method appears to be superior to commonly used methods for DNA-based species identification.
Place, publisher, year, edition, pages
2008. Vol. 57, no 2, p. 202-215
Keywords [en]
back-propagation, DNA barcoding, incomplete lineage sorting, neural networks, species identification
National Category
Cell Biology
Identifiers
URN: urn:nbn:se:kth:diva-101870
DOI: 10.1080/10635150802032982
ISI: 000255690200002
OAI: oai:DiVA.org:kth-101870
DiVA, id: diva2:553406
Note
QC 20120919
2012-09-19 2012-09-05 2017-12-07 Bibliographically approved