SNP discovery using advanced algorithms and neural networks
KTH, School of Biotechnology (BIO).
KTH, School of Biotechnology (BIO).
KTH, School of Biotechnology (BIO). ORCID iD: 0000-0003-3281-8088
2005 (English). In: Bioinformatics, ISSN 1367-4803, E-ISSN 1460-2059, Vol. 21, no. 10, p. 2528-2530. Article in journal (Refereed), Published
Abstract [en]

Forage is an application that uses two neural networks to detect single nucleotide polymorphisms (SNPs). Potential SNP candidates are identified in multiple alignments. Each candidate is then represented by a vector of features, which the networks classify as SNP or monomorphic. A validated dataset was constructed from experimentally verified SNP data and used for network training and method evaluation.
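
The candidate step described in the abstract can be sketched as follows. This is a minimal illustration only, not Forage's implementation: the function name `candidate_columns`, the `min_minor_count` threshold, and the chosen features (depth, major and minor allele, minor allele frequency) are assumptions, since the abstract does not list the features used. The neural network classification itself is not shown.

```python
from collections import Counter

def candidate_columns(alignment, min_minor_count=2):
    """Yield (position, features) for alignment columns with two or more alleles.

    `alignment` is a list of equal-length strings; gaps ('-') and any
    non-ACGT symbols are ignored. The features are illustrative assumptions.
    """
    for pos in range(len(alignment[0])):
        column = [row[pos] for row in alignment if row[pos] in "ACGT"]
        counts = Counter(column).most_common()
        if len(counts) < 2:
            continue  # monomorphic column, not a candidate
        (major, major_n), (minor, minor_n) = counts[0], counts[1]
        if minor_n < min_minor_count:
            continue  # minor allele seen too rarely; likely a sequencing error
        yield pos, {
            "depth": len(column),
            "major": major,
            "minor": minor,
            "minor_freq": minor_n / len(column),
        }

# Toy multiple alignment (hypothetical data).
alignment = [
    "ACGTACGT",
    "ACGTACGT",
    "ACATACGT",
    "ACATAC-T",
    "ACGTACGA",
]
candidates = list(candidate_columns(alignment))
print(candidates)
```

In a pipeline of the kind the abstract describes, each feature dictionary would be turned into a numeric vector and passed to a trained classifier.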

Place, publisher, year, edition, pages
2005. Vol. 21, no 10, 2528-2530 p.
Keyword [en]
access to information, algorithm, article, artificial neural network, forage, gene frequency, information processing, priority journal, single nucleotide polymorphism, statistical analysis, validation process
National Category
Industrial Biotechnology
URN: urn:nbn:se:kth:diva-5172
DOI: 10.1093/bioinformatics/bti354
ISI: 000229285600053
PubMedID: 15746291
ScopusID: 2-s2.0-19544386177
OAI: diva2:7978
QC 20100929. Updated from Manuscript to Article (20100929). Previous title: "SNP discovery usin advanced algorithms and nuural networks".
Available from: 2004-10-08. Created: 2004-10-08. Last updated: 2010-09-29. Bibliographically approved
In thesis
1. Computational approaches for in-depth analysis of cDNA sequence tags
2004 (English). Doctoral thesis, comprehensive summary (Other scientific)
Abstract [en]

Major recent improvements in biotechnology have led to an accelerated production of DNA sequences. The completion of the human genome sequence, along with the genomes of more than two hundred other species, has marked the arrival of the genome era. The ultimate goal is to understand the structure and function of genomes and their genes. This thesis has focused on the computational analysis of complementary DNA (cDNA) sequences. These are copies of mRNA transcripts that correspond to the coding regions of genomes.

Studying the expression patterns of genes is essential for understanding gene function. Many gene expression profiling techniques generate short sequence tags that derive from transcripts. A pilot study was performed to assess the feasibility of using the pyrosequencing platform for gene expression analysis. In most cases (≈ 85%), the sequences generated by pyrosequencing were long enough (> 18 nucleotides) to uniquely identify the corresponding transcripts through database searches. Aspects of transcript identification by short sequence tags were further investigated in a number of public databases, revealing that a tag length of 16-17 nucleotides was sufficient for unique identification.
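
The tag-length question above can be illustrated with a small sketch that measures what fraction of transcripts is uniquely identified by a tag of a given length. It is a simplification under stated assumptions: the function name `unique_fraction` is hypothetical, the tag is taken from the start of each sequence (real tag-based methods typically anchor tags at restriction sites), and the sequences are toy data, not taken from the thesis.

```python
from collections import Counter

def unique_fraction(transcripts, tag_length):
    """Fraction of transcripts whose leading tag of `tag_length` bases is unique.

    Assumption: the tag is the first `tag_length` bases of each transcript.
    Transcripts shorter than the tag length are skipped.
    """
    tags = [seq[:tag_length] for seq in transcripts if len(seq) >= tag_length]
    if not tags:
        return 0.0
    counts = Counter(tags)
    unique = sum(1 for tag in tags if counts[tag] == 1)
    return unique / len(tags)

# Toy transcript set (hypothetical data).
transcripts = ["ACGTACGTAA", "ACGTACGTCC", "ACGTTTTTGG", "GGGTACGTAA"]
for k in (4, 5, 9):
    print(k, unique_fraction(transcripts, k))
```

Running the same calculation over a full transcript database for increasing tag lengths shows where the unique fraction levels off, which is the kind of threshold the abstract reports.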

Longer transcript representations are obtained from expressed sequence tag (EST) sequencing. Method development for the analysis and maintenance of large EST data sets has been performed on data from poplar, which is a tree of commercial interest to the forest biotechnology industry. In 2003 a large EST sequencing project reached > 100 000 reads, providing a unique resource for tree biology research. ESTs have been grouped into clusters and singletons that represent potential genes. Preliminary analyses have estimated gene content in Populus to be very similar to that of the model organism Arabidopsis thaliana.

EST data collections provide a rich source for mining polymorphisms. A software application was developed and applied to EST data from two Populus species, and candidate single nucleotide polymorphisms (SNPs) were recorded. A study of genetic variation between the species revealed a striking similarity, with orthologous pairs being > 98% identical at the protein level.
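
As a point of reference for the identity figure above, percent identity between two aligned protein sequences can be computed as in the sketch below. The definition used here (matches divided by aligned, ungapped positions) is one common convention and an assumption; the abstract does not state which definition was applied. The function name and the sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over columns where neither sequence has a gap ('-').

    Assumes the two sequences are already aligned and of equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)

# Toy aligned protein fragments (hypothetical data): one substitution in eight residues.
print(percent_identity("MKTAYIAK", "MKTAYLAK"))
```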

Keywords: cDNA, EST, gene expression, SNP, SAGE, polymorphism, assembly, clustering, DNA sequencing, pyrosequencing, mRNA transcript, orthology, tree biotechnology, restriction enzyme

Place, publisher, year, edition, pages
Bioteknologi, 2004
Biotechnology, cDNA, EST, gene expression, SNP, SAGE, polymorphism, Bioteknik
National Category
Industrial Biotechnology
urn:nbn:se:kth:diva-23 (URN)
91-7283-837-X (ISBN)
Public defence
2004-10-08, E1, KTH, Lindstedtsvägen 3, Stockholm, 13:00
Available from: 2004-10-08. Created: 2004-10-08. Last updated: 2012-03-21. Bibliographically approved

Open Access in DiVA

No full text

By author/editor
Unneberg, Per; Strömberg, Michael; Sterky, Fredrik