The use of grid computing to drive data-intensive genetic research
KTH, School of Biotechnology (BIO), Gene Technology.
2007 (English). In: European Journal of Human Genetics, ISSN 1018-4813, E-ISSN 1476-5438, Vol. 15, no. 6, p. 694-702. Article in journal (Refereed). Published
Abstract [en]

In genetics, with increasing data sizes and more advanced algorithms for mining complex data, a point is reached where increased computational capacity or alternative solutions become unavoidable. Most contemporary methods for linkage analysis are based on the Lander-Green hidden Markov model (HMM), which scales exponentially with the number of pedigree members. In whole-genome linkage analysis, genotype simulations become prohibitively time-consuming to perform on single computers. We have developed 'Grid-Allegro', a Grid-aware implementation of the Allegro software, by which several thousand genotype simulations can be performed in parallel in a short time. With temporary installations of the Allegro executable and datasets on remote nodes at submission, the need for predefined Grid run-time environments is circumvented. We evaluated the performance, efficiency and scalability of this implementation in a genome scan on Swedish multiplex Alzheimer's disease families. We demonstrate that 'Grid-Allegro' allows for the full exploitation of the features available in Allegro for genome-wide linkage. The implementation of existing bioinformatics applications on Grids (Distributed Computing) represents a cost-effective alternative for addressing highly resource-demanding and data-intensive bioinformatics tasks, compared to acquiring and setting up clusters of computational hardware in house (Parallel Computing), a resource not available to most geneticists today.

Place, publisher, year, edition, pages
2007. Vol. 15, no. 6, p. 694-702
Keywords [en]
grid, bioinformatics, genome-wide, linkage analysis, genotype simulation
National subject category
Bioinformatics and Systems Biology
Identifiers
URN: urn:nbn:se:kth:diva-7797
DOI: 10.1038/sj.ejhg.5201815
ISI: 000246792100012
Scopus ID: 2-s2.0-34249727262
OAI: oai:DiVA.org:kth-7797
DiVA, id: diva2:12926
Note
QC 20101004
Available from: 2007-12-10 Created: 2007-12-10 Last updated: 2017-12-14 Bibliographically approved
Part of thesis
1. Grid and High-Performance Computing for Applied Bioinformatics
2007 (English). Doctoral thesis, comprehensive summary (Other scientific)
Abstract [en]

The beginning of the twenty-first century has been characterized by an explosion of biological information. The avalanche of data grows daily and arises as a consequence of advances in the fields of molecular biology, genomics and proteomics. The challenge for today's biologists lies in decoding these huge and complex data, in order to achieve a better understanding of how our genes shape who we are, how our genome evolved, and how we function.

Without annotation and data mining, the information provided by, for example, high-throughput genomic sequencing projects is of limited use. Bioinformatics is the application of computer science and technology to the management and analysis of biological data, in an effort to address biological questions. The work presented in this thesis has focused on the use of Grid and High Performance Computing for solving computationally expensive bioinformatics tasks, where, due to the very large amount of available data and the complexity of the tasks, new solutions are required for efficient data analysis and interpretation.

Three major research topics are addressed. First, the use of grids for distributing the execution of sequence-based proteomic analysis, and its application in optimal epitope selection and in a proteome-wide effort to map the linear epitopes in the human proteome. Second, the application of grid technology in genetic association studies, which enabled the analysis of thousands of simulated genotypes. Finally, the development and application of an economics-based model for grid-job scheduling and resource administration.

The application of the grid-based technology developed in the present investigation resulted in the successful tagging and linking of chromosome regions in Alzheimer's disease, the proteome-wide mapping of linear epitopes, and the development of market-based resource allocation in Grids for scientific applications.

Place, publisher, year, edition, pages
Stockholm: KTH, 2007
Series
Trita-BIO-Report, ISSN 1654-2312 ; 2007:9
Keywords
Grid computing, bioinformatics, genomics, proteomics
National subject category
Bioinformatics (Computational Biology)
Identifiers
URN: urn:nbn:se:kth:diva-4573
ISBN: 978-91-7178-782-8
Public defence
2007-12-21, FD5, AlbaNova, Roslagstullsbacken 21, Stockholm, 10:00
Opponent
Supervisors
Note
QC 20100622
Available from: 2007-12-10 Created: 2007-12-10 Last updated: 2018-01-13 Bibliographically approved

Open Access in DiVA

No full text in DiVA

Search further in DiVA

By author/editor
Andrade, Jorge; Andersen, Malin; Odeberg, Jacob