VMCMC: a graphical and statistical analysis tool for Markov chain Monte Carlo traces
KTH, School of Computer Science and Communication (CSC), Computational Science and Technology (CST). (Lars Arvestad) ORCID iD: 0000-0003-0539-3491
KTH, School of Information and Communication Technology (ICT).
KTH, School of Information and Communication Technology (ICT).
KTH, School of Computer Science and Communication (CSC), Computational Science and Technology (CST). ORCID iD: 0000-0002-6664-1607
(English) Manuscript (preprint) (Other academic)
Abstract [en]

Motivation: MCMC-based methods are important for Bayesian inference of phylogeny and related parameters. Although computationally expensive, MCMC yields estimates of posterior distributions that are useful for estimating parameter values and are easy to use in subsequent analysis. There are, however, sometimes practical difficulties with MCMC, relating to convergence assessment and determining burn-in, especially in large-scale analyses. Currently, several software packages are required to perform, e.g., convergence assessment, mixing analysis and interactive exploration of both continuous and tree parameters.

Results: We have written a software tool called VMCMC to simplify post-processing of MCMC traces with, for example, automatic burn-in estimation. VMCMC can be used both as a GUI-based application, supporting interactive exploration, and as a command-line tool suitable for automated pipelines.
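
To make the idea of automatic burn-in estimation concrete, the following is a minimal sketch in Python. It is not VMCMC's estimator, which this record does not describe; it is a simplified Geweke-style check that compares the mean of an early window of the trace with the mean of a late window and discards samples until the two agree. The function names, the window sizes, the candidate percentages and the threshold of 2.0 are all assumptions chosen for illustration.

```python
import random
import statistics


def geweke_z(chain, first=0.1, last=0.5):
    """Z-score comparing the mean of an early window with a late window.

    Simplification (assumption): the standard errors below treat samples
    as independent. The original Geweke diagnostic uses spectral density
    estimates to account for autocorrelation in the chain.
    """
    n = len(chain)
    a = chain[: int(first * n)]
    b = chain[n - int(last * n):]
    se2 = statistics.variance(a) / len(a) + statistics.variance(b) / len(b)
    return (statistics.mean(a) - statistics.mean(b)) / se2 ** 0.5


def estimate_burn_in(chain, percents=range(0, 51, 5), threshold=2.0):
    """Return the smallest candidate burn-in (number of discarded samples)
    for which the remaining chain passes the z-score check, or None if no
    candidate passes. Assumes the chain has at least a few hundred samples
    so that every window holds two or more values."""
    n = len(chain)
    for pct in percents:
        start = n * pct // 100
        if abs(geweke_z(chain[start:])) < threshold:
            return start
    return None


# Synthetic trace for illustration: a 200-sample transient that decays
# from 10 to 0, followed by 1800 samples around 0.
random.seed(1)
trace = [10.0 * (1 - i / 200) + random.gauss(0, 1) for i in range(200)]
trace += [random.gauss(0, 1) for _ in range(1800)]
burn_in = estimate_burn_in(trace)
print("estimated burn-in:", burn_in)
```

A check of this kind looks at one real-valued parameter at a time. Tree-valued parameters and joint estimates across several parameters need other techniques that this sketch does not cover.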

Availability: VMCMC is available for Java SE 6+ under the New BSD License. Executable jar files, tutorial manual and source code can be downloaded from https://bitbucket.org/rhali/visualmcmc/.

Keywords [en]
Markov chain Monte Carlo
HSV category
Research programme
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-180543
OAI: oai:DiVA.org:kth-180543
DiVA, id: diva2:895237
Note

QS 2016

Available from: 2016-01-18 Created: 2016-01-18 Last updated: 2016-02-01 Bibliographically approved
Part of thesis
1. From genomes to post-processing of Bayesian inference of phylogeny
2016 (English) Doctoral thesis, comprising papers (Other academic)
Abstract [en]

Life is extremely complex and amazingly diverse; it has taken billions of years of evolution to attain the level of complexity we observe in nature now, and it ranges from single-celled prokaryotes to multi-cellular human beings. With the availability of molecular sequence data, algorithms inferring homology and gene families have emerged, and similarity in gene content between two genes has been the major signal utilized for homology inference. Recently there has been a significant rise in the number of species with fully sequenced genomes, which provides an opportunity to investigate and infer homologs with greater accuracy and in a more informed way. Phylogeny analysis explains the relationship between member genes of a gene family in a simple, graphical and plausible way using a tree representation. Bayesian phylogenetic inference is a probabilistic method used to infer gene phylogenies and posteriors of other evolutionary parameters. The Markov chain Monte Carlo (MCMC) algorithm, in particular with the Metropolis-Hastings sampling scheme, is the most commonly employed algorithm to determine the evolutionary history of genes. Many software packages are available that process results from each MCMC run and explore the parameter posterior, but there is a need for interactive software that can analyse both discrete and real-valued parameters, and which has convergence assessment and burn-in estimation diagnostics specifically designed for Bayesian phylogenetic inference.

In this thesis, a synteny-aware approach for gene homology inference, called GenFamClust (GFC), is proposed that uses gene content and gene order conservation to infer homology. The feature which distinguishes GFC from earlier homology inference methods is that local synteny has been combined with gene similarity to infer homologs, without inferring homologous regions. GFC was validated for accuracy on a simulated dataset. Gene families were computed by applying clustering algorithms to homologs inferred by GFC, and compared for accuracy, dependence and similarity with gene families inferred by other popular gene family inference methods on a eukaryotic dataset. Gene families in fungi obtained from GFC were evaluated against pillars from the Yeast Gene Order Browser. Genome-wide gene families for some eukaryotic species are computed using this approach.

Another topic addressed in this thesis is the processing of MCMC traces for Bayesian phylogenetic inference. We introduce a new software tool, VMCMC, which simplifies post-processing of MCMC traces. VMCMC can be used both as a GUI-based application and as a convenient command-line tool. VMCMC supports interactive exploration, is suitable for automated pipelines and can handle both real-valued and discrete parameters observed in an MCMC trace. We propose and implement joint burn-in estimators that are specifically applicable to Bayesian phylogenetic inference. These methods have been compared for similarity with some other popular convergence diagnostics. We show that Bayesian phylogenetic inference and VMCMC can be applied to infer valuable evolutionary information for a biological case: the evolutionary history of the FERM domain.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2016. p. viii, 65
Series
TRITA-CSC-A, ISSN 1653-5723 ; 2016:01
Keywords
Bayesian inference
HSV category
Research programme
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-181319
ISBN: 978-91-7595-849-1
Public defence
2016-02-25, Fire, Tomtebodavägen 23, 171 65, Solna, 14:00 (English)
Opponent
Supervisors
Note

QC 20160201

Available from: 2016-02-01 Created: 2016-01-31 Last updated: 2018-01-10 Bibliographically approved

Open Access in DiVA

No full text in DiVA


Search in DiVA

By author/editor
Ali, Raja Hashim; Bark, Mikael; Miro, Jorge; Muhammad, Sayyed Auwn; Sjöstrand, Joel; Zubair, Syed Muhammad; Arvestad, Lars