Motif Yggdrasil: Sampling from a tree mixture model
2006 (English). In: Research In Computational Molecular Biology, Proceedings / [ed] Apostolico, A; Guerra, C; Istrail, S; Pevzner, P; Waterman, M, 2006, Vol. 3909, 458-472 p. Conference paper (Refereed)
In phylogenetic footprinting, putative regulatory elements are found in upstream regions of orthologous genes by searching for common motifs. Motifs in different upstream sequences are subject to mutations along the edges of the corresponding phylogenetic tree; consequently, taking advantage of the tree in the motif search is an appealing idea. We describe the Motif Yggdrasil sampler, the first Gibbs sampler based on a general tree that uses unaligned sequences. Previous tree-based Gibbs samplers have assumed a star-shaped tree or partially aligned upstream regions. We give a probabilistic model describing upstream sequences with regulatory elements and build a Gibbs sampler with respect to this model. We apply the collapsing technique to eliminate the need to sample nuisance parameters, and give a derivation of the predictive update formula. The use of the tree achieves a substantial increase in nucleotide-level correlation coefficient for both synthetic data and 37 bacterial lexA genes.
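The evaluation metric named in the abstract can be illustrated with a short sketch. It assumes the commonly used definition of the nucleotide-level correlation coefficient, namely the Matthews correlation computed over individual sequence positions; the record itself does not state the formula, and the function name `nucleotide_cc` and the half-open `(start, end)` site encoding are hypothetical choices made for this example.

```python
from math import sqrt


def nucleotide_cc(true_sites, predicted_sites, seq_length):
    """Nucleotide-level correlation coefficient for one sequence.

    Assumed definition (not taken from the record): the Matthews
    correlation over positions, with sites given as half-open
    (start, end) intervals on a sequence of length seq_length.
    """
    def covered(sites):
        # Collect every position that falls inside at least one site.
        positions = set()
        for start, end in sites:
            positions.update(range(start, end))
        return positions

    truth = covered(true_sites)
    pred = covered(predicted_sites)

    tp = len(truth & pred)            # positions correctly predicted as motif
    fp = len(pred - truth)            # predicted motif, actually background
    fn = len(truth - pred)            # true motif positions that were missed
    tn = seq_length - tp - fp - fn    # background correctly left unmarked

    denom = sqrt((tp + fn) * (tn + fp) * (tp + fp) * (tn + fn))
    # Convention assumed here: return 0.0 when the coefficient is undefined.
    return (tp * tn - fn * fp) / denom if denom else 0.0
```

Under this definition a prediction that exactly matches the annotated sites scores 1, and a prediction with no sites at all is scored 0 by the convention chosen above.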
Place, publisher, year, edition, pages
2006. Vol. 3909, 458-472 p.
Lecture Notes in Computer Science, ISSN 0302-9743 ; 3909
Bioinformatics (Computational Biology)
Identifiers
URN: urn:nbn:se:kth:diva-42028
DOI: 10.1007/11732990_39
ISI: 000236991800039
ScopusID: 2-s2.0-33745769012
ISBN: 3-540-33295-2
OAI: oai:DiVA.org:kth-42028
DiVA: diva2:446077
10th Annual International Conference on Research in Computational Molecular Biology Location: Venice, Italy, Date: APR 02-05, 2006
QC 20111006. 2011-10-06; 2011-10-05; 2011-10-06. Bibliographically approved