Motif Yggdrasil: Sampling sequence motifs from a tree mixture model
KTH, School of Computer Science and Communication (CSC), Computational Biology, CB.
2007 (English). In: Journal of Computational Biology, ISSN 1066-5277, E-ISSN 1557-8666, Vol. 14, no 5, 682-697 p. Article in journal (Refereed), Published
Abstract [en]

In phylogenetic footprinting, putative regulatory elements are found in upstream regions of orthologous genes by searching for common motifs. Motifs in different upstream sequences are subject to mutations along the edges of the corresponding phylogenetic tree; consequently, taking advantage of the tree in the motif search is an appealing idea. We describe the Motif Yggdrasil sampler, the first Gibbs sampler based on a general tree that uses unaligned sequences. Previous tree-based Gibbs samplers have assumed a star-shaped tree or partially aligned upstream regions. We give a probabilistic model (the MY model) describing upstream sequences with regulatory elements and build a Gibbs sampler with respect to this model. The model allows toggling, i.e., the restriction of a position to a subset of nucleotides, but requires neither aligned sequences nor edge lengths, which may be difficult to come by. We apply the collapsing technique to eliminate the need to sample nuisance parameters, and give a derivation of the predictive update formula. We show that the MY model improves the modeling of difficult motif instances and that the use of the tree achieves a substantial increase in nucleotide-level correlation coefficient, both for synthetic data and for 37 bacterial lexA genes. We investigate the sensitivity to errors in the tree and show that, when using random trees, the MY sampler still has a performance similar to the original version.
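The abstract reports its results as a nucleotide-level correlation coefficient. As a minimal sketch, assuming the commonly used definition of this metric (the Pearson correlation between per-nucleotide true and predicted site labels, computed from true/false positive and negative counts), it could be evaluated as follows. The function name `nucleotide_cc`, its parameters, and the restriction to at most one site per sequence are illustrative assumptions, not details taken from the paper.

```python
from math import sqrt


def nucleotide_cc(true_sites, predicted_sites, seq_lengths, motif_width):
    """Nucleotide-level correlation coefficient (illustrative sketch).

    true_sites, predicted_sites: one 0-based start position per sequence,
    or None when the sequence has no (predicted) site.
    seq_lengths: length of each sequence.
    motif_width: width of the motif in nucleotides.
    """
    tp = fp = fn = tn = 0
    for true_start, pred_start, length in zip(true_sites, predicted_sites, seq_lengths):
        # Expand each site into the set of nucleotide positions it covers.
        true_pos = set() if true_start is None else set(
            range(true_start, true_start + motif_width))
        pred_pos = set() if pred_start is None else set(
            range(pred_start, pred_start + motif_width))
        tp += len(true_pos & pred_pos)           # covered by both
        fp += len(pred_pos - true_pos)           # predicted only
        fn += len(true_pos - pred_pos)           # true only
        tn += length - len(true_pos | pred_pos)  # covered by neither
    denom = sqrt((tp + fn) * (tn + fp) * (tp + fp) * (tn + fn))
    # The coefficient is undefined when a marginal count is zero; return 0.0 then.
    return (tp * tn - fn * fp) / denom if denom else 0.0
```

For example, `nucleotide_cc([2, 5], [2, 5], [20, 20], 4)` scores a perfect prediction, while a predicted site shifted by two positions against a width-4 true site scores lower because only half of its nucleotides overlap.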

Place, publisher, year, edition, pages
2007. Vol. 14, no 5, 682-697 p.
Keyword [en]
Gibbs sampling, phylogenetic footprinting, regulatory element, transcription factor binding site identification, probabilistic modeling, factor-binding sites, regulatory elements, evolution, algorithms, discovery, alignment, matrices
URN: urn:nbn:se:kth:diva-16779
DOI: 10.1089/cmb.2007.R010
ISI: 000247927100011
ScopusID: 2-s2.0-34447273158
OAI: diva2:334822
QC 20100525
Available from: 2010-08-05 Created: 2010-08-05 Last updated: 2010-09-20 Bibliographically approved


By author/editor
Lagergren, Jens