Statistical models of TF/DNA interaction
KTH, School of Computer Science and Communication (CSC), Computational Biology, CB.
2008 (English). Licentiate thesis, comprehensive summary (Other scientific)
Abstract [en]

Gene expression is regulated in response to metabolic necessities and environmental changes throughout the life of a cell.

A major part of this regulation is governed at the level of transcription, deciding whether messengers to specific genes are produced or not.

This decision is triggered by the action of transcription factors, proteins which interact with specific sites on DNA and thus influence the rate of transcription of proximal genes.

Mapping the organisation of these transcription factor binding sites sheds light on potential causal relations between genes and is the key to establishing networks of genetic interactions, which determine how the cell adapts to external changes.

In this work I briefly review the basics of genetics and summarise popular approaches to describing transcription factor binding sites, from the most straightforward to a biophysically motivated representation based on the estimation of free energies of molecular interactions.

Two articles on transcription factors are contained in this thesis, one published (Aurell, Fouquier d'Hérouël, Malmnäs and Vergassola, 2007) and one submitted (Fouquier d'Hérouël, 2008).

Both rely strongly on the representation of binding sites by matrices accounting for the affinity of the proteins to specific nucleotides at the different positions of the binding sites.
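
As an illustration of such a matrix representation, the following minimal sketch scores a candidate site by summing one matrix entry per position. The matrix values, the 4-bp site length, and the names `ENERGY_MATRIX` and `site_energy` are assumptions made for this example only; they are not taken from the thesis or the appended articles. Lower values stand for stronger binding here.

```python
# Hypothetical energy matrix for a 4-bp site: one row per site position,
# one column per nucleotide (order A, C, G, T). The numbers are invented
# for illustration (arbitrary units), not measured or published values.
BASES = "ACGT"
ENERGY_MATRIX = [
    [0.0, 2.0, 1.5, 2.5],  # position 1: A is the preferred base
    [2.0, 0.0, 2.5, 1.0],  # position 2: C is the preferred base
    [1.5, 2.5, 0.0, 2.0],  # position 3: G is the preferred base
    [2.5, 1.0, 2.0, 0.0],  # position 4: T is the preferred base
]


def site_energy(site, matrix=ENERGY_MATRIX):
    """Sum the per-position contributions of one candidate site.

    Assumes that positions contribute independently (additively).
    """
    if len(site) != len(matrix):
        raise ValueError("site length must match the number of matrix rows")
    return sum(row[BASES.index(base)] for row, base in zip(matrix, site))


if __name__ == "__main__":
    for candidate in ("ACGT", "CCGT", "TTTT"):
        print(candidate, site_energy(candidate))
```

The additive form is the simplest choice: each position is looked up independently, so a single mismatch changes the total by exactly one matrix entry.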

The importance of non-specific binding of transcription factors to DNA is briefly addressed in the text and extensively discussed in the first appended article:

In a study on the affinity of yeast transcription factors for their binding sites, we conclude that protein concentrations measured in vivo are marginally sufficient to guarantee the occupation of functional sites, as opposed to non-specific locations on the genomic sequence.

Since the inference of binding site motifs is a common task, the most common statistical method is reviewed in detail; building on it, I constructed an alternative, biophysically motivated approach, exemplified in the second appended article.

Place, publisher, year, edition, pages
Stockholm: KTH, 2008. x, 41 p.
Trita-CSC-A, ISSN 1653-5723 ; 2008:01
Keyword [en]
gene expression, regulation, transcription factor, binding motif, matrix representations, gibbs sampling, binding affinity, non-specific binding
National Category
Condensed Matter Physics
URN: urn:nbn:se:kth:diva-4633
ISBN: 978-91-7178-874-0
OAI: diva2:13168
2008-02-29, RB 35, Roslagstullsbacken 35, Stockholm, 13:00
QC 20101110
Available from: 2008-02-12 Created: 2008-02-12 Last updated: 2010-11-10 Bibliographically approved
List of papers
1. Transcription factor concentrations versus binding site affinities in the yeast S. cerevisiae
2007 (English). In: Physical Biology, ISSN 1478-3975, Vol. 4, no 2, 134-143 p. Article in journal (Refereed) Published
Abstract [en]

Transcription regulation is largely governed by the profile and the dynamics of transcription factors' binding to DNA. Stochastic effects are intrinsic to this dynamics, and the binding to functional sites must be controlled with a certain specificity for living organisms to be able to elicit specific cellular responses. Specificity stems here from the interplay between binding affinity and cellular abundance of transcription factor proteins, and the binding of such proteins to DNA is thus controlled by their chemical potential. We combine large-scale protein abundance data in the budding yeast with binding affinities for all transcription factors with known DNA binding site sequences to assess the behavior of their chemical potentials in an exponential growth phase. A sizable fraction of transcription factors is apparently bound non-specifically to DNA, and the observed abundances are marginally sufficient to ensure high occupations of the functional sites. We argue that a biological cause of this feature is related to its noise-filtering consequences: abundances below physiological levels do not yield significant binding of functional targets and mis-expressions of regulated genes may thus be tamed.
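
The role of the chemical potential can be illustrated with a standard two-state occupancy formula, p = 1 / (1 + exp((E - mu) / kT)). This record does not state the exact expression used in the article, so the formula, the function name `occupancy`, and the numbers below are assumptions for illustration only.

```python
import math


def occupancy(energy, mu, kT=1.0):
    """Two-state occupancy of a site with binding energy `energy`.

    Assumed form: p = 1 / (1 + exp((E - mu) / kT)), where `mu` is the
    chemical potential of the transcription factor. Lower energy means
    stronger binding; all quantities share the same (arbitrary) units.
    """
    return 1.0 / (1.0 + math.exp((energy - mu) / kT))


if __name__ == "__main__":
    # Hypothetical energies: a strong functional site and a weak one.
    for mu in (-4.0, -2.0, 0.0):
        strong = occupancy(-3.0, mu)
        weak = occupancy(1.0, mu)
        print(f"mu={mu:+.1f}  strong site: {strong:.3f}  weak site: {weak:.3f}")
```

In this form a site is half occupied when its energy equals the chemical potential, which is one way to picture why an abundance that places mu only slightly above the functional site energies is "marginally sufficient".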

National Category
Condensed Matter Physics
urn:nbn:se:kth:diva-7972 (URN)
10.1088/1478-3975/4/2/006 (DOI)
000247779500007 ()
2-s2.0-34447279073 (ScopusID)
QC 20101005. Former title: "Transcription factor concentrations versus binding site affinities in the yeast S. cerevisiae. Numbers and affinity".
Available from: 2008-02-12 Created: 2008-02-12 Last updated: 2011-03-17 Bibliographically approved
2. QPS - quadratic programming sampler: a motif finder using biophysical modeling
2008 (English). Article in journal (Other academic) Submitted
Abstract [en]

We present a Markov chain Monte Carlo algorithm for local alignments of nucleotide sequences that aims to infer putative transcription factor binding sites, referred to as the quadratic programming sampler. The new motif finder incorporates detailed biophysical modeling of transcription factor binding site recognition, which gives rise to an intrinsic threshold discriminating putative binding sites from other (background) sequences.

We validate the basic functioning of the algorithm on a sample of four promoter regions from Escherichia coli. The resulting description of the motif can be readily evaluated on the whole genome to identify new putative binding sites.
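
Evaluating a motif description along a longer sequence can be sketched as a sliding-window scan against a threshold. This is not the quadratic programming sampler itself, which this record does not describe in detail; the function `scan`, the matrix `TOY_MATRIX`, and the threshold value are hypothetical and chosen only to show the idea.

```python
BASES = "ACGT"

# Hypothetical 3-bp energy matrix (rows: positions, columns: A, C, G, T).
# Invented values; the only zero-energy site is "ACG".
TOY_MATRIX = [
    [0.0, 2.0, 2.0, 2.0],
    [2.0, 0.0, 2.0, 2.0],
    [2.0, 2.0, 0.0, 2.0],
]


def scan(sequence, matrix, threshold):
    """Return (offset, site, energy) for every window scoring below threshold.

    Each window is scored additively, one matrix entry per position.
    Characters other than A, C, G, T raise ValueError.
    """
    width = len(matrix)
    hits = []
    for offset in range(len(sequence) - width + 1):
        site = sequence[offset:offset + width]
        energy = sum(row[BASES.index(b)] for row, b in zip(matrix, site))
        if energy < threshold:
            hits.append((offset, site, energy))
    return hits


if __name__ == "__main__":
    # The threshold is a free parameter in this sketch.
    for hit in scan("TTACGTT", TOY_MATRIX, threshold=1.0):
        print(hit)
```

Only the forward strand is scanned here; a fuller treatment would also score the reverse complement of each window.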

Keyword [en]
Transcription Factor Protein, Binding Site Inference, Energy Matrix, MCMC
National Category
Condensed Matter Physics
urn:nbn:se:kth:diva-7973 (URN)
QS 20120314
Available from: 2008-02-12 Created: 2008-02-12 Last updated: 2012-03-14 Bibliographically approved

By author/editor
Fouquier d'Herouel, Aymeric