A fragment library based on Gaussian mixtures predicting favorable molecular interactions
2001 (English). In: Journal of Molecular Biology, ISSN 0022-2836, E-ISSN 1089-8638, Vol. 313, no 1, p. 197-214. Article in journal (Refereed), Published
Here, a protein atom-ligand fragment interaction library is described. The library is based on experimentally solved structures of protein-ligand and protein-protein complexes deposited in the Protein Data Bank (PDB), and it is able to characterize binding sites given a ligand structure suitable for a protein. A set of 30 ligand fragment types was defined, each including three or more atoms in order to unambiguously define a frame of reference for interactions of ligand atoms with their receptor proteins. Interactions between ligand fragments and 24 classes of protein target atoms plus a water oxygen atom were collected and segregated according to type. The spatial distributions of individual fragment-target atom pairs were visually inspected in order to obtain rough-grained constraints on the interaction volumes. Data fulfilling these constraints were given as input to an iterative expectation-maximization algorithm that produces as output maximum likelihood estimates of the parameters of the finite Gaussian mixture models. Concepts of statistical pattern recognition and the resulting mixture model densities are used (i) to predict the detailed interactions between Chlorella virus DNA ligase and the adenine ring of its ligand and (ii) to evaluate the "error" in prediction for both the training and validation sets of protein-ligand interactions found in the PDB. These analyses demonstrate that this approach can successfully narrow down the possibilities for both the interacting protein atom type and its location relative to a ligand fragment.
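The expectation-maximization step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`log_gaussian`, `farthest_point_init`, `fit_gmm`), the farthest-point initialization, the small covariance regularizer `reg`, and the synthetic 3D points standing in for target-atom positions in a fragment's frame of reference are all assumptions made for illustration.

```python
import numpy as np

def log_gaussian(X, mean, cov):
    """Log density of a multivariate normal evaluated at each row of X."""
    d = X.shape[1]
    chol = np.linalg.cholesky(cov)
    z = np.linalg.solve(chol, (X - mean).T)      # whitened residuals, shape (d, n)
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + np.sum(z ** 2, axis=0))

def farthest_point_init(X, k):
    """Hypothetical deterministic seeding: spread k centers far apart."""
    idx = [int(np.argmax(np.linalg.norm(X - X.mean(axis=0), axis=1)))]
    for _ in range(k - 1):
        d2 = ((X[:, None, :] - X[idx][None, :, :]) ** 2).sum(axis=2).min(axis=1)
        idx.append(int(np.argmax(d2)))
    return X[idx].copy()

def fit_gmm(X, k, n_iter=50, reg=1e-6):
    """Fit a k-component full-covariance Gaussian mixture by EM."""
    n, d = X.shape
    centers = farthest_point_init(X, k)
    # Start from hard assignments to the nearest seed center.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    resp = np.eye(k)[np.argmin(d2, axis=1)]
    log_liks = []
    for _ in range(n_iter):
        # M-step: maximum likelihood weights, means, covariances given resp.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / nk.sum()
        means = (resp.T @ X) / nk[:, None]
        covs = np.empty((k, d, d))
        for j in range(k):
            diff = X - means[j]
            covs[j] = (resp[:, j, None] * diff).T @ diff / nk[j] + reg * np.eye(d)
        # E-step: posterior probability of each component for each point.
        log_p = np.stack(
            [np.log(weights[j]) + log_gaussian(X, means[j], covs[j]) for j in range(k)],
            axis=1,
        )
        log_norm = np.logaddexp.reduce(log_p, axis=1)
        resp = np.exp(log_p - log_norm[:, None])
        log_liks.append(float(log_norm.sum()))
    return weights, means, covs, log_liks

# Synthetic stand-in data (NOT PDB data): two clouds of 3D positions.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.5, size=(200, 3)),
    rng.normal(5.0, 0.5, size=(200, 3)),
])
weights, means, covs, log_liks = fit_gmm(X, k=2)
```

In this sketch the returned `log_liks` list records the data log-likelihood after each iteration, which gives a simple convergence check; choosing the number of components is a separate model-selection question that the sketch does not address.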
Keywords: protein-ligand recognition, prior and conditional probabilities, Bayes' theorem, Gaussian mixture model, expectation-maximization algorithm, protein-ligand interactions, hydrogen-bonding regions, directed drug design, binding sites, stochastic complexity, scoring function, probe groups, LUDI, information, positions
Identifiers: URN: urn:nbn:se:kth:diva-21056; ISI: 000171816800015; OAI: oai:DiVA.org:kth-21056; DiVA: diva2:339753
QC 20100525; 2010-08-10; 2010-08-10; Bibliographically approved