Fast pseudolikelihood maximization for direct-coupling analysis of protein structure from many homologous amino-acid sequences
2014 (English). In: Journal of Computational Physics, ISSN 0021-9991, E-ISSN 1090-2716, Vol. 276, 341-356 p. Article in journal (Refereed), Published
Direct-coupling analysis is a group of methods to harvest information about coevolving residues in a protein family by learning a generative model in an exponential family from data. In protein families of realistic size, this learning can only be done approximately, and there is a trade-off between inference precision and computational speed. We here show that an earlier introduced l2-regularized pseudolikelihood maximization method called plmDCA can be modified so as to be easily parallelizable, as well as inherently faster on a single processor, at negligible difference in accuracy. We test the new incarnation of the method on 143 protein family/structure pairs from the Protein Families database (PFAM), one of the larger tests of this class of algorithms to date.
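The abstract does not spell out the objective or the modification, so the following is only a minimal sketch of what an l2-regularized pseudolikelihood objective for a Potts model can look like when it is written one alignment column at a time. Because each column's objective depends only on that column's own parameters, the columns could in principle be optimized independently, which is one way such a method becomes easy to parallelize. The function name, the parameter layout (`h_r`, `J_r`), and the regularization strengths `lam_h` and `lam_J` are illustrative assumptions, not taken from the paper, and sequence reweighting is omitted.

```python
import math

def site_neg_log_pseudolikelihood(msa, r, h_r, J_r, lam_h=0.01, lam_J=0.01):
    """Regularized negative log-pseudolikelihood of column r (illustrative sketch).

    msa : list of sequences, each a list of integer states in 0..q-1
    h_r : list of q fields for site r
    J_r : J_r[i][a][b] couples state a at site r to state b at site i
          (the entry J_r[r] is ignored)
    lam_h, lam_J : hypothetical l2 regularization strengths
    """
    q = len(h_r)
    n_sites = len(msa[0])
    others = [i for i in range(n_sites) if i != r]

    nll = 0.0
    for seq in msa:
        # Unnormalized log-probability of each candidate state a at site r,
        # conditioned on the rest of the sequence.
        logits = [h_r[a] + sum(J_r[i][a][seq[i]] for i in others)
                  for a in range(q)]
        # Numerically stable log of the normalizing constant.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        nll -= logits[seq[r]] - log_z
    nll /= len(msa)

    # l2 penalty on the fields and on the couplings of this site only.
    penalty = lam_h * sum(x * x for x in h_r)
    penalty += lam_J * sum(J_r[i][a][b] ** 2
                           for i in others
                           for a in range(q)
                           for b in range(q))
    return nll + penalty
```

With all parameters set to zero, every state at site `r` is equally likely, so the function returns `log(q)` for any column. In practice each column's objective would be handed to a numerical optimizer, and coupling estimates obtained from the two columns of a pair would need to be combined afterwards; both steps are left out here.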
Place, publisher, year, edition, pages
2014. Vol. 276, 341-356 p.
Contact map, Direct-coupling analysis, Inference, Potts model, Protein structure prediction, Pseudolikelihood
Bioinformatics (Computational Biology)
Identifiers
URN: urn:nbn:se:kth:diva-152551
DOI: 10.1016/j.jcp.2014.07.024
ISI: 000341310100014
ScopusID: 2-s2.0-84905637666
OAI: oai:DiVA.org:kth-152551
DiVA: diva2:752712
QC 20141006
2014-10-06, 2014-09-29, 2014-10-06
Bibliographically approved