Riemannian geometry for efficient analysis of protein dynamics data
2024 (English). In: Proceedings of the National Academy of Sciences of the United States of America, ISSN 0027-8424, E-ISSN 1091-6490, Vol. 121, no 33, article id e2318951121. Article in journal (Refereed). Published
Abstract [en]
An increasingly common viewpoint is that protein dynamics datasets reside in a nonlinear subspace of low conformational energy. Ideal data analysis tools should therefore account for such nonlinear geometry. The Riemannian geometry setting can be suitable for a variety of reasons. First, it comes with a rich mathematical structure to account for a wide range of geometries that can be modeled after an energy landscape. Second, many standard data analysis tools developed for data in Euclidean space can be generalized to Riemannian manifolds. In the context of protein dynamics, a conceptual challenge comes from the lack of guidelines for constructing a smooth Riemannian structure based on an energy landscape. In addition, computational feasibility in computing geodesics and related mappings poses a major challenge. This work considers these challenges. The first part of the paper develops a local approximation technique for computing geodesics and related mappings on Riemannian manifolds in a computationally feasible manner. The second part constructs a smooth manifold and a Riemannian structure that is based on an energy landscape for protein conformations. The resulting Riemannian geometry is tested on several data analysis tasks relevant for protein dynamics data. In particular, the geodesics with given start- and end-points approximately recover corresponding molecular dynamics trajectories for proteins that undergo relatively ordered transitions with medium-sized deformations. The Riemannian protein geometry also gives physically realistic summary statistics and retrieves the underlying dimension even for large-sized deformations within seconds on a laptop.
Place, publisher, year, edition, pages
Proceedings of the National Academy of Sciences, 2024. Vol. 121, no 33, article id e2318951121
Keywords [en]
dimension reduction, interpolation, manifold-valued data, protein dynamics, Riemannian manifold
National Category
Geometry; Computational Mathematics
Identifiers
URN: urn:nbn:se:kth:diva-366654; DOI: 10.1073/pnas.2318951121; PubMedID: 39121160; Scopus ID: 2-s2.0-85201064425; OAI: oai:DiVA.org:kth-366654; DiVA, id: diva2:1982785
Note
QC 20250708
2025-07-08 2025-07-08 2025-07-08 Bibliographically approved