Explainable Artificial Neural Network for Recurrent Venous Thromboembolism Based on Plasma Proteomics
2021 (English). In: Computational Methods in Systems Biology: 19th International Conference, CMSB 2021, Bordeaux, France, September 22–24, 2021, Proceedings, Springer Science and Business Media Deutschland GmbH, 2021, p. 108-121. Conference paper, Published paper (Refereed)
Abstract [en]
Venous thromboembolism (VTE) is the third most common cardiovascular disease, affecting ∼1,000,000 individuals each year in Europe. VTE is characterized by an annual recurrence rate of ∼6%, and ∼30% of patients with unprovoked VTE will face a recurrent event after a six-month course of anticoagulant treatment. Although guidelines recommend lifelong treatment for these patients, ∼70% of them will never experience a recurrence and will therefore receive unnecessary lifelong anticoagulation, which is associated with an increased risk of bleeding and is highly costly for society. There is thus an urgent need to identify biomarkers that can distinguish VTE patients at high risk of recurrence from low-risk patients. Capitalizing on a sample of 913 patients followed up for the risk of VTE recurrence over a median of ∼10 years and profiled with 376 plasma proteomic antibodies, we develop an artificial neural network (ANN)-based strategy to identify a proteomic signature that helps discriminate between patients at low and high risk of recurrence. In a first stage, we implemented a Repeated Edited Nearest Neighbors algorithm to select a homogeneous sub-sample of VTE patients. This sub-sample was then split into a training set and a testing set. The former was used to train our ANN, the latter to test its discriminatory properties. In the testing dataset, our ANN achieved an accuracy of 0.86, compared with an accuracy of 0.79 for a random forest classifier. We then applied a Deep Learning Important FeaTures (DeepLIFT)-based approach to identify the variables that contribute the most to the ANN predictions. In addition to sex, the proposed DeepLIFT strategy identified 6 important proteins (DDX1, HTRA3, LRG1, MAST2, NFATC4 and STXBP5) whose exact roles in the etiology of VTE recurrence now deserve further experimental validation.
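The nearest-neighbour editing step mentioned in the abstract can be illustrated with a minimal, standard-library sketch. This is not the authors' implementation: the neighbourhood size `k=3`, the Euclidean distance, the majority-vote rule, and the toy two-dimensional data are all assumptions chosen for illustration, and the function names are hypothetical. Each pass drops every sample whose label disagrees with the majority label of its nearest neighbours, and passes repeat until nothing more is removed.

```python
import math
from collections import Counter


def enn_pass(X, y, k=3):
    """One editing pass: return the indices of samples to keep.

    A sample is kept when its label matches the majority label of its
    k nearest neighbours (Euclidean distance, the sample itself excluded).
    """
    keep = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        neighbours = sorted(
            (math.dist(xi, xj), j) for j, xj in enumerate(X) if j != i
        )[:k]
        if not neighbours:  # a lone sample has nothing to disagree with
            keep.append(i)
            continue
        votes = Counter(y[j] for _, j in neighbours)
        majority_label = votes.most_common(1)[0][0]
        if majority_label == yi:
            keep.append(i)
    return keep


def repeated_enn(X, y, k=3, max_iter=100):
    """Repeat the editing pass until no sample is removed (or max_iter)."""
    X, y = list(X), list(y)
    for _ in range(max_iter):
        keep = enn_pass(X, y, k)
        if len(keep) == len(X):
            break
        X = [X[i] for i in keep]
        y = [y[i] for i in keep]
    return X, y


# Toy data (assumed, not from the study): two well-separated clusters,
# plus one class-1 point sitting inside the class-0 cluster.
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.15, 0.15),
     (5.0, 5.0), (5.1, 5.2), (5.2, 5.1), (0.05, 0.1)]
y = [0, 0, 0, 0, 1, 1, 1, 1]

X_clean, y_clean = repeated_enn(X, y, k=3)
print(len(X), "->", len(X_clean))  # 8 -> 7
```

In this toy run only the stray class-1 point is removed, which is the sense in which the retained sub-sample is more homogeneous. Library implementations typically expose further options (for example, restricting removal to selected classes), which this sketch omits.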
Place, publisher, year, edition, pages
Springer Science and Business Media Deutschland GmbH, 2021. p. 108-121
Keywords [en]
Artificial neural network, Imbalanced, Interpretation, Proteomics, Thrombosis, Classification (of information), Decision trees, Molecular biology, Patient treatment, Proteins, Recurrent neural networks, Statistical tests, Anti-coagulation, Bleedings, Cardiovascular disease, Important features, Recurrent events, Sub-samples, Venous thromboembolism, Diseases
National Category
Cardiology and Cardiovascular Disease; Hematology
Identifiers
URN: urn:nbn:se:kth:diva-312047
DOI: 10.1007/978-3-030-85633-5_7
ISI: 001351063100007
Scopus ID: 2-s2.0-85116070205
OAI: oai:DiVA.org:kth-312047
DiVA, id: diva2:1658236
Conference
International Conference on Computational Methods in Systems Biology, 22 September 2021 through 24 September 2021
Note
Part of proceedings: ISBN 978-3-030-85632-8
QC 20220516
2022-05-16, 2022-05-16, 2025-12-05. Bibliographically approved