A Structural EM Algorithm for Learning Oncogenetic Networks by Reducing to MILP
(English)Manuscript (preprint) (Other academic)
Data that is obtained from assaying cancer tumors is cross-sectional, i.e., it does not contain information about temporal order and patterns of accumulation of mutations in the tumors. Learning progression patterns of cancer is important for understanding the disease. Also, in a realistic model of cancer progression, the issue of experimental errors must be taken into account. Here, the experimental errors are modeled by introducing hidden variables. The well-know structural EM algorithm is used for learning Bayesian networks from incomplete data. The selection of parents in the E-step of this algorithm is usually performed using a greedy heuristics. Unfortunately, the E-step also involves making inference in the present Bayesian network, which is #P-complete. There are ecient algorithms for performing exact inference in bounded tree-width Bayesian networks. In order to use them, we developed an algorithm for learning bounded tree-width Bayesian networks . In the E-step, we obtain a globally optimal solution over dependence structure of bounded treewidth and parameters. That is, we obtain a Global Structural EM algorithm for this problem. Finally, we test our algorithm both on synthetic data and cancer data from renal cell carcinoma and show that it also performs well in practice.
Expectation maximisation, Bayesian networks, MILP, Cancer
Bioinformatics and Systems Biology
IdentifiersURN: urn:nbn:se:kth:diva-121703OAI: oai:DiVA.org:kth-121703DiVA: diva2:619428
QS 20132013-05-032013-05-032013-05-03Bibliographically approved