Orthogonal Mixture of Hidden Markov Models
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS.
Univ Western Ontario, Dept Stat & Actuarial Sci, London, ON, Canada.
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS. ORCID iD: 0000-0001-8382-0300
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST). KTH, Centres, Science for Life Laboratory, SciLifeLab. ORCID iD: 0000-0002-4552-0240
2021 (English). In: Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2020, Part I / [ed] Hutter, F.; Kersting, K.; Lijffijt, J.; Valera, I. Springer Nature, 2021, Vol. 12457, p. 509-525. Conference paper, Published paper (Refereed)
Abstract [en]

Mixtures of Hidden Markov Models (MHMM) are widely used for clustering of sequential data, by letting each cluster correspond to a Hidden Markov Model (HMM). Expectation Maximization (EM) is the standard approach for learning the parameters of an MHMM. However, due to the non-convexity of the objective function, EM can converge to poor local optima. To tackle this problem, we propose a novel method, the Orthogonal Mixture of Hidden Markov Models (oMHMM), which aims to direct the search away from candidate solutions that include very similar HMMs, since those do not fully exploit the power of the mixture model. The directed search is achieved by including a penalty in the objective function that favors higher orthogonality between the transition matrices of the HMMs. Experimental results on both simulated and real-world datasets show that the oMHMM consistently finds equally good or better local optima than the standard EM for an MHMM; for some datasets, the clustering performance is significantly improved by our novel oMHMM (up to 55 percentage points w.r.t. the v-measure). Moreover, the oMHMM may also decrease the computational cost substantially, reducing the number of iterations down to a fifth of those required by MHMM using standard EM.
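The orthogonality penalty described above can be illustrated with a minimal sketch. Here the (non-)orthogonality of two transition matrices is measured by their Frobenius inner product, so that identical matrices incur a high penalty and dissimilar ones a low penalty; the function name, the `weight` parameter, and the choice of pairwise summation are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def orthogonality_penalty(transition_matrices, weight=1.0):
    """Sum of pairwise Frobenius inner products between the HMMs'
    transition matrices. Smaller values indicate more orthogonal
    (i.e., more dissimilar) transition matrices; a term like this
    can be added to the EM objective to steer the search away from
    mixtures of near-identical HMMs."""
    penalty = 0.0
    K = len(transition_matrices)
    for i in range(K):
        for j in range(i + 1, K):
            penalty += np.sum(transition_matrices[i] * transition_matrices[j])
    return weight * penalty

# Two identical matrices are penalized more than two dissimilar ones.
A = np.array([[0.9, 0.1], [0.1, 0.9]])
B = np.array([[0.1, 0.9], [0.9, 0.1]])
same = orthogonality_penalty([A, A])  # high: matrices coincide
diff = orthogonality_penalty([A, B])  # low: rows are nearly orthogonal
```

Favoring a low penalty during the M-step pushes the component HMMs apart, which is the intuition behind directing EM away from mixtures whose components do not exploit the full power of the model.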

Place, publisher, year, edition, pages
Springer Nature, 2021. Vol. 12457, p. 509-525
Series
Lecture Notes in Artificial Intelligence, ISSN 0302-9743
Keywords [en]
Hidden markov models, Mixture models, Mixture of hidden markov models, Expectation maximization, Orthogonality, Regularization, Penalty
National Category
Probability Theory and Statistics
Identifiers
URN: urn:nbn:se:kth:diva-305551
DOI: 10.1007/978-3-030-67658-2_29
ISI: 000717522300029
Scopus ID: 2-s2.0-85103236317
OAI: oai:DiVA.org:kth-305551
DiVA, id: diva2:1616538
Conference
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), September 14-18, 2020, held online
Note

QC 20211203

Conference ISBN 978-3-030-67658-2; 978-3-030-67657-5

Available from: 2021-12-03. Created: 2021-12-03. Last updated: 2022-06-25. Bibliographically approved
In thesis
1. Novel likelihood-based inference techniques for sequential data with medical and biological applications
2022 (English)Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

The probabilistic approach is crucial in modern machine learning, as it provides transparency and quantification of uncertainty. This thesis is concerned with probabilistic building blocks, i.e., probabilistic graphical models (PGMs), combined with standard deterministic approximate inference, i.e., Expectation-Maximization (EM) and Variational Inference (VI). The contributions comprise improvements to EM-based parameter learning, most importantly novel probabilistic models, and new VI methodology for phylogenetic inference. Firstly, this thesis improves upon the vanilla EM algorithm for hidden Markov models (HMMs) and mixtures of HMMs (MHMMs). The proposed constrained EM algorithm for HMMs compensates for the lack of long-range context in HMMs. The two other proposed novel regularized EM algorithms provide better local optima for parameter learning of MHMMs, particularly in cancer analysis. The novel EM variants are mere modifications of the standard EM algorithm that add no extra complexity, unlike other modifications targeting the context and poor-local-optima issues. Secondly, this thesis introduces one novel and one augmented PGM, together with VI frameworks for robust and fast Bayesian inference. The first method, CopyMix, uses a single-phase framework to simultaneously provide clonal decomposition and copy number profiling of single-cell cancer data. In contrast to previous approaches, it does not pursue the two objectives in a sequential and ad-hoc fashion, which is prone to introducing artifacts and errors. The second method provides an augmented PGM with a faster framework for phylogenetic inference; specifically, a novel natural-gradient-based VI algorithm is devised. Regarding the cancer analysis, this thesis concludes that CopyMix is superior to MHMMs, even though the two novel EM algorithms proposed in this thesis partially improve the performance of clonal tumor decomposition.
The empirical support presented throughout this thesis confirms that the proposed likelihood-based methods and optimization tools provide opportunities for better analysis algorithms, particularly suited for cancer research.

Place, publisher, year, edition, pages
KTH Royal Institute of Technology, 2022. p. 173
Series
TRITA-EECS-AVL ; 2022:27
National Category
Other Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:kth:diva-311350 (URN)
Public defence
2022-05-13, https://kth-se.zoom.us/j/61141675492, Sal C, Electrum, Kistagången 16, Stockholm, 13:00 (English)
Note

QC 20220422

Available from: 2022-04-22. Created: 2022-04-22. Last updated: 2022-06-25. Bibliographically approved

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text; Scopus

Authority records

Safinianaini, Negar; Boström, Henrik; Lagergren, Jens

