Structured Representation Using Latent Variable Models
KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP. ORCID iD: 0000-0002-8640-9370
2016 (English). Doctoral thesis, monograph (Other academic)
Abstract [en]

Over the past two centuries, the industrial revolution automated a great part of the work that involved human muscles. Since the beginning of the 21st century, the focus has shifted towards automating work that involves our brains, to further improve our lives. This is pursued by establishing human-level intelligence in machines, which has led to the growth of the field of artificial intelligence. Machine learning is a core component of artificial intelligence. While artificial intelligence focuses on constructing an entire intelligent system, machine learning focuses on the ability to learn and to use the learned knowledge for different tasks. This thesis targets the field of machine learning, especially structured representation learning, which is key to various machine learning approaches.

Humans sense the environment, extract information and make action decisions based on abstracted information. Similarly, machines receive data, abstract information from the data through models and make decisions about the unknown through inference. Thus, models provide a mechanism for machines to abstract information. This commonly involves learning representations that are ideally compact, interpretable and useful for different tasks. The contribution of this thesis relates to the design of efficient representation models with latent variables. To make the models useful, efficient inference algorithms are derived to fit the models to data. We apply our models to applications from different domains, namely E-health, robotics, text mining, computer vision and recommendation systems.

The main contribution of this thesis relates to advancing latent variable models and deriving associated inference schemes for representation learning. This is pursued in three different directions. Firstly, through supervised models, where better representations can be learned when the task is known, corresponding to the situated knowledge of humans. Secondly, through structured representation models, in which different structures, such as factorized ones, are used in latent variable models to form more efficient representations. Finally, through non-parametric models, where the representation is determined completely by the data. Specifically, we propose several new models combining supervised learning and factorized representation, as well as a further model combining non-parametric modeling and supervised approaches. Evaluations show that these new models generally provide more efficient representations and a higher degree of interpretability.

Moreover, this thesis contributes by applying the proposed models in different practical scenarios, demonstrating that they can provide efficient latent representations. Experimental results show that our models improve performance on classical tasks, such as image classification and annotation, and robotic scene and action understanding. Most notably, one of our models is applied to a novel problem in E-health, namely diagnostic prediction using discomfort drawings. Experimental investigation shows that our model can achieve significant results in automatic diagnosis and provides a profound understanding of typical symptoms. This motivates novel decision support systems for healthcare personnel.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2016, 245 p.
TRITA-CSC-A, ISSN 1653-5723 ; 2016:18
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Research subject
Computer Science
URN: urn:nbn:se:kth:diva-191455
ISBN: 978-91-7729-080-3
OAI: diva2:956549
Public defence
2016-09-26, F3, Lindstedtsvägen 26, Stockholm, 13:00 (English)
Swedish Research Council

QC 20160905

Available from: 2016-09-05 Created: 2016-08-30 Last updated: 2016-10-26 Bibliographically approved

By author/editor
Zhang, Cheng