Asymptotically Exact and Fast Gaussian Copula Models for Imputation of Mixed Data Types
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. ORCID iD: 0000-0002-7182-1346
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. ORCID iD: 0000-0002-5750-9655
2021 (English). In: Proceedings of Machine Learning Research, ML Research Press, 2021, p. 870-885. Conference paper, Published paper (Refereed)
Abstract [en]

Missing values in data sets with mixed data types are a common problem in many machine learning applications, such as the processing of surveys and various medical applications. Recently, Gaussian copula models have been suggested as a means of imputing missing values within a probabilistic framework. While existing Gaussian copula models have been shown to yield state-of-the-art performance, they have two limitations: they rely on an approximation that is fast but may be imprecise, and they do not support unordered multinomial variables. We address the first limitation with direct and arbitrarily precise approximations, for both model estimation and imputation, based on randomized quasi-Monte Carlo procedures. The method we provide has lower errors for the estimated model parameters and the imputed values than previously proposed methods. We also extend the previous Gaussian copula models to include unordered multinomial variables, in addition to the existing support for ordinal, binary, and continuous variables.
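
As a rough illustration of the randomized quasi-Monte Carlo idea mentioned in the abstract, the sketch below estimates a rectangle probability of a bivariate Gaussian using scrambled Sobol points from SciPy. It is not the authors' algorithm or code: the function name `rqmc_rect_prob`, its parameters, and the simple indicator-based estimator are assumptions made for illustration only.

```python
import numpy as np
from scipy.stats import norm, qmc


def rqmc_rect_prob(lower, upper, rho, n_points=2**12, n_reps=8, seed=0):
    """Estimate P(lower < Z < upper) for a standard bivariate normal Z
    with correlation rho, using randomized (scrambled) Sobol points.

    Hypothetical helper for illustration; not taken from the paper.
    Returns the mean estimate and a standard error across replications.
    """
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    # Cholesky factor of the 2x2 correlation matrix.
    chol = np.linalg.cholesky(np.array([[1.0, rho], [rho, 1.0]]))

    estimates = []
    for rep in range(n_reps):
        # Each replication uses an independently scrambled Sobol sequence.
        sampler = qmc.Sobol(d=2, scramble=True, seed=seed + rep)
        u = sampler.random(n_points)
        # Keep points strictly inside (0, 1) so norm.ppf stays finite.
        u = np.clip(u, 1e-12, 1.0 - 1e-12)
        # Map uniforms to correlated standard normal draws.
        z = norm.ppf(u) @ chol.T
        inside = np.all((z > lower) & (z < upper), axis=1)
        estimates.append(inside.mean())

    estimates = np.array(estimates)
    std_err = estimates.std(ddof=1) / np.sqrt(n_reps)
    return estimates.mean(), std_err


if __name__ == "__main__":
    est, se = rqmc_rect_prob([0.0, 0.0], [np.inf, np.inf], rho=0.5)
    print(f"estimate = {est:.4f}, standard error = {se:.5f}")
```

For the positive orthant with correlation 0.5, the exact probability is 1/4 + arcsin(0.5)/(2π) = 1/3, which gives a reference value to compare the estimate against. Repeating the scrambling yields a standard error, and increasing `n_points` tightens the estimate, which is the sense in which such approximations can be made arbitrarily precise.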

Place, publisher, year, edition, pages
ML Research Press, 2021. p. 870-885
Keywords [en]
Gaussian Copulas, Imputation, Quasi-Monte Carlo, Gaussian distribution, Monte Carlo methods, Gaussian copula, Gaussian copula models, Imputation of missing values, Machine learning applications, Missing values, Mixed data types, Multinomials, Probabilistic framework, Medical applications
National Category
Probability Theory and Statistics
Identifiers
URN: urn:nbn:se:kth:diva-328826
Scopus ID: 2-s2.0-85140424576
OAI: oai:DiVA.org:kth-328826
DiVA, id: diva2:1766595
Conference
Conference on Machine Learning, ACML 2021
Note

QC 20230613

Available from: 2023-06-13. Created: 2023-06-13. Last updated: 2023-06-13. Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records

Christoffersen, Benjamin; Kjellström, Hedvig
