Breaking the Dimensionality Curse of Voronoi Tessellations
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent Systems, Robotics, Perception and Learning, RPL. ORCID iD: 0000-0001-9805-0388
2022 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Considering the breadth of the field of artificial intelligence, interpretations of its underlying methodologies can commonly be narrowed down to either a probabilistic or a geometric point of view. This separation is especially prevalent in more classical, "pre-neural-network" machine learning, for instance when comparing Bayesian modelling with more deterministic models such as nearest neighbors.

The research conducted in this dissertation is based on the study and development of geometrically founded methodologies, drawn from the areas of computational topology and geometry and applied to machine learning tasks. The work is tied together by the analysis of natural neighborhood partitions of data spaces known as Voronoi tessellations, as well as their dual Delaunay tessellations. One of the fundamental issues that arise when working with these constructs is their overwhelming geometric complexity in the high-dimensional spaces so abundant in modern machine learning tasks. We present a class of techniques designed to alleviate these constraints and to work with Voronoi tessellations implicitly, in arbitrary dimensions, without their explicit construction. The techniques are based on a ray casting procedure, a widespread methodology with roots in stochastic geometry and computer graphics. We present applications of such decompositions to common machine learning problems, such as classification, density estimation and active interpolation, review methods of Delaunay graph approximation, and include a more general discussion of high-dimensional geometric analysis.
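
The geometric primitive underlying this implicit, ray-casting view can be illustrated with a minimal sketch (written for this summary, not taken from the thesis; the function name `voronoi_boundary_hit` is illustrative): a ray cast from a generator point leaves its Voronoi cell where it first crosses a bisector hyperplane with another generator, and that crossing has a closed form that never requires constructing the tessellation.

```python
import numpy as np

def voronoi_boundary_hit(points, i, direction):
    """Distance from the generator points[i] to the boundary of its Voronoi cell
    along the unit vector `direction`, found without building the tessellation:
    the ray points[i] + t * direction crosses the bisector between points[i]
    and points[j] at t = ||x_j - x_i||^2 / (2 <direction, x_j - x_i>); the cell
    boundary is the smallest positive such t.  Returns (t, j), with t = inf if
    the cell is unbounded in this direction."""
    diffs = points - points[i]                         # x_j - x_i for every j
    denom = 2.0 * diffs @ direction                    # 2 <u, x_j - x_i>
    with np.errstate(divide="ignore", invalid="ignore"):
        t = np.where(denom > 0, np.einsum("ij,ij->i", diffs, diffs) / denom, np.inf)
    t[i] = np.inf                                      # ignore the generator itself
    j = int(np.argmin(t))
    return t[j], j

# Example: 1000 generators in 50 dimensions, one random ray from generator 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 50))
u = rng.normal(size=50)
u /= np.linalg.norm(u)
dist, neighbor = voronoi_boundary_hit(X, 0, u)
```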

Abstract [sv]

Considering the scope of the field of artificial intelligence, the division of the underlying methods can often be narrowed down to either a probabilistic or a geometric point of view. Such a separation is particularly widespread in more classical, "pre-neural-network" machine learning, if one compares Bayesian modelling with more deterministic models such as k-nearest neighbors.

The research conducted in the thesis builds on the study and development of geometrically founded methods, drawn from the areas of computational topology and geometry and applied to machine learning tasks. The work is tied together by the analysis of natural neighborhood partitions of the data space known as Voronoi tessellations, as well as their dual Delaunay tessellations. A fundamental problem that arises when working with these constructions is their overwhelmingly high geometric complexity in high dimensions, which are very common in modern machine learning tasks. We present a group of related techniques designed to alleviate these constraints and make it possible to work with Voronoi tessellations implicitly in arbitrary dimensions without constructing them explicitly. The techniques are based on a ray casting procedure, a widespread methodology that has taken hold in areas such as stochastic geometry and computer graphics. We present applications of such decompositions to common machine learning problems, e.g. classification, density estimation and active interpolation. We also review methods for approximating Delaunay graphs and provide a more general discussion of the topic of high-dimensional geometric analysis.

Place, publisher, year, edition, pages
KTH Royal Institute of Technology, 2022, p. 53
Series
TRITA-EECS-AVL ; 2022:62
Keywords [en]
geometric methods, machine learning methods, Voronoi, Delaunay, high dimensional geometry, curse of dimensionality, monte carlo
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-319984
ISBN: 978-91-8040-376-4 (print)
OAI: oai:DiVA.org:kth-319984
DiVA id: diva2:1703214
Public defence
2022-11-07, https://kth-se.zoom.us/j/69595665882, F3 Lindstedtsvägen 26, Stockholm, 15:00 (English)
Opponent
Supervisors
Note

QC 20221017

Available from: 2022-10-17. Created: 2022-10-12. Last updated: 2022-10-18. Bibliographically approved.
List of papers
1. Voronoi Boundary Classification: A High-Dimensional Geometric Approach via Weighted Monte Carlo Integration
2019 (English). In: 36th International Conference on Machine Learning, ICML 2019, International Machine Learning Society (IMLS), 2019, Vol. 2019, p. 9024-9035. Conference paper, Published paper (Refereed)
Abstract [en]

Voronoi cell decompositions provide a classical avenue to classification. Typical approaches, however, only utilize point-wise cell-membership information by means of nearest neighbor queries and do not utilize further geometric information about Voronoi cells, since the computation of Voronoi diagrams is prohibitively expensive in high dimensions. We propose a Monte Carlo integration based approach that instead computes a weighted integral over the boundaries of Voronoi cells, thus incorporating additional information about the Voronoi cell structure. We demonstrate the scalability of our approach in up to 3072-dimensional spaces and analyze convergence based on the number of Monte Carlo samples and choice of weight functions. Experiments comparing our approach to Nearest Neighbors, SVM and Random Forests indicate that while our approach performs similarly to Random Forests for large data sizes, the algorithm exhibits non-trivial data-dependent performance characteristics for smaller datasets and can be analyzed in terms of a geometric confidence measure, thus adding to the repertoire of geometric approaches to classification while having the benefit of not requiring any model changes or retraining as new training samples or classes are added.
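
As a rough illustration of the boundary-weighting idea, the following hypothetical sketch (not the authors' implementation; the actual weighting of the Monte Carlo boundary integral is specified in the paper) casts random rays from a query point, records which training point's bisector each ray hits first, and accumulates class votes weighted by the hit distance.

```python
import numpy as np

def vbc_predict(X, y, query, n_rays=256, weight=lambda d: np.exp(-d), rng=None):
    """Hypothetical sketch of Voronoi-boundary voting: each random ray from
    `query` exits the query's own Voronoi cell through the bisector with some
    training point X[j]; that point's class receives a vote weighted by the
    distance at which the boundary is hit."""
    rng = rng or np.random.default_rng()
    diffs = X - query                             # x_j - q for every training point
    sq = np.einsum("ij,ij->i", diffs, diffs)      # ||x_j - q||^2
    votes = {}
    for _ in range(n_rays):
        u = rng.normal(size=X.shape[1])
        u /= np.linalg.norm(u)
        denom = 2.0 * diffs @ u
        with np.errstate(divide="ignore", invalid="ignore"):
            t = np.where(denom > 0, sq / denom, np.inf)
        j = int(np.argmin(t))
        if np.isfinite(t[j]):
            votes[y[j]] = votes.get(y[j], 0.0) + weight(t[j])
    return max(votes, key=votes.get) if votes else None
```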

Place, publisher, year, edition, pages
International Machine Learning Society (IMLS), 2019
Series
Proceedings of Machine Learning Research, ISSN 2640-3498
Keywords
Geometry, Cytology, Machine learning, Monte Carlo methods, Nearest neighbor search, Membership information, Monte Carlo integration, Nearest neighbor queries, Classification
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:kth:diva-268567 (URN)
000684034305032 (ISI)
2-s2.0-85078240428 (Scopus ID)
Conference
36th International Conference on Machine Learning, ICML 2019, Long Beach, 9 June 2019 through 15 June 2019
Funder
Knut and Alice Wallenberg Foundation
Note

QC 20220923

Part of proceedings: ISBN 978-1-5108-8698-8

Available from: 2020-03-31. Created: 2020-03-31. Last updated: 2022-11-01. Bibliographically approved.
2. Voronoi Density Estimator for High-Dimensional Data: Computation, Compactification and Convergence
2022 (English). In: Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence, PMLR, 2022, Vol. 180, p. 1644-1653. Conference paper, Published paper (Refereed)
Abstract [en]

The Voronoi Density Estimator (VDE) is an established density estimation technique that adapts to the local geometry of data. However, its applicability has been so far limited to problems in two and three dimensions. This is because Voronoi cells rapidly increase in complexity as dimensions grow, making the necessary explicit computations infeasible. We define a variant of the VDE deemed Compactified Voronoi Density Estimator (CVDE), suitable for higher dimensions. We propose computationally efficient algorithms for numerical approximation of the CVDE and formally prove convergence of the estimated density to the original one. We implement and empirically validate the CVDE through a comparison with the Kernel Density Estimator (KDE). Our results indicate that the CVDE outperforms the KDE on sound and image data.
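
The computational idea can be made concrete with a hedged sketch of estimating a single Voronoi cell's volume by ray casting (illustrative only; clipping radii at a fixed `r_max` is a crude stand-in for the compactification developed in the paper). Because the cell is star-shaped around its generator, its volume equals the unit-ball volume times the expected d-th power of the boundary distance over uniform directions, and a density estimate at a point is then one over n times the volume of its cell.

```python
import numpy as np
from scipy.special import gammaln

def cell_volume_mc(X, i, n_rays=1024, r_max=10.0, rng=None):
    """Monte Carlo estimate of the volume of the Voronoi cell of X[i].
    The cell is star-shaped around its generator, so
        vol = V_d * E_u[ r(u)^d ],
    where V_d is the unit-ball volume, u is uniform on the unit sphere and
    r(u) is the distance from X[i] to the cell boundary along u.  Radii are
    clipped at r_max (a crude stand-in for the paper's compactification)."""
    rng = rng or np.random.default_rng()
    d = X.shape[1]
    diffs = X - X[i]
    sq = np.einsum("ij,ij->i", diffs, diffs)
    radii = np.empty(n_rays)
    for k in range(n_rays):
        u = rng.normal(size=d)
        u /= np.linalg.norm(u)
        denom = 2.0 * diffs @ u
        with np.errstate(divide="ignore", invalid="ignore"):
            t = np.where(denom > 0, sq / denom, np.inf)
        t[i] = np.inf
        radii[k] = min(t.min(), r_max)
    log_unit_ball = (d / 2) * np.log(np.pi) - gammaln(d / 2 + 1)
    return np.exp(log_unit_ball) * np.mean(radii ** d)

# A Voronoi density estimate at x is then 1 / (len(X) * volume of x's cell).
```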

Place, publisher, year, edition, pages
PMLR, 2022
Series
Proceedings of Machine Learning Research, ISSN 2640-3498
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:kth:diva-319195 (URN)
2-s2.0-85163412377 (Scopus ID)
Conference
The 38th Conference on Uncertainty in Artificial Intelligence, Eindhoven, The Netherlands, Aug 1-5 2022
Note

QC 20221003

Available from: 2022-09-28. Created: 2022-09-28. Last updated: 2024-07-23. Bibliographically approved.
3. Delaunay Component Analysis for Evaluation of Data Representations
2022 (English). In: Proceedings 10th International Conference on Learning Representations, ICLR 2022, International Conference on Learning Representations, ICLR, 2022. Conference paper, Published paper (Refereed)
Abstract [en]

Advanced representation learning techniques require reliable and general evaluation methods. Recently, several algorithms based on the common idea of geometric and topological analysis of a manifold approximated from the learned data representations have been proposed. In this work, we introduce Delaunay Component Analysis (DCA), an evaluation algorithm which approximates the data manifold using a more suitable neighbourhood graph called the Delaunay graph. This provides a reliable manifold estimation even for challenging geometric arrangements of representations, such as clusters with varying shape and density as well as outliers, which is where existing methods often fail. Furthermore, we exploit the nature of Delaunay graphs and introduce a framework for assessing the quality of individual novel data representations. We experimentally validate the proposed DCA method on representations obtained from neural networks trained with contrastive objectives, from supervised models, and from generative models, and demonstrate various use cases of our extended single-point evaluation framework.
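
The component-level bookkeeping that such an evaluation relies on can be pictured with a small illustrative fragment (not the DCA scores themselves, which are defined in the paper; `edges` is assumed to come from some Delaunay-graph approximation over the union of reference and evaluated representations).

```python
import networkx as nx

def component_breakdown(edges, labels):
    """Sketch of DCA-style bookkeeping: given the edges of an (approximate)
    Delaunay graph over reference ('R') and evaluated ('E') representations,
    report, per connected component, how many points of each kind it contains.
    `labels` maps node id -> 'R' or 'E'."""
    g = nx.Graph()
    g.add_nodes_from(labels)
    g.add_edges_from(edges)
    report = []
    for comp in nx.connected_components(g):
        n_ref = sum(labels[v] == "R" for v in comp)
        report.append({"size": len(comp), "reference": n_ref, "evaluated": len(comp) - n_ref})
    return report

# A component mixing both kinds of points indicates overlapping manifolds,
# while a component made purely of evaluated points suggests outliers.
breakdown = component_breakdown([(0, 1), (1, 2)], {0: "R", 1: "E", 2: "R", 3: "E"})
```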

Place, publisher, year, edition, pages
International Conference on Learning Representations, ICLR, 2022
Keywords
Representation Learning, Machine Learning
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-312715 (URN)
2-s2.0-85124640294 (Scopus ID)
Conference
10th International Conference on Learning Representations, ICLR 2022, Apr 25-29, 2022 (online)
Note

QC 20220614

Available from: 2022-05-20. Created: 2022-05-20. Last updated: 2023-09-07. Bibliographically approved.
4. Voronoi Graph Traversal in High Dimensions with Applications to Topological Data Analysis and Piecewise Linear Interpolation
2020 (English). In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Association for Computing Machinery (ACM), 2020, p. 2154-2164. Conference paper, Published paper (Refereed)
Abstract [en]

Voronoi diagrams and their dual, the Delaunay complex, are two fundamental geometric concepts that lie at the foundation of many machine learning algorithms and play a role in particular in classical piecewise linear interpolation and regression methods. More recently, they are also crucial for the construction of a common class of simplicial complexes such as Alpha and Delaunay-Čech complexes in topological data analysis. We propose a randomized approximation approach that mitigates the prohibitive cost of exact computation of Voronoi diagrams in high dimensions for machine learning applications. In experiments with data in up to 50 dimensions, we show that this allows us to significantly extend the use of Voronoi-based simplicial complexes in Topological Data Analysis (TDA) to higher dimensions. We confirm prior TDA results on image patches that previously had to rely on sub-sampled data with increased resolution and demonstrate the scalability of our approach by performing a TDA analysis on synthetic data as well as on filters of a ResNet neural network architecture. Secondly, we propose an application of our approach to piecewise linear interpolation of high dimensional data that avoids explicit complete computation of an associated Delaunay triangulation.
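
The flavour of such a randomized approximation can be conveyed with a short sketch (illustrative, and not necessarily the traversal procedure used in the paper): a ray cast from a generator exits its Voronoi cell through a facet shared with a Delaunay neighbour, so recording the first bisector hit of many random rays per point recovers Delaunay edges, missing only facets that subtend a small solid angle.

```python
import numpy as np

def approximate_delaunay_edges(X, rays_per_point=128, rng=None):
    """Randomized sketch: a ray cast from X[i] leaves its Voronoi cell through
    the bisector of some Delaunay neighbour X[j], so recording the first
    bisector hit of many random rays per generator yields a subset of the
    Delaunay edges (facets with small solid angle may be missed)."""
    rng = rng or np.random.default_rng()
    n, d = X.shape
    edges = set()
    for i in range(n):
        diffs = X - X[i]
        sq = np.einsum("ij,ij->i", diffs, diffs)
        for _ in range(rays_per_point):
            u = rng.normal(size=d)
            u /= np.linalg.norm(u)
            denom = 2.0 * diffs @ u
            with np.errstate(divide="ignore", invalid="ignore"):
                t = np.where(denom > 0, sq / denom, np.inf)
            t[i] = np.inf
            j = int(np.argmin(t))
            if np.isfinite(t[j]):
                edges.add((min(i, j), max(i, j)))
    return edges
```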

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2020
Keywords
Delaunay complex, linear interpolation, topological data analysis, Voronoi diagram, Clustering algorithms, Complex networks, Computational geometry, Filtration, Graphic methods, Interpolation, Learning algorithms, Machine learning, Network architecture, Piecewise linear techniques, Regression analysis, Topology, Delaunay triangulations, Exact computations, High dimensional data, Machine learning applications, Piecewise linear interpolations, Randomized approximation, Simplicial complex, Data mining
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-286484 (URN)
10.1145/3394486.3403266 (DOI)
000749552302015 (ISI)
2-s2.0-85090422948 (Scopus ID)
Conference
26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2020, 23 August 2020 through 27 August 2020
Note

QC 20201217

Part of proceedings: ISBN 978-1-4503-7998-4

Available from: 2020-12-17. Created: 2020-12-17. Last updated: 2022-11-01. Bibliographically approved.
5. Active Nearest Neighbor Regression Through Delaunay Refinement
2022 (English). In: Proceedings of the 39th International Conference on Machine Learning, MLResearch Press, 2022, Vol. 162, p. 11650-11664. Conference paper, Published paper (Refereed)
Abstract [en]

We introduce an algorithm for active function approximation based on nearest neighbor regression. Our Active Nearest Neighbor Regressor (ANNR) relies on the Voronoi-Delaunay framework from computational geometry to subdivide the space into cells with constant estimated function value and select novel query points in a way that takes the geometry of the function graph into account. We consider the recent state-of-the-art active function approximator called DEFER, which is based on incremental rectangular partitioning of the space, as the main baseline. The ANNR addresses a number of limitations that arise from the space subdivision strategy used in DEFER. We provide a computationally efficient implementation of our method, as well as theoretical halting guarantees. Empirical results show that ANNR outperforms the baseline for both closed-form functions and real-world examples, such as gravitational wave parameter inference and exploration of the latent space of a generative model.
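
The geometry-aware selection step can be pictured with the following hypothetical sketch (an illustration of the "geometry of the function graph" idea only; the actual ANNR query rule and its halting analysis are given in the paper): candidates are scored by their distance to the already-queried set measured in the lifted space (x, scale·f(x)), so both large unexplored regions and steep parts of the function attract new queries.

```python
import numpy as np

def next_query(X, f_vals, candidates, scale=1.0):
    """Hypothetical sketch of geometry-aware active selection: score each
    candidate by its squared distance to the queried set in the lifted graph
    space (x, scale * f(x)), using the nearest-neighbour estimate as the
    candidate's value, and return the candidate with the largest distance."""
    d2 = ((candidates[:, None, :] - X[None, :, :]) ** 2).sum(-1)      # squared input-space distances
    est = f_vals[d2.argmin(axis=1)]                                   # nearest-neighbour value estimate
    lifted = d2 + scale ** 2 * (est[:, None] - f_vals[None, :]) ** 2  # squared distances in graph space
    return candidates[lifted.min(axis=1).argmax()]

# Example: refine samples of f(x) = sin(5x) on [0, 1].
rng = np.random.default_rng(0)
X = rng.uniform(size=(8, 1))
f_vals = np.sin(5 * X[:, 0])
candidates = rng.uniform(size=(500, 1))
x_next = next_query(X, f_vals, candidates)
```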

Place, publisher, year, edition, pages
MLResearch Press, 2022
Series
Proceedings of Machine Learning Research, ISSN 2640-3498 ; 162
National Category
Computer Sciences; Control Engineering
Identifiers
urn:nbn:se:kth:diva-319194 (URN)
000900064901033 (ISI)
2-s2.0-85163127180 (Scopus ID)
Conference
39th International Conference on Machine Learning, Baltimore, Maryland, USA, PMLR 162, 17-23 July 2022
Note

QC 20230509

Available from: 2022-09-28. Created: 2022-09-28. Last updated: 2024-03-02. Bibliographically approved.

Open Access in DiVA

fulltext (4249 kB), 1770 downloads
File information
File name: FULLTEXT01.pdf
File size: 4249 kB
Checksum (SHA-512): 34a83645e361ba90f4ec6eb79ceed288cd3e658a14ede91abccf4b2ae61f952c4fdc82baf3422dea51fc6292ad7641f597b273641b2e6e1cadaa3847f8ea5e6c
Type: fulltext
Mimetype: application/pdf

Authority records

Polianskii, Vladislav
