Geometry of Uncertainty: Learning Metric Spaces for Multimodal State Estimation in RL
KTH, School of Electrical Engineering and Computer Science (EECS). ORCID iD: 0000-0001-8938-9363
2026 (English). In: Geometry of Uncertainty: Learning Metric Spaces for Multimodal State Estimation in RL, 2026. Conference paper, Poster (with or without abstract) (Refereed)
Abstract [en]

Estimating the state of an environment from high-dimensional, multimodal, and noisy observations is a fundamental challenge in reinforcement learning (RL). Traditional approaches rely on probabilistic models to account for uncertainty, but they often require explicit noise assumptions, which limits generalization. In this work, we contribute a novel method to learn a structured latent representation in which distances between states directly correlate with the minimum number of actions required to transition between them. The proposed metric space formulation provides a geometric interpretation of uncertainty without the need for explicit probabilistic modeling. To achieve this, we introduce a multimodal latent transition model and a sensor fusion mechanism based on inverse distance weighting, allowing for the adaptive integration of multiple sensor modalities without prior knowledge of noise distributions. We empirically validate the approach on a range of multimodal RL tasks, demonstrating improved robustness to sensor noise and superior state estimation compared to baseline methods. Our experiments show that the learned representation enhances the performance of an RL agent, eliminating the need for explicit noise augmentation. The presented results suggest that leveraging transition-aware metric spaces provides a principled and scalable solution for robust state estimation in sequential decision-making.
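
The abstract names inverse distance weighting as the fusion mechanism but does not give its exact form. The following is a minimal sketch of generic inverse distance weighting, not the paper's implementation. It assumes that each sensor modality yields a latent embedding and that distances are measured to a reference latent predicted by the transition model. The function names, the `power` exponent, and the `eps` constant are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def idw_fuse(
    embeddings: Sequence[Sequence[float]],
    reference: Sequence[float],
    power: float = 2.0,
    eps: float = 1e-8,
) -> Tuple[List[float], List[float]]:
    """Fuse per-modality latent embeddings by inverse distance weighting.

    Assumption (not from the source): each modality is weighted by the
    inverse of its distance to `reference`, a latent state predicted by a
    transition model. Modalities far from the prediction get low weight.
    """
    # Euclidean distance of every modality embedding to the reference latent.
    dists = [math.dist(z, reference) for z in embeddings]
    # Raw inverse-distance weights; eps avoids division by zero.
    raw = [1.0 / (d ** power + eps) for d in dists]
    total = sum(raw)
    weights = [w / total for w in raw]
    # Weighted average of the embeddings, coordinate by coordinate.
    dim = len(reference)
    fused = [sum(w * z[k] for w, z in zip(weights, embeddings)) for k in range(dim)]
    return fused, weights


# Toy example: one modality agrees with the prediction, the other is corrupted.
predicted = [1.0, 0.0]
camera = [1.1, 0.0]   # close to the predicted latent
lidar = [3.0, 0.0]    # far from the predicted latent (e.g. heavy noise)
fused, weights = idw_fuse([camera, lidar], predicted)
```

In this toy setting the corrupted modality receives a much smaller weight, so the fused estimate stays close to the consistent one without any explicit noise model.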

Place, publisher, year, edition, pages
2026.
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
URN: urn:nbn:se:kth:diva-376770
OAI: oai:DiVA.org:kth-376770
DiVA, id: diva2:2038891
Conference
ICLR 2026 The Fourteenth International Conference on Learning Representations, Riocentro Convention and Event Center, Rio de Janeiro, Brazil, Apr 23-27, 2026
Note

QC 20260218

Available from: 2026-02-16 Created: 2026-02-16 Last updated: 2026-02-18
In thesis
1. Interactive Representation Learning: Symmetries, Metric Spaces and Uncertainty
2026 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

This thesis investigates how interaction can be used as self-supervision to learn structured state representations that simplify downstream tasks. We formalize two inductive biases naturally present in the trajectories generated by agents that interact with their environment: geometry and temporal consistency of the underlying state space. We show that injecting these biases into representation learning yields additional, task-relevant properties. First, we focus on geometric bias: we learn translationally equivariant latent spaces from images in which agent actions correspond to vector additions. We show how these representations can be used to estimate a recovery policy that mitigates the compounding of error in data-driven sequential decision-making policies. We further extend equivariant representations to scenes with external objects. Under an interaction-by-contact model, we prove that aligning the object’s and the agent’s latent embeddings yields an isometric, disentangled representation of both. Second, we relax the geometry assumption and explore the milder temporal consistency bias. This allows us to construct representations where the temporal order between states is preserved, a property we refer to as distance monotonicity. In the reinforcement learning setting, we show that, under suitable conditions, this property is enough to recover an approximation of the value function and provably estimate an optimal policy. In a multiple-sensor framework, these representations can be used to construct a Bayesian filtering state estimate that is robust under unknown noise. Third, we extend the concept of interactions from physical systems to the parametric space of a learner. We show how distance monotonic representations of the parameters of a model can be used to approximate the posterior distribution of a Bayesian neural network. Finally, in a meta-learning setting, we explore implicit representations of the learner to reduce the variance of a fast-adaptation model.
Collectively, these results demonstrate that interaction-driven biases produce structured representations that simplify or enhance the learning process.
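
The translational equivariance property described in the abstract, where agent actions correspond to vector additions in latent space, can be illustrated with a small check. This is a sketch under stated assumptions, not the thesis's method: the grid-world states, the hand-written encoders, and the function names are hypothetical stand-ins for a learned image encoder.

```python
import math
from typing import Callable, Tuple

Vec = Tuple[float, float]


def step(state: Vec, action: Vec) -> Vec:
    """Toy environment: the action displaces the agent's position."""
    return (state[0] + action[0], state[1] + action[1])


def equivariant_encoder(state: Vec) -> Vec:
    """Hypothetical encoder that is equivariant: a pure translation of the state."""
    return (state[0] + 0.5, state[1] - 2.0)


def scaled_encoder(state: Vec) -> Vec:
    """Hypothetical encoder that breaks the property by rescaling the state."""
    return (2.0 * state[0], 2.0 * state[1])


def equivariance_error(encode: Callable[[Vec], Vec], state: Vec, action: Vec) -> float:
    """Distance between encode(state) + action and encode(next state).

    The error is zero exactly when the action acts as a vector addition
    in latent space for this state-action pair.
    """
    z = encode(state)
    predicted = (z[0] + action[0], z[1] + action[1])
    actual = encode(step(state, action))
    return math.dist(predicted, actual)


state, action = (2.0, 3.0), (1.0, -1.0)
err_good = equivariance_error(equivariant_encoder, state, action)
err_bad = equivariance_error(scaled_encoder, state, action)
```

A residual of this kind could serve as a training loss or as a diagnostic on held-out transitions; which of the two the thesis uses is not stated in the abstract.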


Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2026. p. xv, 57
Series
TRITA-EECS-AVL ; 2026:18
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-376773
ISBN: 978-91-8106-539-8
Public defence
2026-03-16, https://kth-se.zoom.us/w/63788305553, F3, Lindstedtsvägen 26, Stockholm, 09:00 (English)
Opponent
Supervisors
Note

QC 20260216

Available from: 2026-02-16 Created: 2026-02-16 Last updated: 2026-02-23
Bibliographically approved

Open Access in DiVA

fulltext (5087 kB), 27 downloads
File information
File name: FULLTEXT01.pdf
File size: 5087 kB
Checksum (SHA-512): 07382dfd8083b7b3541dc0dec34e124e02db9d7c98f4922e3c74737254adead4b6a7aec9095fadd2c89aa1faa5788ef4189b61ff76bd46a63e5cc62d190aefce
Type: fulltext
Mimetype: application/pdf

Other links

arXiv

Authority records

Reichlin, Alfredo
