Cloth-Splatting: 3D Cloth State Estimation from RGB Supervision
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. ORCID iD: 0000-0001-9125-6615
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. ORCID iD: 0000-0001-9296-9166
Carnegie Mellon University, Pittsburgh, PA, US. ORCID iD: 0000-0002-2797-4898
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. ORCID iD: 0000-0003-2296-6685
2024 (English). In: Proceedings of the 8th Conference on Robot Learning, CoRL 2024, ML Research Press, 2024, p. 2845-2865. Conference paper, Published paper (Refereed).
Abstract [en]

We introduce Cloth-Splatting, a method for estimating 3D states of cloth from RGB images through a prediction-update framework. Cloth-Splatting leverages an action-conditioned dynamics model for predicting future states and uses 3D Gaussian Splatting to update the predicted states. Our key insight is that coupling a 3D mesh-based representation with Gaussian Splatting allows us to define a differentiable map between the cloth's state space and the image space. This enables the use of gradient-based optimization techniques to refine inaccurate state estimates using only RGB supervision. Our experiments demonstrate that Cloth-Splatting not only improves state estimation accuracy over current baselines but also reduces convergence time by ∼85 %.
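The prediction-update loop described above can be illustrated compactly. Below is a minimal sketch, assuming a PyTorch setup, of refining a predicted cloth state by gradient descent on an RGB rendering loss. Everything here is a hypothetical stand-in, not the paper's implementation: `dynamics_model`, `render_gaussians`, and all shapes are placeholders, and the real method rasterizes 3D Gaussians attached to a cloth mesh rather than using the trivial differentiable map shown.

```python
import torch

def dynamics_model(state, action):
    # Hypothetical action-conditioned dynamics: predicts the next mesh
    # vertex positions from the current state and the applied action.
    return state + action  # stand-in for a learned model

def render_gaussians(vertices):
    # Hypothetical differentiable map from cloth state to image space;
    # the paper rasterizes 3D Gaussians coupled to the mesh, replaced
    # here by a trivial differentiable function for illustration.
    return vertices.sum(dim=-1, keepdim=True)

def predict_update(state, action, observed_rgb, n_steps=50, lr=1e-2):
    # Prediction step: roll the dynamics model forward one step.
    predicted = dynamics_model(state, action)
    # Update step: because rendering is differentiable in the state,
    # the prediction can be refined by gradient descent on an RGB loss.
    estimate = predicted.clone().requires_grad_(True)
    optimizer = torch.optim.Adam([estimate], lr=lr)
    for _ in range(n_steps):
        optimizer.zero_grad()
        loss = torch.nn.functional.mse_loss(
            render_gaussians(estimate), observed_rgb)
        loss.backward()
        optimizer.step()
    return estimate.detach()

# Toy usage: 100 mesh vertices, a small random action, and a stand-in
# observation matching the shape of the rendered output.
state = torch.zeros(100, 3)
action = 0.01 * torch.randn(100, 3)
observed = torch.randn(100, 1)
refined = predict_update(state, action, observed)
```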

Place, publisher, year, edition, pages
ML Research Press, 2024. p. 2845-2865.
Keywords [en]
3D State Estimation, Gaussian Splatting, Vision-based Tracking, Deformable Objects
National Category
Computer graphics and computer vision
Identifiers
URN: urn:nbn:se:kth:diva-357192
Scopus ID: 2-s2.0-86000735293
OAI: oai:DiVA.org:kth-357192
DiVA, id: diva2:1918273
Conference
8th Annual Conference on Robot Learning, November 6-9, 2024, Munich, Germany
Note

QC 20250328

Available from: 2024-12-04. Created: 2024-12-04. Last updated: 2025-03-28. Bibliographically approved.
In thesis
1. Adapting to Variations in Textile Properties for Robotic Manipulation
2025 (English). Doctoral thesis, comprehensive summary (Other academic).
Abstract [en]

In spite of the rapid advancements in AI, tasks like laundry, tidying, and general household assistance remain challenging for robots due to their limited capacity to generalize manipulation skills across different variations of everyday objects. Manipulation of textiles, in particular, poses unique challenges due to their deformable nature and complex dynamics. In this thesis, we aim to enhance the generalization of robotic manipulation skills for textiles by addressing how robots can adapt their strategies based on the physical properties of deformable objects.

We begin by identifying key factors of variation in textiles relevant to manipulation, drawing insights from overlooked taxonomies in the textile industry. The core challenge of adaptation is addressed by leveraging the synergies between interactive perception and cloth dynamics models. These are utilized to tackle two fundamental estimation problems: property identification, since these properties define the system's dynamics and how the object responds to external forces, and state estimation, which provides the feedback necessary for closing the action-perception loop.

To identify object properties, we investigate how combining exploratory actions, such as pulling and twisting, with sensory feedback can enhance a robot's understanding of textile characteristics. Central to this investigation is an adaptation module designed to encode textile properties from recent observations, enabling data-driven dynamics models to adjust their predictions according to the perceived properties. To address state estimation challenges arising from cloth self-occlusions, we explore semantic descriptors and 3D tracking methods that integrate geometric observations, such as point clouds, with visual cues from RGB data.

Finally, we integrate these modeling and perceptual components into a model-based manipulation framework and evaluate the generalization of the proposed method across a diverse set of real-world textiles. The results demonstrate enhanced generalization, underscoring the potential of adapting manipulation in response to variations in textile properties and highlighting the critical role of the action-perception loop in achieving adaptability.
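The adaptation mechanism described above lends itself to a compact illustration. Below is a minimal sketch, assuming a PyTorch setup, of an adaptation module that encodes a short history of observations into a property embedding conditioning a data-driven dynamics model. All class names, dimensions, and architectures are hypothetical stand-ins, not the thesis implementation.

```python
import torch
import torch.nn as nn

class AdaptationModule(nn.Module):
    # Encodes a short history of (state, action) observations into a
    # latent vector summarizing perceived textile properties
    # (e.g. stiffness or elasticity).
    def __init__(self, obs_dim, latent_dim=8):
        super().__init__()
        self.encoder = nn.GRU(obs_dim, latent_dim, batch_first=True)

    def forward(self, history):             # history: (batch, time, obs_dim)
        _, h = self.encoder(history)
        return h[-1]                        # (batch, latent_dim)

class ConditionedDynamics(nn.Module):
    # Predicts the next state from the current state, the action, and
    # the property embedding produced by the adaptation module.
    def __init__(self, state_dim, action_dim, latent_dim=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim + latent_dim, 64),
            nn.ReLU(),
            nn.Linear(64, state_dim),
        )

    def forward(self, state, action, z):
        return self.net(torch.cat([state, action, z], dim=-1))

# Toy usage: infer properties from a 10-step interaction history
# (e.g. a pulling or twisting motion), then predict the next state.
adapt = AdaptationModule(obs_dim=6)
dyn = ConditionedDynamics(state_dim=3, action_dim=3)
history = torch.randn(1, 10, 6)
z = adapt(history)
next_state = dyn(torch.randn(1, 3), torch.randn(1, 3), z)
```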


Place, publisher, year, edition, pages
KTH Royal Institute of Technology, 2025. p. 82
Series
TRITA-EECS-AVL ; 2025:1
Keywords
Textile Variations, Robotic Manipulation, Generalization, Adaptation
National Category
Computer graphics and computer vision; Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-357508 (URN)
978-91-8106-125-3 (ISBN)
Public defence
2025-01-14, https://kth-se.zoom.us/j/66979575369, F3 (Flodis), Lindstedtsvägen 26 & 28, KTH Campus, Stockholm, 13:00 (English)
Note

QC 20241213

Available from: 2024-12-13. Created: 2024-12-12. Last updated: 2025-04-01. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Scopus fulltext

Authority records

Longhini, Alberta; Büsching, Marcel; Lundell, Jens; Björkman, Mårten; Kragic, Danica

Search in DiVA

By author/editor
Longhini, Alberta; Büsching, Marcel; Duisterhof, Bardienus Pieter; Lundell, Jens; Ichnowski, Jeffrey; Björkman, Mårten; Kragic, Danica
By organisation
Robotics, Perception and Learning, RPL
Computer graphics and computer vision
