Publications (10 of 133)
Li, C., Yang, Y., Weng, Z., Hernlund, E., Zuffi, S. & Kjellström, H. (2025). Dessie: Disentanglement for Articulated 3D Horse Shape and Pose Estimation from Images. In: Computer Vision – ACCV 2024 - 17th Asian Conference on Computer Vision, Proceedings: . Paper presented at 17th Asian Conference on Computer Vision, ACCV 2024, Hanoi, Viet Nam, Dec 8 2024 - Dec 12 2024 (pp. 268-288). Springer Science and Business Media Deutschland GmbH
2025 (English) In: Computer Vision – ACCV 2024 - 17th Asian Conference on Computer Vision, Proceedings, Springer Science and Business Media Deutschland GmbH, 2025, pp. 268-288. Conference paper, Published paper (Refereed)
Abstract [en]

In recent years, 3D parametric animal models have been developed to aid in estimating 3D shape and pose from images and video. While progress has been made for humans, it’s more challenging for animals due to limited annotated data. To address this, we introduce the first method using synthetic data generation and disentanglement to learn to regress 3D shape and pose. Focusing on horses, we use text-based texture generation and a synthetic data pipeline to create varied shapes, poses, and appearances, learning disentangled spaces. Our method, Dessie, surpasses existing 3D horse reconstruction methods and generalizes to other large animals like zebras, cows, and deer. See the project website at: https://celiali.github.io/Dessie/.

Place, publisher, year, edition, pages
Springer Science and Business Media Deutschland GmbH, 2025
Keywords
Animal 3D reconstruction, disentanglement
National subject category
Computer Graphics and Computer Vision; Bioinformatics (Computational Biology)
Identifiers
urn:nbn:se:kth:diva-358262 (URN); 10.1007/978-981-96-0972-7_16 (DOI); 001542340100016 (); 2-s2.0-85213389101 (Scopus ID)
Conference
17th Asian Conference on Computer Vision, ACCV 2024, Hanoi, Viet Nam, Dec 8 2024 - Dec 12 2024
Note

Part of ISBN 9789819609710

QC 20250113

Available from: 2025-01-08 Created: 2025-01-08 Last updated: 2025-12-08. Bibliographically reviewed
Liu, C., Morse, B., Sminchisescu, C., Isola, P., Kjellström, H., Lepetit, V., . . . Tang, S. (2025). Message from the General and Program Chairs CVPR 2025. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition: . Paper presented at 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, June 10-17, 2025 (pp. ccclxxxii-ccclxxxiii). Institute of Electrical and Electronics Engineers (IEEE)
2025 (English) In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Institute of Electrical and Electronics Engineers (IEEE), 2025, pp. ccclxxxii-ccclxxxiii. Conference paper, Published paper (Other academic)
Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2025
National subject category
Computer Systems
Identifiers
urn:nbn:se:kth:diva-371614 (URN); 10.1109/CVPR52734.2025.00005 (DOI); 2-s2.0-105017030358 (Scopus ID)
Conference
2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, June 10-17, 2025
Note

Part of ISBN 9798331543648, 9798331543655

QC 20251014

Available from: 2025-10-14 Created: 2025-10-14 Last updated: 2025-10-14. Bibliographically reviewed
Mårtensson, G., Åden, U., Örtqvist, M., Kjellström, H. & Tibbe, M. (2025). Video Analysis of Infant Spontaneous Movements to Predict Neurodevelopmental Deficiency. Acta Paediatrica, 114, 333-333
2025 (English) In: Acta Paediatrica, ISSN 0803-5253, E-ISSN 1651-2227, Vol. 114, pp. 333-333. Journal article, Meeting abstract (Other academic), Published
Place, publisher, year, edition, pages
Wiley, 2025
National subject category
Pediatrics
Identifiers
urn:nbn:se:kth:diva-376249 (URN); 001533675303047 ()
Note

QC 20260204

Available from: 2026-02-04 Created: 2026-02-04 Last updated: 2026-02-04. Bibliographically reviewed
Zuffi, S., Mellbin, Y., Li, C., Hoeschle, M., Kjellström, H., Polikovsky, S., . . . Black, M. J. (2024). VAREN: Very Accurate and Realistic Equine Network. In: 2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2024: . Paper presented at IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), JUN 16-22, 2024, Seattle, WA (pp. 5374-5383). Institute of Electrical and Electronics Engineers (IEEE)
2024 (English) In: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024, Institute of Electrical and Electronics Engineers (IEEE), 2024, pp. 5374-5383. Conference paper, Published paper (Refereed)
Abstract [en]

Data-driven three-dimensional parametric shape models of the human body have gained enormous popularity both for the analysis of visual data and for the generation of synthetic humans. Following a similar approach for animals does not scale to the multitude of existing animal species, not to mention the difficulty of accessing subjects to scan in 3D. However, we argue that for domestic species of great importance, like the horse, it is a highly valuable investment to put effort into gathering a large dataset of real 3D scans, and learn a realistic 3D articulated shape model. We introduce VAREN, a novel 3D articulated parametric shape model learned from 3D scans of many real horses. VAREN bridges synthesis and analysis tasks, as the generated model instances have unprecedented realism, while being able to represent horses of different sizes and shapes. Differently from previous body models, VAREN has two resolutions, an anatomical skeleton, and interpretable, learned pose-dependent deformations, which are related to the body muscles. We show with experiments that this formulation has superior performance with respect to previous strategies for modeling pose-dependent deformations in the human body case, while also being more compact and allowing an analysis of the relationship between articulation and muscle deformation during articulated motion. The VAREN model and data are available at https://varen.is.tue.mpg.de.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Series
IEEE Conference on Computer Vision and Pattern Recognition, ISSN 1063-6919
National subject category
Computer and Information Sciences
Identifiers
urn:nbn:se:kth:diva-358703 (URN); 10.1109/CVPR52733.2024.00514 (DOI); 001322555905073 (); 2-s2.0-85198274738 (Scopus ID)
Conference
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), JUN 16-22, 2024, Seattle, WA
Note

Part of ISBN 979-8-3503-5301-3; 979-8-3503-5300-6

QC 20250120

Available from: 2025-01-20 Created: 2025-01-20 Last updated: 2025-01-20. Bibliographically reviewed
Yin, W., Tu, R., Yin, H., Kragic, D., Kjellström, H. & Björkman, M. (2023). Controllable Motion Synthesis and Reconstruction with Autoregressive Diffusion Models. In: 2023 32ND IEEE INTERNATIONAL CONFERENCE ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION, RO-MAN: . Paper presented at 32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), AUG 28-31, 2023, Busan, SOUTH KOREA (pp. 1102-1108). Institute of Electrical and Electronics Engineers (IEEE)
2023 (English) In: 2023 32nd IEEE International Conference on Robot and Human Interactive Communication, RO-MAN, Institute of Electrical and Electronics Engineers (IEEE), 2023, pp. 1102-1108. Conference paper, Published paper (Refereed)
Abstract [en]

Data-driven and controllable human motion synthesis and prediction are active research areas with various applications in interactive media and social robotics. Challenges remain in these fields for generating diverse motions given past observations and dealing with imperfect poses. This paper introduces MoDiff, an autoregressive probabilistic diffusion model over motion sequences conditioned on control contexts of other modalities. Our model integrates a cross-modal Transformer encoder and a Transformer-based decoder, which are found effective in capturing temporal correlations in motion and control modalities. We also introduce a new data dropout method based on the diffusion forward process to provide richer data representations and robust generation. We demonstrate the superior performance of MoDiff in controllable motion synthesis for locomotion with respect to two baselines and show the benefits of diffusion data dropout for robust synthesis and reconstruction of high-fidelity motion close to recorded data.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2023
Series
IEEE RO-MAN, ISSN 1944-9445
National subject category
Computer Graphics and Computer Vision
Identifiers
urn:nbn:se:kth:diva-341978 (URN); 10.1109/RO-MAN57019.2023.10309317 (DOI); 001108678600131 (); 2-s2.0-85186990309 (Scopus ID)
Conference
32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), AUG 28-31, 2023, Busan, SOUTH KOREA
Note

Part of proceedings ISBN 979-8-3503-3670-2

QC 20240110

Available from: 2024-01-10 Created: 2024-01-10 Last updated: 2025-02-07. Bibliographically reviewed
Broomé, S., Feighelstein, M., Zamansky, A., Carreira Lencioni, G., Haubro Andersen, P., Pessanha, F., . . . Salah, A. A. (2023). Going Deeper than Tracking: A Survey of Computer-Vision Based Recognition of Animal Pain and Emotions. International Journal of Computer Vision, 131(2), 572-590
2023 (English) In: International Journal of Computer Vision, ISSN 0920-5691, E-ISSN 1573-1405, Vol. 131, no. 2, pp. 572-590. Journal article (Refereed), Published
Abstract [en]

Advances in animal motion tracking and pose recognition have been a game changer in the study of animal behavior. Recently, an increasing number of works go 'deeper' than tracking and address automated recognition of animals' internal states, such as emotions and pain, with the aim of improving animal welfare, making this a timely moment for a systematization of the field. This paper provides a comprehensive survey of computer vision-based research on recognition of pain and emotional states in animals, addressing both facial and bodily behavior analysis. We summarize the efforts presented so far within this topic, classifying them across different dimensions; highlight challenges and research gaps; and provide best-practice recommendations and future research directions for advancing the field.

Place, publisher, year, edition, pages
Springer Nature, 2023
Keywords
Affective computing, Computer vision for animals, Emotion recognition, Non-human behavior analysis, Pain estimation, Pain recognition, Animals, Behavioral research, Health, Surveys, Animal motion, Human behavior analysis, Motion tracking, Vision based, Computer vision
National subject category
Computer Graphics and Computer Vision
Identifiers
urn:nbn:se:kth:diva-328890 (URN); 10.1007/s11263-022-01716-3 (DOI); 000888708500001 (); 2-s2.0-85142475831 (Scopus ID)
Note

QC 20230613

Available from: 2023-06-13 Created: 2023-06-13 Last updated: 2025-02-07. Bibliographically reviewed
Lawin, F. J., Bystrom, A., Roepstorff, C., Rhodin, M., Almloef, M., Silva, M., . . . Hernlund, E. (2023). Is Markerless More or Less?: Comparing a Smartphone Computer Vision Method for Equine Lameness Assessment to Multi-Camera Motion Capture. Animals, 13(3), Article ID 390.
2023 (English) In: Animals, E-ISSN 2076-2615, Vol. 13, no. 3, article id 390. Journal article (Refereed), Published
Abstract [en]

Lameness, an alteration of the gait due to pain or dysfunction of the locomotor system, is the most common disease symptom in horses. Yet, it is difficult for veterinarians to correctly assess by visual inspection. Objective tools that can aid clinical decision making and provide early disease detection through sensitive lameness measurements are needed. In this study, we describe how an AI-powered measurement tool on a smartphone can detect lameness in horses without the need to mount equipment on the horse. We compare it to a state-of-the-art multi-camera motion capture system by simultaneous, synchronised recordings from both systems. The mean difference between the systems' output of lameness metrics was below 2.2 mm. Therefore, we conclude that the smartphone measurement tool can detect lameness at relevant levels with ease of use for the veterinarian. Computer vision is a subcategory of artificial intelligence focused on extraction of information from images and video. It provides a compelling new means for objective orthopaedic gait assessment in horses using accessible hardware, such as a smartphone, for markerless motion analysis. This study aimed to explore the lameness assessment capacity of a smartphone single camera (SC) markerless computer vision application by comparing measurements of the vertical motion of the head and pelvis to an optical motion capture multi-camera (MC) system using skin attached reflective markers. Twenty-five horses were recorded with a smartphone (60 Hz) and a 13-camera MC system (200 Hz) while trotting two times back and forth on a 30 m runway. The smartphone video was processed using artificial neural networks detecting the horse's direction, action and motion of body segments. After filtering, the vertical displacement curves from the head and pelvis were synchronised between systems using cross-correlation. This rendered 655 and 404 matching stride segmented curves for the head and pelvis respectively.
From the stride segmented vertical displacement signals, differences between the two minima (MinDiff) and the two maxima (MaxDiff) respectively per stride were compared between the systems. Trial mean difference between systems was 2.2 mm (range 0.0-8.7 mm) for head and 2.2 mm (range 0.0-6.5 mm) for pelvis. Within-trial standard deviations ranged between 3.1-28.1 mm for MC and between 3.6-26.2 mm for SC. The ease of use and good agreement with MC indicate that the SC application is a promising tool for detecting clinically relevant levels of asymmetry in horses, enabling frequent and convenient gait monitoring over time.

Place, publisher, year, edition, pages
MDPI AG, 2023
Keywords
monocular motion analysis, objective lameness assessment, equine orthopaedics, animal pose estimation, optical motion capture
National subject category
Clinical Science
Identifiers
urn:nbn:se:kth:diva-324900 (URN); 10.3390/ani13030390 (DOI); 000931327100001 (); 36766279 (PubMedID); 2-s2.0-85147826149 (Scopus ID)
Note

QC 20230321

Available from: 2023-03-21 Created: 2023-03-21 Last updated: 2024-01-17. Bibliographically reviewed
Klasson, M., Kjellström, H. & Zhang, C. (2023). Learn the Time to Learn: Replay Scheduling in Continual Learning. Transactions on Machine Learning Research, 2023-November
2023 (English) In: Transactions on Machine Learning Research, E-ISSN 2835-8856, Vol. 2023-November. Journal article (Refereed), Published
Abstract [en]

Replay methods are known to be successful at mitigating catastrophic forgetting in continual learning scenarios despite having limited access to historical data. However, storing historical data is cheap in many real-world settings, yet replaying all historical data is often prohibited due to processing time constraints. In such settings, we propose that continual learning systems should learn the time to learn and schedule which tasks to replay at different time steps. We first demonstrate the benefits of our proposal by using Monte Carlo tree search to find a proper replay schedule, and show that the found replay schedules can outperform fixed scheduling policies when combined with various replay methods in different continual learning settings. Additionally, we propose a framework for learning replay scheduling policies with reinforcement learning. We show that the learned policies can generalize better in new continual learning scenarios compared to equally replaying all seen tasks, without added computational cost. Our study reveals the importance of learning the time to learn in continual learning, which brings current research closer to real-world needs.

Place, publisher, year, edition, pages
Transactions on Machine Learning Research, 2023
National subject category
Computer Graphics and Computer Vision; Computer Sciences
Identifiers
urn:nbn:se:kth:diva-361775 (URN); 2-s2.0-86000627640 (Scopus ID)
Note

QC 20250328

Available from: 2025-03-27 Created: 2025-03-27 Last updated: 2025-03-28. Bibliographically reviewed
Christoffersen, B., Mahjani, B., Clements, M., Kjellström, H. & Humphreys, K. (2023). Quasi-Monte Carlo Methods for Binary Event Models with Complex Family Data. Journal of Computational and Graphical Statistics, 32(4), 1393-1401
2023 (English) In: Journal of Computational and Graphical Statistics, ISSN 1061-8600, E-ISSN 1537-2715, Vol. 32, no. 4, pp. 1393-1401. Journal article (Refereed), Published
Abstract [en]

The generalized linear mixed model for binary outcomes with the probit link function is used in many fields but has a computationally challenging likelihood when there are many random effects. We extend a previously used importance sampler, making it much faster in the context of estimating heritability and related effects from family data, by adding a gradient and a Hessian approximation and providing a faster implementation. Additionally, a graph-based method is suggested to simplify the likelihood when there are thousands of individuals in each family. Simulation studies show that the resulting method is orders of magnitude faster, has a negligible efficiency loss, and produces confidence intervals with nominal coverage. We also analyze data from a large study of obsessive-compulsive disorder based on Swedish multi-generational data. In this analysis, the proposed method yielded similar results to a previous analysis, but was much faster. Supplementary materials for this article are available online.

Place, publisher, year, edition, pages
Informa UK Limited, 2023
Keywords
Family-based studies, Generalized linear mixed model, Importance sampling
National subject category
Probability Theory and Statistics
Identifiers
urn:nbn:se:kth:diva-350088 (URN); 10.1080/10618600.2022.2151454 (DOI); 000911289700001 (); 2-s2.0-85146716373 (Scopus ID)
Note

QC 20240807

Available from: 2024-08-07 Created: 2024-08-07 Last updated: 2024-08-07. Bibliographically reviewed
Broomé, S., Pokropek, E., Li, B. & Kjellström, H. (2023). Recur, Attend or Convolve?: On Whether Temporal Modeling Matters for Cross-Domain Robustness in Action Recognition. In: 2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV): . Paper presented at 23rd IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), JAN 03-07, 2023, Waikoloa, HI (pp. 4188-4198). Institute of Electrical and Electronics Engineers (IEEE)
2023 (English) In: 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Institute of Electrical and Electronics Engineers (IEEE), 2023, pp. 4188-4198. Conference paper, Published paper (Refereed)
Abstract [en]

Most action recognition models today are highly parameterized, and evaluated on datasets with appearance-wise distinct classes. It has also been shown that 2D Convolutional Neural Networks (CNNs) tend to be biased toward texture rather than shape in still image recognition tasks [19], in contrast to humans. Taken together, this raises suspicion that large video models partly learn spurious spatial texture correlations rather than to track relevant shapes over time to infer generalizable semantics from their movement. A natural way to avoid parameter explosion when learning visual patterns over time is to make use of recurrence. Biological vision consists of abundant recurrent circuitry, and is superior to computer vision in terms of domain shift generalization. In this article, we empirically study whether the choice of low-level temporal modeling has consequences for texture bias and cross-domain robustness. In order to enable a light-weight and systematic assessment of the ability to capture temporal structure, not revealed from single frames, we provide the Temporal Shape (TS) dataset, as well as modified domains of Diving48 allowing for the investigation of spatial texture bias in video models. The combined results of our experiments indicate that sound physical inductive bias such as recurrence in temporal modeling may be advantageous when robustness to domain shift is important for the task.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2023
Series
IEEE Winter Conference on Applications of Computer Vision, ISSN 2472-6737
National subject category
Computer Graphics and Computer Vision
Identifiers
urn:nbn:se:kth:diva-333276 (URN); 10.1109/WACV56688.2023.00418 (DOI); 000971500204030 (); 2-s2.0-85149047006 (Scopus ID)
Conference
23rd IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), JAN 03-07, 2023, Waikoloa, HI
Note

QC 20230731

Available from: 2023-07-31 Created: 2023-07-31 Last updated: 2025-02-07. Bibliographically reviewed
Organisations
Identifiers
ORCID iD: orcid.org/0000-0002-5750-9655
