Publications (10 of 20)
Akhavan Rahnama, A. H., Bütepage, J., Geurts, P. & Boström, H. (2024). Can local explanation techniques explain linear additive models?. Data mining and knowledge discovery, 38(1), 237-280
2024 (English). In: Data mining and knowledge discovery, ISSN 1384-5810, E-ISSN 1573-756X, Vol. 38, no 1, p. 237-280. Article in journal (Refereed), Published
Abstract [en]

Local model-agnostic additive explanation techniques decompose the predicted output of a black-box model into additive feature importance scores. Questions have been raised about the accuracy of the produced local additive explanations. We investigate this by studying whether some of the most popular explanation techniques can accurately explain the decisions of linear additive models. We show that even though the explanations generated by these techniques are linear additives, they can fail to provide accurate explanations when explaining linear additive models. In the experiments, we measure the accuracy of additive explanations, as produced by, e.g., LIME and SHAP, along with the non-additive explanations of Local Permutation Importance (LPI) when explaining Linear and Logistic Regression and Gaussian naive Bayes models over 40 tabular datasets. We also investigate the degree to which different factors, such as the number of numerical or categorical or correlated features, the predictive performance of the black-box model, explanation sample size, similarity metric, and the pre-processing technique used on the dataset can directly affect the accuracy of local explanations.

Place, publisher, year, edition, pages
Springer Nature, 2024
National Category
Electrical Engineering, Electronic Engineering, Information Engineering; Computer Systems
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-360215 (URN), 10.1007/s10618-023-00971-3 (DOI), 001067646000001 (), 2-s2.0-85171464862 (Scopus ID)
Note

QC 20250220

Available from: 2025-02-20. Created: 2025-02-20. Last updated: 2025-02-20. Bibliographically approved
Akhavan Rahnama, A. H., Butepage, J. & Boström, H. (2024). Local List-Wise Explanations of LambdaMART. In: Explainable Artificial Intelligence - Second World Conference, xAI 2024, Proceedings. Paper presented at 2nd World Conference on Explainable Artificial Intelligence, xAI 2024, Valletta, Malta, Jul 17 2024 - Jul 19 2024 (pp. 369-392). Springer Nature
2024 (English). In: Explainable Artificial Intelligence - Second World Conference, xAI 2024, Proceedings, Springer Nature, 2024, p. 369-392. Conference paper, Published paper (Refereed)
Abstract [en]

LambdaMART, a potent black-box Learning-to-Rank (LTR) model, has been shown to outperform neural network models across tabular ranking benchmark datasets. However, its lack of transparency challenges its application in many real-world domains. Local list-wise explanation techniques provide scores that explain the importance of the features in a list of documents associated with a query to the prediction of black-box LTR models. This study investigates which list-wise explanation techniques provide the most faithful explanations for LambdaMART models. Several local explanation techniques are evaluated for this, i.e., Greedy Score, RankLIME, EXS, LIRME, LIME, and SHAP. Moreover, a non-LTR explanation technique, called Permutation Importance (PMI), is applied to obtain list-wise explanations of LambdaMART. The techniques are compared based on eight evaluation metrics, i.e., Consistency, Completeness, Validity, Fidelity, ExplainNCDG@10, (In)fidelity, Ground Truth, and Feature Frequency similarity. The evaluation is performed on three benchmark datasets: Yahoo, Microsoft Bing Search (MSLR-WEB10K), and LETOR 4 (MQ2008), along with a synthetic dataset. The experimental results show that no single explanation technique is faithful across all datasets and evaluation metrics. Moreover, the explanation techniques tend to be faithful for different subsets of the evaluation metrics; for example, RankLIME outperforms other explanation techniques with respect to Fidelity and ExplainNCDG, while PMI provides the most faithful explanations with respect to Validity and Completeness. Moreover, we show that explanation sample size and the normalization of feature importance scores in explanations can largely affect the faithfulness of explanation techniques across all datasets.

Place, publisher, year, edition, pages
Springer Nature, 2024
Keywords
Explainability for Learning to Rank, Explainable Artificial Intelligence, Explainable Machine Learning, Local explanations, Local list-wise explanations
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-351924 (URN), 10.1007/978-3-031-63797-1_19 (DOI), 001282234900019 (), 2-s2.0-85200663788 (Scopus ID)
Conference
2nd World Conference on Explainable Artificial Intelligence, xAI 2024, Valletta, Malta, Jul 17 2024 - Jul 19 2024
Note

Part of ISBN 9783031637964

QC 20240823

Available from: 2024-08-19. Created: 2024-08-19. Last updated: 2025-02-20. Bibliographically approved
Akhavan Rahnama, A. H., Bütepage, J. & Boström, H. (2024). Local Point-Wise Explanations of LambdaMART. In: 14th Scandinavian Conference on Artificial Intelligence SCAI 2024. Paper presented at 14th Scandinavian Conference on Artificial Intelligence SCAI 2024, Jönköping University, 10-11 Jun 2024.
2024 (English). In: 14th Scandinavian Conference on Artificial Intelligence SCAI 2024, 2024. Conference paper, Published paper (Refereed)
Abstract [en]

LambdaMART has been shown to outperform neural network models on tabular Learning-to-Rank (LTR) tasks. Similar to the neural network models, LambdaMART is considered a black-box model due to the complexity of the logic behind its predictions. Explanation techniques can help us understand these models. Our study investigates the faithfulness of point-wise explanation techniques when explaining LambdaMART models. Our analysis includes LTR-specific explanation techniques, such as LIRME and EXS, as well as explanation techniques that are not adapted to LTR use cases, such as LIME, KernelSHAP, and LPI. The explanation techniques are evaluated using several measures: Consistency, Fidelity, (In)fidelity, Validity, Completeness, and Feature Frequency (FF) Similarity. Three LTR benchmark datasets are used in the investigation: LETOR 4 (MQ2008), Microsoft Bing Search (MSLR-WEB10K), and the Yahoo! LTR challenge dataset. Our empirical results demonstrate the challenges of accurately explaining LambdaMART: no single explanation technique is consistently faithful across all our evaluation measures and datasets. Furthermore, our results show that LTR-based explanation techniques are not consistently better than their non-LTR-based counterparts across the evaluation measures. Specifically, the LTR-based explanation techniques are consistently the most faithful with respect to (In)fidelity, whereas the non-LTR-specific approaches are shown to frequently provide the most faithful explanations with respect to Validity, Completeness, and FF Similarity.

National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:kth:diva-360218 (URN)
Conference
14th Scandinavian Conference on Artificial Intelligence SCAI 2024, Jönköping University, 10-11 Jun 2024
Note

QC 20250220

Available from: 2025-02-20. Created: 2025-02-20. Last updated: 2025-02-20. Bibliographically approved
Gustavsson, O., Ziegler, T., Welle, M. C., Butepage, J., Varava, A. & Kragic, D. (2022). Cloth manipulation based on category classification and landmark detection. International Journal of Advanced Robotic Systems, 19(4), Article ID 17298806221110445.
2022 (English). In: International Journal of Advanced Robotic Systems, ISSN 1729-8806, E-ISSN 1729-8814, Vol. 19, no 4, article id 17298806221110445. Article in journal (Refereed), Published
Abstract [en]

Cloth manipulation remains a challenging problem for the robotic community. Recently, there has been an increased interest in applying deep learning techniques to problems in the fashion industry. As a result, large annotated data sets for cloth category classification and landmark detection were created. In this work, we leverage these advances in deep learning to perform cloth manipulation. We propose a full cloth manipulation framework that performs category classification and landmark detection based on an image of a garment, followed by a manipulation strategy. The process is performed iteratively to achieve a stretching task where the goal is to bring a crumpled cloth into a stretched out position. We extensively evaluate our learning pipeline and show a detailed evaluation of our framework on different types of garments in a total of 140 recorded and available experiments. Finally, we demonstrate the benefits of training a network on augmented fashion data over using a small robotic-specific data set.

Place, publisher, year, edition, pages
SAGE Publications, 2022
Keywords
Cloth, garment manipulation, classification, vision for robotics, data augmentation
National Category
Robotics and automation
Identifiers
urn:nbn:se:kth:diva-316295 (URN), 10.1177/17298806221110445 (DOI), 000834130100001 (), 2-s2.0-85134880223 (Scopus ID)
Note

QC 20220812

Available from: 2022-08-12. Created: 2022-08-12. Last updated: 2025-02-09. Bibliographically approved
Ziegler, T., Butepage, J., Welle, M. C., Varava, A., Novkovic, T. & Kragic, D. (2020). Fashion Landmark Detection and Category Classification for Robotics. In: Proceedings IEEE International Conference on Autonomous Robot Systems and Competitions (ICARSC 2020). Paper presented at 2020 IEEE International Conference on Autonomous Robot Systems and Competitions, ICARSC 2020, Ponta Delgada, Portugal, April 15-17, 2020.
2020 (English). In: Proceedings IEEE International Conference on Autonomous Robot Systems and Competitions (ICARSC 2020), 2020. Conference paper, Published paper (Refereed)
Abstract [en]

Research on automated, image-based identification of clothing categories and fashion landmarks has recently gained significant interest due to its potential impact on areas such as robotic clothing manipulation, automated clothes sorting and recycling, and online shopping. Several public and annotated fashion datasets have been created to facilitate research advances in this direction. In this work, we make the first step towards leveraging the data and techniques developed for fashion image analysis in vision-based robotic clothing manipulation tasks. We focus on techniques that can generalize from large-scale fashion datasets to less structured, small datasets collected in a robotic lab. Specifically, we propose training data augmentation methods such as elastic warping, and model adjustments such as rotation invariant convolutions to make the model generalize better. Our experiments demonstrate that our approach outperforms state-of-the-art models with respect to clothing category classification and fashion landmark detection when tested on previously unseen datasets. Furthermore, we present experimental results on a new dataset of images where a robot holds different garments, collected in our lab.

National Category
Robotics and automation; Computer graphics and computer vision
Identifiers
urn:nbn:se:kth:diva-282663 (URN), 10.1109/ICARSC49921.2020.9096071 (DOI), 000587899400015 (), 2-s2.0-85085922137 (Scopus ID)
Conference
2020 IEEE International Conference on Autonomous Robot Systems and Competitions, ICARSC 2020, Ponta Delgada, Portugal, April 15-17, 2020
Note

Part of proceedings ISBN 978-1-7281-7078-7

QC 20200930

Available from: 2020-09-30. Created: 2020-09-30. Last updated: 2025-02-05. Bibliographically approved
Bütepage, J., Ghadirzadeh, A., Öztimur Karadag, Ö., Björkman, M. & Kragic, D. (2020). Imitating by Generating: Deep Generative Models for Imitation of Interactive Tasks. Frontiers in Robotics and AI, 7, Article ID 47.
2020 (English). In: Frontiers in Robotics and AI, E-ISSN 2296-9144, Vol. 7, article id 47. Article in journal (Refereed), Published
Abstract [en]

To coordinate actions with an interaction partner requires a constant exchange of sensorimotor signals. Humans acquire these skills in infancy and early childhood mostly by imitation learning and active engagement with a skilled partner. They require the ability to predict and adapt to one's partner during an interaction. In this work we want to explore these ideas in a human-robot interaction setting in which a robot is required to learn interactive tasks from a combination of observational and kinesthetic learning. To this end, we propose a deep learning framework consisting of a number of components for (1) human and robot motion embedding, (2) motion prediction of the human partner, and (3) generation of robot joint trajectories matching the human motion. As long-term motion prediction methods often suffer from the problem of regression to the mean, our technical contribution here is a novel probabilistic latent variable model which does not predict in joint space but in latent space. To test the proposed method, we collect human-human interaction data and human-robot interaction data of four interactive tasks “hand-shake,” “hand-wave,” “parachute fist-bump,” and “rocket fist-bump.” We demonstrate experimentally the importance of predictive and adaptive components as well as low-level abstractions to successfully learn to imitate human behavior in interactive social tasks.

Place, publisher, year, edition, pages
Frontiers Media SA, 2020
Keywords
deep learning, generative models, human-robot interaction, imitation learning, sensorimotor coordination, variational autoencoders
National Category
Robotics and automation
Identifiers
urn:nbn:se:kth:diva-277188 (URN), 10.3389/frobt.2020.00047 (DOI), 000531230100001 (), 33501215 (PubMedID), 2-s2.0-85084053889 (Scopus ID)
Note

QC 20200714

Available from: 2020-07-14. Created: 2020-07-14. Last updated: 2025-02-09. Bibliographically approved
Ringqvist, C., Butepage, J., Kjellström, H. & Hult, H. (2020). Interpolation in Auto Encoders with Bridge Processes. In: Proceedings of the 25th International Conference on Pattern Recognition, ICPR 2020. Paper presented at 25th International Conference on Pattern Recognition, ICPR 2020, Virtual Event / Milan, Italy, January 10-15, 2021. Institute of Electrical and Electronics Engineers (IEEE)
2020 (English). In: Proceedings of the 25th International Conference on Pattern Recognition, ICPR 2020, Institute of Electrical and Electronics Engineers (IEEE), 2020. Conference paper, Published paper (Refereed)
Abstract [en]

Auto encoding models have been extensively studied in recent years. They provide an efficient framework for sample generation, as well as for analysing feature learning. Furthermore, they are efficient in performing interpolations between data-points in semantically meaningful ways. In this paper, we introduce a method for generating sequence samples from auto encoders trained on flattened sequences (e.g., video samples from auto encoders trained to generate a video frame), as well as a canonical, dimension independent method for generating stochastic interpolations. The distribution of interpolation paths is represented as the distribution of a bridge process constructed from an artificial random data generating process in the latent space, having the prior distribution as its invariant distribution.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2020
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:kth:diva-295217 (URN), 10.1109/ICPR48806.2021.9413123 (DOI), 000678409206015 (), 2-s2.0-85104346401 (Scopus ID)
Conference
25th International Conference on Pattern Recognition, ICPR 2020, Virtual Event / Milan, Italy, January 10-15, 2021
Note

Part of book: ISBN 978-1-7281-8808-9

QC 20210519

Available from: 2021-05-18. Created: 2021-05-18. Last updated: 2022-06-25. Bibliographically approved
Zhang, C., Butepage, J., Kjellström, H. & Mandt, S. (2019). Advances in Variational Inference. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(8), 2008-2026
2019 (English). In: IEEE Transactions on Pattern Analysis and Machine Intelligence, ISSN 0162-8828, E-ISSN 1939-3539, Vol. 41, no 8, p. 2008-2026. Article in journal (Refereed), Published
Abstract [en]

Many modern unsupervised or semi-supervised machine learning algorithms rely on Bayesian probabilistic models. These models are usually intractable and thus require approximate inference. Variational inference (VI) lets us approximate a high-dimensional Bayesian posterior with a simpler variational distribution by solving an optimization problem. This approach has been successfully applied to various models and large-scale applications. In this review, we give an overview of recent trends in variational inference. We first introduce standard mean field variational inference, then review recent advances focusing on the following aspects: (a) scalable VI, which includes stochastic approximations, (b) generic VI, which extends the applicability of VI to a large class of otherwise intractable models, such as non-conjugate models, (c) accurate VI, which includes variational models beyond the mean field approximation or with atypical divergences, and (d) amortized VI, which implements the inference over local latent variables with inference networks. Finally, we provide a summary of promising future research directions.

Place, publisher, year, edition, pages
IEEE COMPUTER SOC, 2019
Keywords
Variational inference, approximate Bayesian inference, reparameterization gradients, structured variational approximations, scalable inference, inference networks
National Category
Computational Mathematics
Identifiers
urn:nbn:se:kth:diva-255405 (URN), 10.1109/TPAMI.2018.2889774 (DOI), 000473598800016 (), 30596568 (PubMedID), 2-s2.0-85059288228 (Scopus ID)
Note

QC 20190814

Available from: 2019-08-14. Created: 2019-08-14. Last updated: 2024-08-23. Bibliographically approved
Butepage, J., Cruciani, S., Kokic, M., Welle, M. & Kragic, D. (2019). From Visual Understanding to Complex Object Manipulation. Annual Review of Control, Robotics, and Autonomous Systems, 2, 161-179
2019 (English). In: Annual Review of Control, Robotics, and Autonomous Systems, Vol. 2, p. 161-179. Article, review/survey (Refereed), Published
Abstract [en]

Planning and executing object manipulation requires integrating multiple sensory and motor channels while acting under uncertainty and complying with task constraints. As the modern environment is tuned for human hands, designing robotic systems with similar manipulative capabilities is crucial. Research on robotic object manipulation is divided into smaller communities interested in, e.g., motion planning, grasp planning, sensorimotor learning, and tool use. However, few attempts have been made to combine these areas into holistic systems. In this review, we aim to unify the underlying mechanics of grasping and in-hand manipulation by focusing on the temporal aspects of manipulation, including visual perception, grasp planning and execution, and goal-directed manipulation. Inspired by human manipulation, we envision that an emphasis on the temporal integration of these processes opens the way for human-like object use by robots.

Keywords
grasping; in-hand manipulation; task planning
National Category
Robotics and automation
Identifiers
urn:nbn:se:kth:diva-251654 (URN), 10.1146/annurev-control-053018-023735 (DOI), 000467686900007 (), 2-s2.0-85074408594 (Scopus ID)
Note

QC 20190605

Available from: 2019-05-17. Created: 2019-05-17. Last updated: 2025-02-09. Bibliographically approved
Bütepage, J. (2019). Generative models for action generation and action understanding. (Doctoral dissertation). Stockholm: KTH Royal Institute of Technology
2019 (English). Doctoral thesis, comprehensive summary (Other academic)
Alternative title [sv]
Generativa modeller för generering och förståelse av mänsklig aktivitet
Abstract [en]

The question of how to build intelligent machines raises the question of how to represent the world to enable intelligent behavior. In nature, this representation relies on the interplay between an organism’s sensory input and motor input. Action-perception loops allow many complex behaviors to arise naturally. In this work, we take these sensorimotor contingencies as an inspiration to build robot systems that can autonomously interact with their environment and with humans. The goal is to pave the way for robot systems that can learn motor control in an unsupervised fashion and relate their own sensorimotor experience to observed human actions. By combining action generation and action understanding we hope to facilitate smooth and intuitive interaction between robots and humans in shared work spaces.

To model robot sensorimotor contingencies and human behavior we employ generative models. Since generative models represent a joint distribution over relevant variables, they are flexible enough to cover the range of tasks that we are tackling here. Generative models can represent variables that originate from multiple modalities, model temporal dynamics, incorporate latent variables and represent uncertainty over any variable - all of which are features required to model sensorimotor contingencies. By using generative models, we can predict the temporal development of the variables in the future, which is important for intelligent action selection.

We present two lines of work. Firstly, we will focus on unsupervised learning of motor control with help of sensorimotor contingencies. Based on Gaussian Process forward models we demonstrate how the robot can execute goal-directed actions with the help of planning techniques or reinforcement learning. Secondly, we present a number of approaches to model human activity, ranging from pure unsupervised motion prediction to including semantic action and affordance labels. Here we employ deep generative models, namely Variational Autoencoders, to model the 3D skeletal pose of humans over time and, if required, include semantic information. These two lines of work are then combined to implement physical human-robot interaction tasks.

Our experiments focus on real-time applications, both when it comes to robot experiments and human activity modeling. Since many real-world scenarios do not have access to high-end sensors, we require our models to cope with uncertainty. Additional requirements are data-efficient learning, because of the wear and tear of the robot and human involvement, online employability and operation under safety and compliance constraints. We demonstrate how generative models of sensorimotor contingencies can handle these requirements in our experiments satisfyingly.

Abstract [sv, in English translation]

The question of how to build intelligent machines raises the question of how the world can be represented to enable intelligent behavior. In nature, such a representation builds on the interplay between an organism's sensory impressions and its actions. Couplings between sensory impressions and actions allow many complex behaviors to arise naturally. In this work, we take these sensorimotor couplings as inspiration for building robot systems that can autonomously interact with their environment and with humans. The goal is to pave the way for robot systems that can independently learn to control their movements and relate their own sensorimotor experiences to observed human actions. By relating the robot's movements to the understanding of human actions, we hope to facilitate smooth and intuitive interaction between robots and humans.

To model the robot's sensorimotor couplings and human behavior we use generative models. Since generative models represent a multivariate distribution over relevant variables, they are flexible enough to meet the requirements we set here. Generative models can represent variables from different modalities, model temporal dynamical systems, model latent variables and represent the variance of variables - all of these properties are necessary for modeling sensorimotor couplings. By using generative models we can predict the future development of the variables, which is important for making intelligent decisions.

We present work that goes in two directions. First, we focus on independent learning of motion control with the help of sensorimotor couplings. Based on Gaussian Process forward models, we show how the robot can move toward a goal with the help of planning techniques or reinforcement learning. Second, we present a number of approaches to modeling human activity, ranging from predicting how a person will move to including semantic information. Here we use deep generative models, namely Variational Autoencoders, to model the 3D skeletal pose of humans over time and, if required, include semantic information. These two ideas are then combined to help the robot interact with humans.

Our experiments focus on real-time scenarios, both for robot experiments and for human activity modeling. Since many real-world scenarios do not have access to advanced sensors, we require our models to handle uncertainty. Additional requirements are machine learning models that do not need much data, and systems that work in real time and under safety requirements. We show how generative models of sensorimotor couplings can handle these requirements satisfactorily in our experiments.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2019. p. 41
Series
TRITA-EECS-AVL ; 2019:60
National Category
Robotics and automation
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-256002 (URN)978-91-7873-246-3 (ISBN)
Public defence
2019-09-12, F3, Lindstedtsvägen 26, Stockholm, 13:00 (English)
Funder
EU, Horizon 2020, socsmcs
Note

QC 20190816

Available from: 2019-08-16. Created: 2019-08-15. Last updated: 2025-02-09. Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0001-5344-8042