1 - 16 of 16
  • 1.
    Jonell, Patrik
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Bystedt, Mattias
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Fallgren, Per
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Kontogiorgos, Dimosthenis
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Lopes, José David Aguas
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Malisz, Zofia
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Mascarenhas, Samuel
    GAIPS INESC-ID, Lisbon, Portugal.
    Oertel, Catharine
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Raveh, Eran
    Multimodal Computing and Interaction, Saarland University, Germany.
    Shore, Todd
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    FARMI: A Framework for Recording Multi-Modal Interactions. 2018. In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Paris: European Language Resources Association, 2018, p. 3969-3974. Conference paper (Refereed)
    Abstract [en]

    In this paper we present (1) a processing architecture used to collect multi-modal sensor data, both for corpora collection and real-time processing, (2) an open-source implementation thereof, and (3) a use-case where we deploy the architecture in a multi-party deception game, featuring six human players and one robot. The architecture is agnostic to the choice of hardware (e.g. microphones, cameras, etc.) and programming languages, although our implementation is mostly written in Python. In our use-case, different methods of capturing verbal and non-verbal cues from the participants were used. These were processed in real-time and used to inform the robot about the participants' deceptive behaviour. The framework is of particular interest for researchers interested in the collection of multi-party, richly recorded corpora and the design of conversational systems. Moreover, for researchers interested in human-robot interaction, the available modules offer the possibility to easily create both autonomous and Wizard-of-Oz interactions.
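    A hardware- and language-agnostic recording architecture of this kind typically relies on a common message envelope so that heterogeneous sensors can publish into one synchronised stream. The sketch below is illustrative only, not FARMI's actual API; the `SensorMessage` class and its fields are assumptions about what such an envelope might contain.

    ```python
    import json
    import time
    from dataclasses import dataclass, asdict

    @dataclass
    class SensorMessage:
        """Hypothetical envelope for one sample from any sensor."""
        sensor_id: str    # which device produced the sample
        modality: str     # e.g. "audio", "video", "gaze"
        timestamp: float  # shared clock, needed to align streams offline
        payload: dict     # sensor-specific data, opaque to the framework

        def to_json(self) -> str:
            return json.dumps(asdict(self))

        @staticmethod
        def from_json(raw: str) -> "SensorMessage":
            return SensorMessage(**json.loads(raw))

    # A microphone process serialises a sample; a logger or a real-time
    # consumer deserialises it on the other side of the transport.
    msg = SensorMessage("mic-1", "audio", time.time(), {"vad": True, "rms": 0.12})
    assert SensorMessage.from_json(msg.to_json()) == msg
    ```

    Keeping the payload opaque is what makes the envelope sensor-agnostic: only the shared timestamp matters to the framework, while each consumer interprets its own modality.
    
    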

  • 2.
    Jonell, Patrik
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Oertel, Catharine
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Kontogiorgos, Dimosthenis
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Beskow, Jonas
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Gustafson, Joakim
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Crowd-powered design of virtual attentive listeners. 2017. In: 17th International Conference on Intelligent Virtual Agents, IVA 2017, Springer, 2017, Vol. 10498, p. 188-191. Conference paper (Refereed)
    Abstract [en]

    This demo presents a web-based system that generates attentive listening behaviours in a virtual agent acquired from audio-visual recordings of attitudinal feedback behaviour of crowdworkers.

  • 3.
    Jonell, Patrik
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Oertel, Catharine
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Kontogiorgos, Dimosthenis
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Beskow, Jonas
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Gustafson, Joakim
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Crowdsourced Multimodal Corpora Collection Tool. 2018. In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Paris, 2018, p. 728-734. Conference paper (Refereed)
    Abstract [en]

    In recent years, more and more multimodal corpora have been created. To our knowledge there is no publicly available tool which allows for acquiring controlled multimodal data of people in a rapid and scalable fashion. We therefore propose (1) a novel tool which will enable researchers to rapidly gather large amounts of multimodal data spanning a wide demographic range, and (2) an example of how we used this tool for the collection of our "Attentive listener" multimodal corpus. The code is released under an Apache License 2.0 and available as an open-source repository, which can be found at https://github.com/kth-social-robotics/multimodal-crowdsourcing-tool. This tool will allow researchers to set up their own multimodal data collection system quickly and create their own multimodal corpora. Finally, this paper provides a discussion of the advantages and disadvantages of a crowd-sourced data collection tool, especially in comparison to lab-recorded corpora.

  • 4.
    Kontogiorgos, Dimosthenis
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Multimodal Language Grounding for Human-Robot Collaboration: YRRSDS 2019 - Dimosthenis Kontogiorgos. 2019. In: Young Researchers Roundtable on Spoken Dialogue Systems, 2019. Conference paper (Refereed)
  • 5.
    Kontogiorgos, Dimosthenis
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Multimodal language grounding for improved human-robot collaboration: Exploring Spatial Semantic Representations in the Shared Space of Attention. 2017. In: ICMI 2017 - Proceedings of the 19th ACM International Conference on Multimodal Interaction, ACM Digital Library, 2017, Vol. 2017, p. 660-664. Conference paper (Refereed)
    Abstract [en]

    There is an increased interest in artificially intelligent technology that surrounds us and takes decisions on our behalf. This creates the need for such technology to be able to communicate with humans and understand natural language and non-verbal behaviour that may carry information about our complex physical world. Artificial agents today still have little knowledge about the physical space that surrounds us and about the objects or concepts within our attention. We are still lacking computational methods for understanding the context of human conversation that involves objects and locations around us. Can we use multimodal cues from human perception of the real world as an example of language learning for robots? Can artificial agents and robots learn about the physical world by observing how humans interact with it, how they refer to it, and what they attend to during their conversations? This PhD project's focus is on combining spoken language and non-verbal behaviour extracted from multi-party dialogue in order to increase context awareness and spatial understanding for artificial agents.

  • 6.
    Kontogiorgos, Dimosthenis
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Abelho Pereira, André Tiago
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Andersson, Olle
    KTH.
    Koivisto, Marco
    KTH.
    Gonzalez Rabal, Elena
    KTH.
    Vartiainen, Ville
    KTH.
    Gustafson, Joakim
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    The effects of anthropomorphism and non-verbal social behaviour in virtual assistants. 2019. In: IVA 2019 - Proceedings of the 19th ACM International Conference on Intelligent Virtual Agents, Association for Computing Machinery (ACM), 2019, p. 133-140. Conference paper (Refereed)
    Abstract [en]

    The adoption of virtual assistants is growing at a rapid pace. However, these assistants are not optimised to simulate key social aspects of human conversational environments. Humans are intellectually biased toward social activity when facing anthropomorphic agents or when presented with subtle social cues. In this paper, we test whether humans respond the same way to assistants in guided tasks, when in different forms of embodiment and social behaviour. In a within-subject study (N=30), we asked subjects to engage in dialogue with a smart speaker and a social robot. We observed shifting of interactive behaviour, as shown in behavioural and subjective measures. Our findings indicate that it is not always favourable for agents to be anthropomorphised or to communicate with nonverbal cues. We found a trade-off between task performance and perceived sociability when controlling for anthropomorphism and social behaviour.

  • 7.
    Kontogiorgos, Dimosthenis
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Abelho Pereira, André Tiago
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Gustafson, Joakim
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    The Trade-off between Interaction Time and Social Facilitation with Collaborative Social Robots. 2019. In: The Challenges of Working on Social Robots that Collaborate with People, 2019. Conference paper (Refereed)
    Abstract [en]

    The adoption of social robots and conversational agents is growing at a rapid pace. These agents, however, are still not optimised to simulate key social aspects of situated human conversational environments. Humans are intellectually biased towards social activity when facing more anthropomorphic agents or when presented with subtle social cues. In this paper, we discuss the effects of simulating anthropomorphism and non-verbal social behaviour in social robots and its implications for human-robot collaborative guided tasks. Our results indicate that it is not always favourable for agents to be anthropomorphised or to communicate with nonverbal behaviour. We found a clear trade-off between interaction time and social facilitation when controlling for anthropomorphism and social behaviour.

  • 8.
    Kontogiorgos, Dimosthenis
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Avramova, Vanya
    KTH.
    Alexanderson, Simon
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Jonell, Patrik
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Oertel, Catharine
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH. Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland.
    Beskow, Jonas
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Skantze, Gabriel
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Gustafson, Joakim
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    A Multimodal Corpus for Mutual Gaze and Joint Attention in Multiparty Situated Interaction. 2018. In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Paris, 2018, p. 119-127. Conference paper (Refereed)
    Abstract [en]

    In this paper we present a corpus of multiparty situated interaction where participants collaborated on moving virtual objects on a large touch screen. A moderator facilitated the discussion and directed the interaction. The corpus contains recordings of a variety of multimodal data, in that we captured speech, eye gaze and gesture data using a multisensory setup (wearable eye trackers, motion capture and audio/video). Furthermore, in the description of the multimodal corpus, we investigate four different types of social gaze: referential gaze, joint attention, mutual gaze and gaze aversion by both perspectives of a speaker and a listener. We annotated the groups’ object references during object manipulation tasks and analysed the group’s proportional referential eye-gaze with regards to the referent object. When investigating the distributions of gaze during and before referring expressions we could corroborate the differences in time between speakers’ and listeners’ eye gaze found in earlier studies. This corpus is of particular interest to researchers who are interested in social eye-gaze patterns in turn-taking and referring language in situated multi-party interaction.

  • 9.
    Kontogiorgos, Dimosthenis
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Manikas, Konstantinos
    University of Copenhagen.
    Towards identifying programming expertise with the use of physiological measures. 2015. Conference paper (Refereed)
    Abstract [en]

    In this position paper we propose means of measuring programming expertise in novice and expert programmers. Our approach is to measure the cognitive load of programmers while they assess Java/Python code in accordance with their experience in programming. Our hypothesis is that expert programmers exhibit smaller pupillary dilation during programming problem-solving tasks. We aim to evaluate our hypothesis using the EMIP Distributed Data Collection in order to confirm or reject our approach.
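    Testing whether one group's pupillary dilation is smaller than another's comes down to comparing two independent samples of pupil diameters. A minimal sketch of such a comparison follows; the sample values and the `welch_t` helper are illustrative assumptions, not the paper's data or analysis pipeline.

    ```python
    from math import sqrt
    from statistics import mean, stdev

    def welch_t(a, b):
        """Welch's t-statistic for two independent samples with
        possibly unequal variances."""
        na, nb = len(a), len(b)
        va, vb = stdev(a) ** 2, stdev(b) ** 2
        return (mean(a) - mean(b)) / sqrt(va / na + vb / nb)

    # Illustrative pupil-diameter samples (mm) during a code-reading task.
    novices = [4.1, 4.3, 4.5, 4.2, 4.4]
    experts = [3.6, 3.7, 3.5, 3.8, 3.6]

    t = welch_t(novices, experts)
    # A clearly positive t is consistent with the hypothesis that novices
    # show larger pupillary dilation (higher cognitive load) than experts.
    assert t > 0
    ```

    In practice the t-statistic would be converted to a p-value against a t-distribution with Welch-adjusted degrees of freedom; the sketch only shows the direction of the effect.
    
    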

  • 10.
    Kontogiorgos, Dimosthenis
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
    Pereira, Andre
    Gustafson, Joakim
    Estimating Uncertainty in Task Oriented Dialogue. 2019. In: ICMI 2019 - Proceedings of the 2019 International Conference on Multimodal Interaction, 2019. Conference paper (Refereed)
    Abstract [en]

    Situated multimodal systems that instruct humans need to handle user uncertainties, as expressed in behaviour, and plan their actions accordingly. Speakers' decision to reformulate or repair previous utterances depends greatly on the listeners' signals of uncertainty. In this paper, we estimate uncertainty in a situated guided task, as expressed in non-verbal cues by the listener, and predict whether the speaker will reformulate their utterance. We use a corpus where people instruct how to assemble furniture, and extract their multimodal features. While uncertainty is in some cases verbally expressed, most instances are expressed non-verbally, which indicates the importance of multimodal approaches. In this work, we present a model for uncertainty estimation. Our findings indicate that uncertainty estimation from non-verbal cues works well, and can exceed human annotator performance when verbal features cannot be perceived.
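    Predicting reformulation from listener cues can be framed as a binary classifier over per-turn non-verbal features. The sketch below uses a logistic score with hand-picked weights purely for illustration; the feature names and weights are assumptions, not the fitted model from the paper.

    ```python
    from math import exp

    # Hypothetical non-verbal features per listener turn; illustrative weights.
    WEIGHTS = {"gaze_aversion": 1.4, "long_pause": 1.1, "head_tilt": 0.6}
    BIAS = -1.8

    def uncertainty_probability(features: dict) -> float:
        """Logistic score: probability that the listener is uncertain,
        i.e. that the speaker should reformulate the last instruction."""
        z = BIAS + sum(WEIGHTS[k] * v for k, v in features.items())
        return 1.0 / (1.0 + exp(-z))

    def should_reformulate(features: dict, threshold: float = 0.5) -> bool:
        return uncertainty_probability(features) >= threshold

    # Listener averts gaze and pauses long before acting: likely uncertain.
    assert should_reformulate({"gaze_aversion": 1, "long_pause": 1, "head_tilt": 0})
    # Listener acts promptly with steady gaze: no repair needed.
    assert not should_reformulate({"gaze_aversion": 0, "long_pause": 0, "head_tilt": 0})
    ```

    A real system would learn the weights from annotated corpus data and likely use richer temporal features, but the decision structure — score the cues, threshold, trigger a repair — is the same.
    
    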

  • 11.
    Kontogiorgos, Dimosthenis
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Sibirtseva, Elena
    KTH, School of Electrical Engineering and Computer Science (EECS), Robotics, perception and learning, RPL.
    Pereira, André
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Skantze, Gabriel
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Gustafson, Joakim
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Multimodal reference resolution in collaborative assembly tasks. 2018. In: Multimodal reference resolution in collaborative assembly tasks, ACM Digital Library, 2018. Conference paper (Refereed)
  • 12.
    Kontogiorgos, Dimosthenis
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Skantze, Gabriel
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Abelho Pereira, André Tiago
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Gustafson, Joakim
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    The Effects of Embodiment and Social Eye-Gaze in Conversational Agents. 2019. In: Proceedings of the 41st Annual Conference of the Cognitive Science Society (CogSci), 2019. Conference paper (Refereed)
    Abstract [en]

    The adoption of conversational agents is growing at a rapid pace. Agents, however, are not optimised to simulate key social aspects of situated human conversational environments. Humans are intellectually biased towards social activity when facing more anthropomorphic agents or when presented with subtle social cues. In this work, we explore the effects of simulating anthropomorphism and social eye-gaze in three conversational agents. We tested whether subjects' visual attention would be similar to agents in different forms of embodiment and social eye-gaze. In a within-subject situated interaction study (N=30), we asked subjects to engage in task-oriented dialogue with a smart speaker and two variations of a social robot. We observed shifting of interactive behaviour by human users, as shown in differences in behavioural and objective measures. With a trade-off in task performance, social facilitation is higher with more anthropomorphic social agents when performing the same task.

  • 13.
    Manikas, Konstantinos
    et al.
    Department of Computer Science, University of Copenhagen, Denmark.
    Kontogiorgos, Dimosthenis
    Department of Computer Science, University of Copenhagen, Denmark.
    Characterizing Software Activity: The Influence of Software to Ecosystem Health. 2015. In: ECSAW '15 Proceedings of the 2015 European Conference on Software Architecture Workshops, ACM Digital Library, 2015. Conference paper (Refereed)
    Abstract [en]

    The health of a software ecosystem reflects the ability of the ecosystem to endure and remain variable and productive over time. Measurements of health in a software ecosystem are applied to report on how the ecosystem is evolving, evaluate changes, and predict future states. In this study we investigate the influence of software on the overall health of the ecosystem. We propose an approach of measuring the activity of the ecosystem over time and identifying its influence on health. We do this in two ways: (i) we study the evolution of the software network over time to identify changes in the structure of the software network and investigate whether they relate to general changes in the ecosystem; (ii) we propose the identification of the influence of the independent software components on ecosystem health at an activity level. We do so by defining keystone and dominator activities and propose tentative means of measuring them. We apply our proposed approach to the platform of the Apache Cordova ecosystem, an ecosystem with a community-based platform and independent contributions. Our analysis identifies two points in time where the ecosystem is under major change. These points are confirmed independently by both the measures of software network and keystone and dominator activities.
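    Measuring ecosystem activity over time and spotting points of major change can be reduced to bucketing contribution events into fixed windows and flagging large jumps between adjacent windows. The sketch below is a simplified illustration of that idea; the window size, the ratio threshold, and the toy data are assumptions, not the paper's actual metrics.

    ```python
    def activity_series(event_days, window=30):
        """Bucket event timestamps (in days) into fixed-size windows
        and count events per window."""
        if not event_days:
            return []
        counts = [0] * (max(event_days) // window + 1)
        for day in event_days:
            counts[day // window] += 1
        return counts

    def change_points(counts, ratio=2.0):
        """Flag windows where activity jumps or drops by more than
        `ratio` times relative to the previous window."""
        flagged = []
        for i in range(1, len(counts)):
            prev, cur = counts[i - 1], counts[i]
            if prev and (cur >= ratio * prev or cur <= prev / ratio):
                flagged.append(i)
        return flagged

    # Toy commit-day data: steady activity, then a surge in the third window.
    days = [1, 5, 12, 20, 33, 40, 55, 61, 62, 63, 64, 65, 66, 70, 75]
    counts = activity_series(days, window=30)   # [4, 3, 8]
    assert change_points(counts) == [2]
    ```

    A study like the one above would additionally attribute such changes to individual components (keystones or dominators) rather than treating the ecosystem as a single stream.
    
    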

  • 14.
    Oertel, Catharine
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Jonell, Patrik
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Kontogiorgos, Dimosthenis
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Mendelson, Joseph
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Beskow, Jonas
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Gustafson, Joakim
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
    Crowdsourced design of artificial attentive listeners. 2017. Conference paper (Refereed)
  • 15.
    Sibirtseva, Elena
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Robotics, perception and learning, RPL.
    Kontogiorgos, Dimosthenis
    KTH, School of Electrical Engineering and Computer Science (EECS), Robotics, perception and learning, RPL.
    Nykvist, Olov
    KTH, School of Electrical Engineering and Computer Science (EECS), Robotics, perception and learning, RPL.
    Karaoguz, Hakan
    KTH, School of Electrical Engineering and Computer Science (EECS), Robotics, perception and learning, RPL.
    Leite, Iolanda
    KTH, School of Electrical Engineering and Computer Science (EECS), Robotics, perception and learning, RPL.
    Gustafson, Joakim
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Kragic, Danica
    KTH, School of Electrical Engineering and Computer Science (EECS), Robotics, perception and learning, RPL.
    A Comparison of Visualisation Methods for Disambiguating Verbal Requests in Human-Robot Interaction. 2018. In: 2018 27th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), 2018. Conference paper (Refereed)
    Abstract [en]

    Picking up objects requested by a human user is a common task in human-robot interaction. When multiple objects match the user's verbal description, the robot needs to clarify which object the user is referring to before executing the action. Previous research has focused on perceiving the user's multimodal behaviour to complement verbal commands, or on minimising the number of follow-up questions to reduce task time. In this paper, we propose a system for reference disambiguation based on visualisation and compare three methods to disambiguate natural language instructions. In a controlled experiment with a YuMi robot, we investigated real-time augmentations of the workspace in three conditions - head-mounted display, projector, and a monitor as the baseline - using objective measures such as time and accuracy, and subjective measures like engagement, immersion, and display interference. Significant differences were found in accuracy and engagement between the conditions, but no differences were found in task time. Despite the higher error rates in the head-mounted display condition, participants found that modality more engaging than the other two, but overall showed preference for the projector condition over the monitor and head-mounted display conditions.

  • 16.
    Stefanov, Kalin
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Salvi, Giampiero
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Kontogiorgos, Dimosthenis
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Kjellström, Hedvig
    KTH, School of Electrical Engineering and Computer Science (EECS), Robotics, Perception and Learning, RPL.
    Beskow, Jonas
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH.
    Modeling of Human Visual Attention in Multiparty Open-World Dialogues. 2019. In: ACM Transactions on Human-Robot Interaction, ISSN 2573-9522, Vol. 8, no 2, article id UNSP 8. Article in journal (Refereed)
    Abstract [en]

    This study proposes, develops, and evaluates methods for modeling the eye-gaze direction and head orientation of a person in multiparty open-world dialogues, as a function of low-level communicative signals generated by his/her interlocutors. These signals include speech activity, eye-gaze direction, and head orientation, all of which can be estimated in real time during the interaction. By utilizing these signals and novel data representations suitable for the task and context, the developed methods can generate plausible candidate gaze targets in real time. The methods are based on Feedforward Neural Networks and Long Short-Term Memory Networks. The proposed methods are developed using several hours of unrestricted interaction data and their performance is compared with a heuristic baseline method. The study offers an extensive evaluation of the proposed methods that investigates the contribution of different predictors to the accurate generation of candidate gaze targets. The results show that the methods can accurately generate candidate gaze targets when the person being modeled is in a listening state. However, when the person being modeled is in a speaking state, the proposed methods yield significantly lower performance.
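    A heuristic baseline of the kind such gaze models are compared against can be stated very compactly: assume the modeled person looks at whoever is currently speaking, and keeps looking at the last target during silence. This rule is a plausible illustration, not the authors' exact baseline.

    ```python
    def predict_gaze_targets(speech_activity, initial_target="tablet"):
        """Heuristic gaze prediction (illustrative): follow the active
        speaker; during silence, keep the previous gaze target.

        speech_activity: list of per-frame dicts, interlocutor -> is_speaking.
        """
        targets, current = [], initial_target
        for frame in speech_activity:
            speakers = [who for who, active in frame.items() if active]
            if speakers:
                current = speakers[0]  # ties broken by insertion order
            targets.append(current)
        return targets

    frames = [
        {"A": True,  "B": False},
        {"A": False, "B": False},  # silence: target persists
        {"A": False, "B": True},
    ]
    assert predict_gaze_targets(frames) == ["A", "A", "B"]
    ```

    The reported asymmetry between listening and speaking states is intuitive under this framing: a listener's gaze tracks the speaker fairly predictably, while a speaker distributes gaze among addressees in ways a speech-activity signal alone cannot capture.
    
    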
