Grounding behaviours with conversational interfaces: effects of embodiment and failures
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH. ORCID iD: 0000-0002-8874-6629
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH. ORCID iD: 0000-0003-2428-0468
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH. ORCID iD: 0000-0002-0397-6442
2021 (English). In: Journal on Multimodal User Interfaces, ISSN 1783-7677, E-ISSN 1783-8738, Vol. 15, no 2, p. 239-254. Article in journal (Refereed). Published
Abstract [en]

Conversational interfaces that interact with humans need to continuously establish, maintain and repair common ground in task-oriented dialogues. Uncertainty, repairs and acknowledgements are expressed in user behaviour in the continuous efforts of the conversational partners to maintain mutual understanding. Users change their behaviour when interacting with systems in different forms of embodiment, which affects the abilities of these interfaces to observe users’ recurrent social signals. Additionally, humans are intellectually biased towards social activity when facing anthropomorphic agents or when presented with subtle social cues. This paper presents two studies examining how humans interact in a referential communication task with wizarded interfaces in different forms of embodiment. In study 1 (N = 30), we test whether humans respond the same way to agents that differ in embodiment and social behaviour. In study 2 (N = 44), we replicate the same task and agents but introduce conversational failures that disrupt the process of grounding. Findings indicate that it is not always favourable for agents to be anthropomorphised or to communicate with non-verbal cues, as human grounding behaviours change when embodiment and failures are manipulated.

Place, publisher, year, edition, pages
Springer Nature, 2021. Vol. 15, no 2, p. 239-254
National Category
Human Computer Interaction
Identifiers
URN: urn:nbn:se:kth:diva-295461; DOI: 10.1007/s12193-021-00366-y; ISI: 000632299500001; Scopus ID: 2-s2.0-85103164370; OAI: oai:DiVA.org:kth-295461; DiVA, id: diva2:1556215
Note

QC 20250331

Available from: 2021-05-20; Created: 2021-05-20; Last updated: 2025-03-31; Bibliographically approved
In thesis
1. Mutual Understanding in Situated Interactions with Conversational User Interfaces: Theory, Studies, and Computation
2022 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

This dissertation presents advances in HCI through a series of studies focusing on task-oriented interactions between humans and between humans and machines. The notion of mutual understanding, also known as grounding in psycholinguistics, is central: in particular, how people establish understanding in conversations and what interactional phenomena are present in that process. Addressing the gap in computational models of understanding, interactions in this dissertation are observed through multisensory input and evaluated with statistical and machine-learning models. As becomes apparent, miscommunication is ordinary in human conversations, and embodied computer interfaces interacting with humans are therefore subject to a large number of conversational failures. Investigating how these interfaces can evaluate human responses to distinguish whether spoken utterances are understood is one of the central contributions of this thesis.

The first papers (Papers A and B) included in this dissertation describe studies on how humans establish understanding incrementally and how they co-produce utterances to resolve misunderstandings in joint-construction tasks. Utilising the same interaction paradigm from such human-human settings, the remaining papers describe collaborative interactions between humans and machines with two central manipulations: embodiment (Papers C, D, E, and F) and conversational failures (Papers D, E, F, and G). The methods used investigate whether embodiment affects grounding behaviours among speakers and what verbal and non-verbal channels are utilised in response to and recovery from miscommunication. For application to robotics and conversational user interfaces, failure detection systems are developed that predict user uncertainty in real time, paving the way for new multimodal computer interfaces that are aware of dialogue breakdown and system failures.

Through the lens of Theory, Studies, and Computation, a comprehensive overview is presented on how mutual understanding has been observed in interactions with humans and between humans and machines. A summary of literature in mutual understanding from psycholinguistics and human-computer interaction perspectives is reported. An overview is also presented on how prior knowledge in mutual understanding has been and can be observed through experimentation and empirical studies, along with perspectives on how knowledge acquired through observation is put into practice through the analysis and development of computational models. Derived from literature and empirical observations, the central thesis of this dissertation is that embodiment and mutual understanding are intertwined in task-oriented interactions, both in successful communication and in situations of miscommunication.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2022. p. xxi, 139
Series
TRITA-EECS-AVL ; 2022-10
Keywords
human-computer interaction, social robots, smart-speakers, multimodal behaviours, social signal processing, common ground, dialogue and discourse, joint-construction tasks, embodiment, conversational failures
National Category
Human Computer Interaction
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-308927 (URN); 978-91-8040-137-1 (ISBN)
Public defence
2022-03-11, https://kth-se.zoom.us/j/62813774919, Kollegiesalen, Brinellvägen 8, Stockholm, 14:00 (English)
Note

QC 20220216

Available from: 2022-02-16; Created: 2022-02-15; Last updated: 2022-06-25; Bibliographically approved

Open Access in DiVA

fulltext (2290 kB), 312 downloads
File information
File name: FULLTEXT01.pdf; File size: 2290 kB; Checksum: SHA-512
a58ea0fa2fe92dbc391a937b8dade798cc39c628481367d84943977b05be83c669400b1139f8764af2849596a5d6f6c900099def0a359ed09b03c2418ec5fc96
Type: fulltext; Mimetype: application/pdf

Authority records

Kontogiorgos, Dimosthenis; Abelho Pereira, André Tiago; Gustafsson, Joakim
