Publications (3 of 3)
Qian, L. & Skantze, G. (2024). Joint Learning of Context and Feedback Embeddings in Spoken Dialogue. In: Interspeech 2024. Paper presented at 25th Interspeech Conference 2024, Kos Island, Greece, Sep 1 2024 - Sep 5 2024 (pp. 2955-2959). International Speech Communication Association
Joint Learning of Context and Feedback Embeddings in Spoken Dialogue
2024 (English). In: Interspeech 2024, International Speech Communication Association, 2024, pp. 2955-2959. Conference paper, Published paper (Refereed)
Abstract [en]

Short feedback responses, such as backchannels, play an important role in spoken dialogue. So far, most of the modeling of feedback responses has focused on their timing, often neglecting how their lexical and prosodic form influence their contextual appropriateness and conversational function. In this paper, we investigate the possibility of embedding short dialogue contexts and feedback responses in the same representation space using a contrastive learning objective. In our evaluation, we primarily focus on how such embeddings can be used as a context-feedback appropriateness metric and thus for feedback response ranking in U.S. English dialogues. Our results show that the model outperforms humans given the same ranking task and that the learned embeddings carry information about the conversational function of feedback responses.
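The contrastive setup described in the abstract can be illustrated with a minimal sketch: two encoders map dialogue contexts and feedback responses into a shared space, and an InfoNCE-style loss pulls matching (context, feedback) pairs together so that cosine similarity can serve as an appropriateness score for ranking candidate feedback responses. The encoder choice, embedding dimensionality, and temperature below are assumptions for illustration, not the paper's actual configuration.

```python
# Minimal sketch of a joint context-feedback embedding space trained with a
# contrastive (InfoNCE-style) objective. Hypothetical toy encoders; the paper
# presumably uses pretrained text/speech encoders instead.
import torch
import torch.nn.functional as F
from torch import nn


class ContextFeedbackEmbedder(nn.Module):
    def __init__(self, vocab_size: int = 30522, dim: int = 256):
        super().__init__()
        self.context_enc = nn.EmbeddingBag(vocab_size, dim)
        self.feedback_enc = nn.EmbeddingBag(vocab_size, dim)

    def forward(self, context_ids, feedback_ids):
        # L2-normalize so dot products are cosine similarities.
        c = F.normalize(self.context_enc(context_ids), dim=-1)
        f = F.normalize(self.feedback_enc(feedback_ids), dim=-1)
        return c, f


def contrastive_loss(c, f, temperature: float = 0.07):
    # Similarity matrix between all contexts and all feedbacks in the batch;
    # the diagonal holds the true (context, feedback) pairs.
    logits = c @ f.t() / temperature
    targets = torch.arange(c.size(0), device=c.device)
    # Symmetric InfoNCE: rank feedbacks given a context and vice versa.
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2


def rank_feedbacks(c_single, f_candidates):
    # Context-feedback appropriateness as cosine similarity in the shared
    # space; higher score = more appropriate feedback for this context.
    scores = (c_single @ f_candidates.t()).squeeze(0)
    return scores.argsort(descending=True)
```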

Place, publisher, year, edition, pages
International Speech Communication Association, 2024
Keywords
backchannel, contrastive learning, conversational systems, dialogue, feedback, function, representation learning, unsupervised learning
National Category
Natural Language Processing; Computer Sciences
Identifiers
urn:nbn:se:kth:diva-358871 (URN), 10.21437/Interspeech.2024-1082 (DOI), 2-s2.0-85214790716 (Scopus ID)
Conference
25th Interspeech Conference 2024, Kos Island, Greece, Sep 1 2024 - Sep 5 2024
Projects
tmh_feedback
Note

QC 20250127

Available from: 2025-01-23. Created: 2025-01-23. Last updated: 2025-01-28. Bibliographically approved
Willemsen, B., Qian, L. & Skantze, G. (2023). Resolving References in Visually-Grounded Dialogue via Text Generation. In: David Schlangen, Svetlana Stoyanchev, Shafiq Joty, Ondrej Dusek, Casey Kennington, Malihe Alikhani (Ed.), Proceedings of the 24th Meeting of the Special Interest Group on Discourse and Dialogue. Paper presented at The 24th Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL 2023), Prague, Czechia, 11 - 15 September (pp. 457-469). Prague, Czechia: Association for Computational Linguistics (ACL)
Resolving References in Visually-Grounded Dialogue via Text Generation
2023 (English). In: Proceedings of the 24th Meeting of the Special Interest Group on Discourse and Dialogue / [ed] David Schlangen, Svetlana Stoyanchev, Shafiq Joty, Ondrej Dusek, Casey Kennington, Malihe Alikhani, Prague, Czechia: Association for Computational Linguistics (ACL), 2023, pp. 457-469. Conference paper, Published paper (Refereed)
Abstract [en]

Vision-language models (VLMs) have been shown to be effective at image retrieval based on simple text queries, but text-image retrieval based on conversational input remains a challenge. Consequently, if we want to use VLMs for reference resolution in visually-grounded dialogue, the discourse processing capabilities of these models need to be augmented. To address this issue, we propose fine-tuning a causal large language model (LLM) to generate definite descriptions that summarize coreferential information found in the linguistic context of references. We then use a pretrained VLM to identify referents based on the generated descriptions, zero-shot. We evaluate our approach on a manually annotated dataset of visually-grounded dialogues and achieve results that, on average, exceed the performance of the baselines we compare against. Furthermore, we find that using referent descriptions based on larger context windows has the potential to yield higher returns.
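The two-stage pipeline in the abstract lends itself to a rough sketch: a fine-tuned causal LLM first rewrites the dialogue context into a definite description of the referent, and a pretrained vision-language model then scores candidate images against that description, zero-shot. The sketch below covers only the second stage, using CLIP as the retrieval model; the model name, file paths, and the example description are illustrative assumptions rather than the paper's actual setup.

```python
# Zero-shot referent identification from a generated definite description,
# sketched with an off-the-shelf CLIP checkpoint (an assumption, not the
# paper's actual model choice).
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

# Stage 2 only: assume the (hypothetical) fine-tuned LLM has already turned
# the preceding dialogue turns into this definite description.
generated_description = "the small blue mug on the left side of the table"

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Hypothetical candidate referents (one image crop per candidate object).
candidate_images = [Image.open(p) for p in ["cand_0.png", "cand_1.png", "cand_2.png"]]

inputs = processor(text=[generated_description], images=candidate_images,
                   return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image holds one description-image similarity per candidate;
# the highest-scoring candidate is taken as the resolved referent.
scores = outputs.logits_per_image.squeeze(-1)
predicted_referent = int(scores.argmax())
print(f"Resolved referent: candidate {predicted_referent}")
```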

Place, publisher, year, edition, pages
Prague, Czechia: Association for Computational Linguistics (ACL), 2023
National Category
Natural Language Processing
Research subject
Computer Science; Human-computer Interaction
Identifiers
urn:nbn:se:kth:diva-339204 (URN), 001274996900041 ()
Conference
The 24th Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL 2023), Prague, Czechia, 11 - 15 September
Projects
tmh_grounding
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Note

QC 20231106

Part of ISBN 979-8-89176-028-8

Available from: 2023-11-04. Created: 2023-11-04. Last updated: 2025-02-07. Bibliographically approved
Qian, L. (2023). The Future of Designing Spoken Dialogue Systems and Analyzing Written Conversations. In: YRRSDS 2023 - 19th Annual Meeting of the Young Researchers' Roundtable on Spoken Dialogue Systems, Proceedings of the Workshop. Paper presented at 19th Annual Meeting of the Young Researchers' Roundtable on Spoken Dialogue Systems, YRRSDS 2023, Prague, Czechia, Sep 11 2023 - Sep 12 2023 (pp. 33-34). Association for Computational Linguistics (ACL)
The Future of Designing Spoken Dialogue Systems and Analyzing Written Conversations
2023 (English). In: YRRSDS 2023 - 19th Annual Meeting of the Young Researchers' Roundtable on Spoken Dialogue Systems, Proceedings of the Workshop, Association for Computational Linguistics (ACL), 2023, pp. 33-34. Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
Association for Computational Linguistics (ACL), 2023
National Category
Languages and Literature
Identifiers
urn:nbn:se:kth:diva-343747 (URN), 2-s2.0-85184801881 (Scopus ID)
Conference
19th Annual Meeting of the Young Researchers' Roundtable on Spoken Dialogue Systems, YRRSDS 2023, Prague, Czechia, Sep 11 2023 - Sep 12 2023
Note

QC 20240304

Part of ISBN 9781952148255

Available from: 2024-02-22. Created: 2024-02-22. Last updated: 2024-03-04. Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0002-7885-5477
