Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Multimodal Continuous Turn-Taking Prediction Using Multiscale RNNs
KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.ORCID iD: 0000-0002-8579-1790
2018 (English)In: ICMI 2018 - Proceedings of the 2018 International Conference on Multimodal Interaction, 2018, p. 186-190Conference paper, Published paper (Refereed)
Abstract [en]

In human conversational interactions, turn-taking exchanges can be coordinated using cues from multiple modalities. To design spoken dialog systems that can conduct fluid interactions it is desirable to incorporate cues from separate modalities into turn-taking models. We propose that there is an appropriate temporal granularity at which modalities should be modeled. We design a multiscale RNN architecture to model modalities at separate timescales in a continuous manner. Our results show that modeling linguistic and acoustic features at separate temporal rates can be beneficial for turn-taking modeling. We also show that our approach can be used to incorporate gaze features into turn-taking models.

Place, publisher, year, edition, pages
2018. p. 186-190
National Category
Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:kth:diva-241286DOI: 10.1145/3242969.3242997ISI: 000457913100027Scopus ID: 2-s2.0-85056659938ISBN: 9781450356923 (print)OAI: oai:DiVA.org:kth-241286DiVA, id: diva2:1280135
Conference
20th ACM International Conference on Multimodal Interaction, ICMI 2018, University of Colorado BoulderBoulder, United States, 16 October 2018 through 20 October 2018)
Note

QC 20190215

Available from: 2019-01-18 Created: 2019-01-18 Last updated: 2019-02-22Bibliographically approved

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full textScopus

Authority records BETA

Skantze, Gabriel

Search in DiVA

By author/editor
Skantze, Gabriel
By organisation
Speech, Music and Hearing, TMH
Language Technology (Computational Linguistics)

Search outside of DiVA

GoogleGoogle Scholar

doi
isbn
urn-nbn

Altmetric score

doi
isbn
urn-nbn
Total: 11 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf