TurnGPT: a Transformer-based Language Model for Predicting Turn-taking in Spoken Dialog
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH. ORCID iD: 0000-0003-3513-4132
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH. ORCID iD: 0000-0002-8579-1790
2020 (English). In: Findings of the Association for Computational Linguistics: EMNLP 2020, Online: Association for Computational Linguistics (ACL), 2020, p. 2981-2990. Conference paper, Published paper (Refereed)
Abstract [en]

Syntactic and pragmatic completeness is known to be important for turn-taking prediction, but so far machine learning models of turn-taking have used such linguistic information in a limited way. In this paper, we introduce TurnGPT, a transformer-based language model for predicting turn-shifts in spoken dialog. The model has been trained and evaluated on a variety of written and spoken dialog datasets. We show that the model outperforms two baselines used in prior work. We also report on an ablation study, as well as attention and gradient analyses, which show that the model is able to utilize the dialog context and pragmatic completeness for turn-taking prediction. Finally, we explore the model’s potential in not only detecting, but also projecting, turn-completions.
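As a minimal illustration of the approach described in the abstract (not code from the paper), a GPT-style language model trained with a special turn-shift token can expose turn-taking predictions as the probability of that token at each decoding step. The vocabulary index, function names, and threshold below are hypothetical:

```python
import math

# Hypothetical vocabulary layout: index 0 is a special turn-shift
# token (often written <ts>); the paper's actual setup may differ.
TS_INDEX = 0

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def turn_shift_probability(logits):
    """P(turn-shift token | dialog context) from next-token logits."""
    return softmax(logits)[TS_INDEX]

def predict_shifts(logits_per_step, threshold=0.5):
    """Token positions where the model predicts a turn completion,
    using a simple probability threshold (an illustrative choice)."""
    return [i for i, lg in enumerate(logits_per_step)
            if turn_shift_probability(lg) >= threshold]
```

In this framing, "projecting" (rather than merely detecting) a turn-completion corresponds to the turn-shift probability rising before the speaker reaches the final word of the turn.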

Place, publisher, year, edition, pages
Online: Association for Computational Linguistics (ACL), 2020. p. 2981-2990
Keywords [en]
turn-taking, generative, natural language processing, dialog systems, conversational systems
National Category
Computer Systems
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-296175
DOI: 10.18653/v1/2020.findings-emnlp.268
Scopus ID: 2-s2.0-85098712394
OAI: oai:DiVA.org:kth-296175
DiVA, id: diva2:1558642
Conference
Findings of the Association for Computational Linguistics, ACL 2020: EMNLP 2020, Virtual, Online, 16 November 2020 through 20 November 2020
Funder
Swedish Foundation for Strategic Research, RIT15-0133
Swedish Research Council, 2013-1403
Note

QC 20220613

Part of proceedings: ISBN 978-195214890-3

Available from: 2021-05-31. Created: 2021-05-31. Last updated: 2022-06-25. Bibliographically approved

Open Access in DiVA

fulltext (595 kB), 402 downloads
File information
File name: FULLTEXT01.pdf
File size: 595 kB
Checksum: SHA-512
3c8dae4417326ef3744f4dd6e185533db373712a6fb1b1944b75be6f0f74516932d9ee70f42c1e0d08b7f91ee20e0fb15b1da2830d9ff5e6e169bc17fbd14efe
Type: fulltext. Mimetype: application/pdf

Other links

Publisher's full text | Scopus | Publisher

Authority records

Ekstedt, Erik; Skantze, Gabriel

Total: 403 downloads
The number of downloads is the sum of all downloads of full texts. It may include, for example, previous versions that are no longer available.

Total: 412 hits