Detecting Repetitions in Spoken Dialogue Systems Using Phonetic Distances
KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH. ORCID iD: 0000-0002-8773-9216
KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication and Technology. ORCID iD: 0000-0002-3323-5311
KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication and Technology. ORCID iD: 0000-0002-8579-1790
2015 (English). In: INTERSPEECH-2015, 2015, p. 1805-1809. Conference paper, Published paper (Refereed)
Abstract [en]

Repetitions in Spoken Dialogue Systems can be a symptom of problematic communication. Such repetitions are often caused by speech recognition errors, which in turn make it harder to use the output of the speech recognizer to detect them. In this paper, we combine the alignment score obtained using phonetic distances with dialogue-related features to improve repetition detection. To evaluate the proposed method, we compare several alignment techniques, ranging from edit distance to DTW-based distance, previously used in spoken-term detection tasks. We also compare two methods of computing the phonetic distance: the first uses the phoneme sequence, and the second uses the distance between phone posterior vectors. Two datasets were used in this evaluation: a bus-schedule information system (in English) and a call-routing system (in Swedish). The results show that approaches using phoneme distances outperform approaches using Levenshtein distances between ASR outputs for repetition detection.
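
To make the two families of alignment named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It shows a Levenshtein distance that works on any token sequence (words or phonemes) and a basic DTW cost over sequences of posterior vectors. The function names, the unit edit costs, the Euclidean frame distance, the length normalization, and the ARPAbet-style transcriptions are all assumptions chosen for illustration; the record does not specify them.

```python
from math import dist  # Euclidean distance, Python 3.8+


def edit_distance(a, b, sub_cost=lambda x, y: 0.0 if x == y else 1.0):
    """Levenshtein distance between two token sequences.

    Insertions and deletions cost 1; substitutions use `sub_cost`
    (unit cost by default, an assumption for this sketch).
    """
    prev = [float(j) for j in range(len(b) + 1)]
    for i, x in enumerate(a, start=1):
        curr = [float(i)]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1.0,                # delete x
                curr[j - 1] + 1.0,            # insert y
                prev[j - 1] + sub_cost(x, y)  # match or substitute
            ))
        prev = curr
    return prev[-1]


def normalized(a, b):
    """Edit distance divided by the longer length (assumed normalization)."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def dtw_distance(xs, ys, frame_dist=dist):
    """Accumulated DTW cost between two sequences of vectors."""
    inf = float("inf")
    n, m = len(xs), len(ys)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_dist(xs[i - 1], ys[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch ys frame
                                 cost[i][j - 1],      # stretch xs frame
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


# Hypothetical case: the user repeats "forbes", the recognizer outputs
# "four bus" the second time.
words_1, words_2 = ["forbes"], ["four", "bus"]
# Hypothetical ARPAbet-style transcriptions of the two hypotheses.
phones_1 = ["F", "AO", "R", "B", "Z"]
phones_2 = ["F", "AO", "R", "B", "AH", "S"]

word_score = normalized(words_1, words_2)      # no word in common
phone_score = normalized(phones_1, phones_2)   # most phonemes align

# Toy 2-class posterior vectors: the second sequence holds its first
# frame twice, which DTW absorbs at no cost.
post_1 = [[1.0, 0.0], [0.0, 1.0]]
post_2 = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
dtw_score = dtw_distance(post_1, post_2)
```

In this toy case the word-level score is at its maximum because the two hypotheses share no word, while the phoneme-level score stays low because four of the phonemes align. This is the kind of situation in which a phonetic alignment score can still signal a repetition after a recognition error.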

Place, publisher, year, edition, pages
2015. p. 1805-1809
National Category
Computer Sciences; Natural Language Processing
Identifiers
URN: urn:nbn:se:kth:diva-180405
ISI: 000380581600375
Scopus ID: 2-s2.0-84959138120
ISBN: 978-1-5108-1790-6 (print)
OAI: oai:DiVA.org:kth-180405
DiVA, id: diva2:893732
Conference
INTERSPEECH-2015, Dresden, Germany
Note

QC 20160216

Available from: 2016-01-13. Created: 2016-01-13. Last updated: 2025-02-01. Bibliographically approved

Open Access in DiVA

No full text in DiVA


Authority records

Lopes, José; Salvi, Giampiero; Skantze, Gabriel; Gustafson, Joakim; Meena, Raveesh
