Perceptual and Task-Oriented Assessment of a Semantic Metric for ASR Evaluation
Department of Electronic Systems, NTNU, Norway.
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent Systems, Speech, Music and Hearing, TMH. Department of Electronic Systems, NTNU, Norway. ORCID iD: 0000-0002-3323-5311
Department of Electronic Systems, NTNU, Norway.
2023 (English). In: Interspeech 2023, International Speech Communication Association, 2023, pp. 2158-2162. Conference paper, Published paper (Refereed)
Abstract [en]

Automatic speech recognition (ASR) systems have become a vital part of everyday life through their many applications. Despite this progress, the most common evaluation method for ASR systems remains word error rate (WER). WER gives no information about the severity of errors, which strongly affects practical performance. We therefore compare a semantics-based metric, Aligned Semantic Distance (ASD), against WER and demonstrate its advantage in two respects. First, we conduct a survey in which participants score pairs of reference texts and ASR transcriptions. A correlation analysis shows that ASD correlates more strongly with the human evaluation scores than WER does. We also explore the feasibility of predicting human perception using ASD. Second, we demonstrate that ASD is a more effective indicator than WER of performance on downstream NLP tasks such as named entity recognition and sentiment classification.

Place, publisher, year, edition, pages
International Speech Communication Association, 2023. pp. 2158-2162
Keywords [en]
ASR evaluation metric, semantic context, user perception
Identifiers
URN: urn:nbn:se:kth:diva-337837
DOI: 10.21437/Interspeech.2023-1778
ISI: 001186650302068
Scopus ID: 2-s2.0-85171598286
OAI: oai:DiVA.org:kth-337837
DiVA, id: diva2:1803471
Conference
24th International Speech Communication Association, Interspeech 2023, August 20-24, 2023, Dublin, Ireland
Note

QC 20241015

Available from: 2023-10-09. Created: 2023-10-09. Last updated: 2025-02-01. Bibliographically approved.

Open Access in DiVA

No full text in DiVA


Person

Salvi, Giampiero
