An Analysis of Goodness of Pronunciation for Child Speech
Department of Electronic Systems, NTNU, Norway.
Department of Electronic Systems, NTNU, Norway.
Department of Electronic Systems, NTNU, Norway.
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent Systems, Speech, Music and Hearing, TMH. Department of Electronic Systems, NTNU, Norway. ORCID iD: 0000-0002-3323-5311
2023 (English). In: Interspeech 2023, International Speech Communication Association, 2023, pp. 4613-4617. Conference paper, Published paper (Refereed)
Abstract [en]

In this paper, we study the use of goodness of pronunciation (GOP) on child speech. We first compare the distributions of GOP scores on several open datasets representing various dimensions of speech variability. We show that the GOP distribution over CMU Kids, corresponding to young age, has larger spread than those on datasets representing other dimensions, i.e., accent, dialect, spontaneity and environmental conditions. We hypothesize that the increased variability of pronunciation in young age may impair the use of traditional mispronunciation detection methods for children. To support this hypothesis, we perform simulated mispronunciation experiments both for children and adults using different variants of the GOP algorithm. We also compare the results to real-case mispronunciations for native children showing that GOP is less effective for child speech than for adult speech.

Place, publisher, year, edition, pages
International Speech Communication Association, 2023. pp. 4613-4617
Keywords [en]
ASR, child speech, data scarcity, GOP, mispronunciation detection and diagnosis, speech assessment
National Category
Computer Sciences; Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:kth:diva-337872; DOI: 10.21437/Interspeech.2023-743; Scopus ID: 2-s2.0-85171580096; OAI: oai:DiVA.org:kth-337872; DiVA, id: diva2:1803868
Conference
24th International Speech Communication Association, Interspeech 2023, Dublin, Ireland, Aug 20 2023 - Aug 24 2023
Note

QC 20231010

Available from: 2023-10-10. Created: 2023-10-10. Last updated: 2023-10-10. Bibliographically approved

Open Access in DiVA

No full text in DiVA

Person

Salvi, Giampiero
