Voice Transformations For Improving Children's Speech Recognition In A Publicly Available Dialogue System
KTH, Former Departments (before 2005), Speech, Music and Hearing. Telia Research AB, Sweden. ORCID iD: 0000-0002-0397-6442
KTH, Former Departments (before 2005), Speech, Music and Hearing.
2002 (English). In: Proceedings of ICSLP 02, International Speech Communication Association, 2002, pp. 297-300. Conference paper, published paper (peer-reviewed)
Abstract [en]

To build acoustic models for children that can be used in spoken dialogue systems, speech data has to be collected. Commercial recognizers available for Swedish are trained on adult speech, which makes them less suitable for children’s computer-directed speech. This paper describes some experiments with on-the-fly voice transformation of children’s speech. Two transformation methods were tested, one inspired by the Phase Vocoder algorithm and another by the Time-Domain Pitch-Synchronous Overlap-Add (TD-PSOLA) algorithm. The speech signal is transformed before being sent to the speech recognizer for adult speech. Our results show that this method reduces the error rates on the order of thirty to forty-five percent for child users.

Place, publisher, year, edition, pages
International Speech Communication Association, 2002. pp. 297-300
Identifiers
URN: urn:nbn:se:kth:diva-13339
Scopus ID: 2-s2.0-56149113752
OAI: oai:DiVA.org:kth-13339
DiVA, id: diva2:323753
Conference
7th International Conference on Spoken Language Processing (ICSLP2002 - INTERSPEECH 2002), Denver, Colorado, USA, September 16-20, 2002
Note

QC 20100611

Available from: 2010-06-11. Created: 2010-06-11. Last updated: 2022-06-25. Bibliographically approved
In thesis
1. Developing Multimodal Spoken Dialogue Systems: Empirical Studies of Spoken Human–Computer Interaction
2002 (English). Doctoral thesis with articles (Other academic)
Abstract [en]

This thesis presents work done during the last ten years on developing five multimodal spoken dialogue systems, and the empirical user studies that have been conducted with them. The dialogue systems have been multimodal, giving information both verbally with animated talking characters and graphically on maps and in text tables. To be able to study a wider range of user behaviour, each new system has been in a new domain and with a new set of interactional abilities. The five systems presented in this thesis are: the Waxholm system, where users could ask about the boat traffic in the Stockholm archipelago; the Gulan system, where people could retrieve information from the Yellow Pages of Stockholm; the August system, a publicly available system where people could get information about the author Strindberg, KTH and Stockholm; the AdApt system, which allowed users to browse apartments for sale in Stockholm; and the Pixie system, where users could help an animated agent to fix things in a visionary apartment publicly available at the Telecom museum in Stockholm. Some of the dialogue systems have been used in controlled experiments in laboratory environments, while others have been placed in public environments where members of the general public have interacted with them. All spoken human-computer interactions have been transcribed and analyzed to increase our understanding of how people interact verbally with computers, and to obtain knowledge on how spoken dialogue systems can utilize the regularities found in these interactions. This thesis summarizes the experiences from building these five dialogue systems and presents some of the findings from the analyses of the collected dialogue corpora.

Place, publisher, year, edition, pages
Stockholm: KTH, 2002. pp. x, 96
Series
Trita-TMH ; 2002:8
Keywords
Spoken dialogue system, multimodal, speech, GUI, animated agents, embodied conversational characters, talking heads, empirical user studies, speech corpora, system evaluation, system development, Wizard of Oz simulations, system architecture, linguis
Identifiers
urn:nbn:se:kth:diva-3460 (URN)
Public defence
2002-12-20, 00:00
Note
QC 20100611
Available from: 2002-12-11. Created: 2002-12-11. Last updated: 2022-06-22. Bibliographically approved

Open Access in DiVA

fulltext (338 kB), 288 downloads
File information
File: FULLTEXT01.pdf. File size: 338 kB. Checksum (SHA-512):
bc30896b85e671517469242b245686c51dd91cf5ccf6fe5f668bdf925672e856f0072c65d6c2933efaa330da63355543951fffe1e25426d87f7bdbf9902846ad
Type: fulltext. Mimetype: application/pdf

Other links

Scopus
https://www.isca-speech.org/archive/archive_papers/icslp_2002/i02_0297.pdf

Author/editor

Gustafson, Joakim; Sjölander, Kåre

Total: 290 downloads
The number of downloads is the sum of all downloads of all full texts. It may include, for example, earlier versions that are no longer available.

Total: 525 hits