Towards a Swedish test set for speech-oriented text normalisation
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH. ORCID iD: 0000-0002-9659-1532
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH. ORCID iD: 0000-0001-9327-9482
2022 (English). Conference paper, Published paper (Other academic)
Abstract [en]

Text-to-speech synthesis (TTS) can be split into two steps: the preprocessor, which takes input text, including its encoding and formatting, and turns it into a representation that is accepted by the synthesizer, which in turn converts this representation into an acoustic waveform representing speech. TTS is commonly evaluated in terms of how intelligible or humanlike the speech is, and different synthesizers working on the same input representation are regularly compared, whereas preprocessing is habitually ignored in TTS evaluation. Were we to evaluate preprocessing, we could evaluate it as a whole (e.g. compare its output for some input representation to a target phonemic representation) or as individual processes such as sentence detection, tokenisation, text normalisation (TN) and pronunciation generation.

This paper focuses on the evaluation of speech-oriented text normalisation (STN), that is, the conversion of the input text into an expanded string of the words to be spoken, for example expansions of abbreviations and different types of numerals. It is a request for comments for the creation of a test set for the evaluation of Swedish STN, which can be used as a baseline for future STN models, and as part of the overall evaluation of Swedish speech-oriented preprocessing.

Place, publisher, year, edition, pages
Göteborg: Göteborgs universitet, 2022.
Keywords [en]
speech-oriented text processing, test set
National Category
Natural Language Processing
Research subject
Speech and Music Communication
Identifiers
URN: urn:nbn:se:kth:diva-323669
OAI: oai:DiVA.org:kth-323669
DiVA, id: diva2:1735465
Conference
Swedish Language Technology Conference (SLTC), November 18-20 2020, Göteborg
Funder
Vinnova, 2018-02427
Note

QC 20230215

Available from: 2023-02-08. Created: 2023-02-08. Last updated: 2025-02-07. Bibliographically approved

Open Access in DiVA

fulltext (318 kB), 76 downloads
File information
File name: FULLTEXT01.pdf
File size: 318 kB
Checksum (SHA-512): da512dec938e62d054cc3df31a392d886e2019bdfc7b2350fd1bce38a676c367d3d5feb5d27a454ba14676bad570f76a512c56eb2e7fea1d6d9155a14df3aa13
Type: fulltext
Mimetype: application/pdf


Authority records

Tånnander, Christina; Edlund, Jens
