SawtArabi: A Benchmark Corpus for Arabic TTS. Standard, Dialectal and Code-Switching
Humain, Saudi Arabia.
Saudi Data AI Authority, Saudi Arabia.
Humain, Saudi Arabia.
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH. ORCID iD: 0000-0002-1886-681X
2025 (English). In: Interspeech 2025, International Speech Communication Association, 2025, p. 4793-4797
Conference paper, Published paper (Refereed)
Abstract [en]

Curating Text-to-Speech (TTS) datasets is a strenuous task given the quality considerations. While it is hard to find high-quality TTS datasets in languages other than English, it is rare to come across code-switching (CS) datasets. As a part of this work, we curate a 4-hour Arabic-English TTS corpus consisting of code-switched Egyptian-English, monolingual Modern Standard Arabic (MSA), Egyptian, and English, all recorded by the same voice talent. We demonstrate the importance of vowelization and the need for better phonemization of Arabic text. To this effect, we present the modified espeak-ng phonemizer that handles various irregularities of espeak-ng over Arabic text. Upon training baseline TTS systems over this benchmark, we demonstrate its efficacy through extensive subjective evaluations.

Place, publisher, year, edition, pages
International Speech Communication Association, 2025. p. 4793-4797
Keywords [en]
Code-switching, Dialectal Speech, Multilingual, Phonemization, Text-to-Speech Synthesis
National Category
Natural Language Processing; Studies of Specific Languages; Comparative Language Studies and Linguistics
Identifiers
URN: urn:nbn:se:kth:diva-372803
DOI: 10.21437/Interspeech.2025-2573
Scopus ID: 2-s2.0-105020056289
OAI: oai:DiVA.org:kth-372803
DiVA, id: diva2:2013492
Conference
26th Interspeech Conference 2025, Rotterdam, Netherlands, August 17-21, 2025
Note

QC 20251113

Available from: 2025-11-13 Created: 2025-11-13 Last updated: 2025-11-13 Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records

Mehta, Shivam; Henter, Gustav Eje
