DNSMOS Pro: A Reduced-Size DNN for Probabilistic MOS of Speech
Codemill AB, Umeå, Sweden.
Codemill AB, Umeå, Sweden.
Google LLC, USA.
Google LLC, USA.
2024 (English). In: Interspeech 2024, International Speech Communication Association, 2024, p. 4818-4822. Conference paper, Published paper (Refereed)
Abstract [en]

We propose a deep neural network-based architecture and training design for objective, non-intrusive speech quality assessment. The method builds on DNSMOS, and we call the resulting model DNSMOS Pro. DNSMOS Pro has a reduced-size architecture suitable for VoIP and a relatively simple training design that uses only the mean opinion score (MOS) as the target label, and it predicts the posterior distribution of MOS given an input speech clip. DNSMOS Pro can therefore be trained even when a subjectively rated dataset reports only the MOS. Furthermore, we implement several non-intrusive speech quality methods and compare them to DNSMOS Pro when training and testing on different subjectively rated datasets. On these benchmark datasets, DNSMOS Pro performs significantly better than similar DNN-based non-intrusive speech quality methods and is competitive with methods that assume auxiliary information in the datasets.
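
To illustrate how a model can be trained with only the MOS as the label while still predicting a distribution, here is a minimal sketch of a maximum-likelihood loss. It assumes a Gaussian posterior parameterised by a predicted mean and log-variance. That parameterisation and the names `gaussian_nll` and `batch_nll` are assumptions made for illustration; they are not taken from this record and may differ from the paper's actual design.

```python
import math


def gaussian_nll(mos: float, mean: float, log_var: float) -> float:
    """Negative log-likelihood of one MOS label under N(mean, exp(log_var)).

    Assumption: the model outputs a mean and a log-variance per clip.
    """
    var = math.exp(log_var)
    return 0.5 * (math.log(2.0 * math.pi) + log_var + (mos - mean) ** 2 / var)


def batch_nll(labels, means, log_vars) -> float:
    """Average negative log-likelihood over a batch of (label, prediction) pairs."""
    total = sum(gaussian_nll(y, m, s) for y, m, s in zip(labels, means, log_vars))
    return total / len(labels)


if __name__ == "__main__":
    # Hypothetical MOS labels and predictions, for illustration only.
    labels = [4.2, 2.1, 3.5]
    means = [4.0, 2.5, 3.5]
    log_vars = [-1.0, 0.0, -2.0]
    print(batch_nll(labels, means, log_vars))
```

Minimising this loss needs only one MOS value per clip. The predicted variance is learned implicitly, because a confident prediction that is wrong is penalised more heavily than an uncertain one.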

Place, publisher, year, edition, pages
International Speech Communication Association, 2024. p. 4818-4822
Keywords [en]
deep neural network, maximum-likelihood, Speech quality assessment, voice conversion challenge
National Category
Computer Sciences; Signal Processing
Identifiers
URN: urn:nbn:se:kth:diva-358881
DOI: 10.21437/Interspeech.2024-478
ISI: 001331850104185
Scopus ID: 2-s2.0-85214800319
OAI: oai:DiVA.org:kth-358881
DiVA, id: diva2:1930534
Conference
25th Interspeech Conference 2024, Kos Island, Greece, September 1-5, 2024
Note

QC 20251021

Available from: 2025-01-23. Created: 2025-01-23. Last updated: 2025-10-21. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Authority records

Chatterjee, Saikat
