Latent-based Neural Net for Non-intrusive Speech Quality Assessment
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL.
Google LLC, Stockholm, Sweden.
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Information Science and Engineering. Digital Futures, Stockholm, Sweden. ORCID iD: 0000-0003-2638-6047
2023 (English)
In: 31st European Signal Processing Conference, EUSIPCO 2023 - Proceedings, European Signal Processing Conference, EUSIPCO, 2023, p. 226-230
Conference paper, Published paper (Refereed)
Abstract [en]

For non-intrusive speech quality assessment, we treat the mean-opinion-score (MOS) of a speech signal as a latent, and propose a latent MOS network (LaMOSNet) to estimate the MOS. At the time of training, the proposed LaMOSNet has two parts in series, with the first part providing the latent estimate, i.e. the MOS of an input speech signal, and the second part providing an estimated score by a given judge. Only the first part is used for testing. We address two inherent aspects - limited-data and noisy-data aspects - in training using stochastic gradient noise and a student-teacher type of training, motivated by semi-supervised learning. It is shown that LaMOSNet provides good performance on the Voice Conversion Challenge 2018 dataset, and state-of-the-art correlation performance on the Voice Conversion Challenge 2016 dataset.
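
The two-part structure described in the abstract can be illustrated with a toy sketch. This is not the authors' LaMOSNet implementation: the class name `LatentMOSSketch`, the linear first part, the additive per-judge offset in the second part, and the Gaussian form of the gradient noise are all assumptions made for illustration, and the student-teacher training step is not covered.

```python
import random


class LatentMOSSketch:
    """Hypothetical two-part model: part 1 gives a latent MOS, part 2 a judge score."""

    def __init__(self, n_features, n_judges, seed=0):
        rng = random.Random(seed)
        # Part 1 (assumed linear): maps utterance features to the latent MOS.
        self.w = [rng.uniform(-0.1, 0.1) for _ in range(n_features)]
        self.b = 3.0  # start near the middle of a 1..5 MOS scale
        # Part 2 (assumed additive): one offset per judge.
        self.judge_offset = [0.0] * n_judges

    def latent_mos(self, features):
        """Part 1 only: this is what would be used at test time."""
        return self.b + sum(w * x for w, x in zip(self.w, features))

    def judge_score(self, features, judge_id):
        """Part 1 followed by part 2: used only during training."""
        return self.latent_mos(features) + self.judge_offset[judge_id]

    def sgd_step(self, features, judge_id, rating, lr=0.01, noise_std=0.0, rng=None):
        """One squared-error SGD step; optional Gaussian noise is added to each gradient."""
        rng = rng or random
        err = self.judge_score(features, judge_id) - rating

        def noisy(grad):
            return grad + (rng.gauss(0.0, noise_std) if noise_std > 0 else 0.0)

        self.w = [w - lr * noisy(err * x) for w, x in zip(self.w, features)]
        self.b -= lr * noisy(err)
        self.judge_offset[judge_id] -= lr * noisy(err)
        return err * err  # squared error before the update


if __name__ == "__main__":
    model = LatentMOSSketch(n_features=2, n_judges=3)
    for _ in range(200):
        model.sgd_step([1.0, 2.0], judge_id=1, rating=4.0, noise_std=0.01)
    print("latent MOS estimate:", model.latent_mos([1.0, 2.0]))
    print("judge 1 score estimate:", model.judge_score([1.0, 2.0], 1))
```

In this sketch the judge-specific part absorbs a rater's systematic offset during training, so that `latent_mos` alone can be called at test time, mirroring the abstract's statement that only the first part is used for testing.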

Place, publisher, year, edition, pages
European Signal Processing Conference, EUSIPCO, 2023. p. 226-230
National Category
Signal Processing
Identifiers
URN: urn:nbn:se:kth:diva-340802
DOI: 10.23919/EUSIPCO58844.2023.10289840
Scopus ID: 2-s2.0-85178344523
OAI: oai:DiVA.org:kth-340802
DiVA, id: diva2:1819618
Conference
31st European Signal Processing Conference, EUSIPCO 2023, Helsinki, Finland, Sep 4 2023 - Sep 8 2023
Note

Part of ISBN 9789464593600

QC 20231214

Available from: 2023-12-14 Created: 2023-12-14 Last updated: 2023-12-14 Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records

Cumlin, Fredrik
Chatterjee, Saikat
