Continued finetuning as single speaker adaptation
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH.
2022 (English) In: TMH QPSR, Stockholm, 2022, Vol. 3
Conference paper, Oral presentation with published abstract (Other academic)
Abstract [en]

The adaptation of unsupervised learning techniques to speech recognition has enabled the training of accurate models with less labelled training data, by fine-tuning a supervised classifier on top of a network pretrained using self-supervised methods. In this paper, we investigate whether continuing the fine-tuning of such a model is suitable as a method of speaker adaptation for a single speaker, considering two kinds of user: the casual user, with data measurable in minutes, and the professional user, with data measurable in hours. We conduct experiments across a range of dataset sizes, in an attempt to provide a basis for estimating how much data would be needed.

Place, publisher, year, edition, pages
Stockholm, 2022. Vol. 3
Keywords [en]
speaker adaptation, finetuning, automatic speech recognition
National Category
Other Electrical Engineering, Electronic Engineering, Information Engineering
Research subject
Speech and Music Communication
Identifiers
URN: urn:nbn:se:kth:diva-314269
OAI: oai:DiVA.org:kth-314269
DiVA, id: diva2:1671614
Conference
Fonetik 2022 - the XXXIIIrd Swedish Phonetics Conference
Note

QC 20220812

Available from: 2022-06-17 Created: 2022-06-17 Last updated: 2022-08-12 Bibliographically approved

Open Access in DiVA

fulltext (309 kB), 135 downloads
File information
File name: FULLTEXT01.pdf
File size: 309 kB
Checksum (SHA-512): caf196403b4f097d9480e26b06aafee126ec009982fa81f6ebca665a4ff67cf82bc5261ab7651bce8343f8b02d49fc3c7df29f3b56b561ed8677b46379616e08
Type: fulltext
Mimetype: application/pdf

Authority records

O'Regan, Jim
