How to annotate 100 hours in 45 minutes
Fallgren, Per. KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH. ORCID iD: 0000-0003-1262-4876
Malisz, Zofia. KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH. ORCID iD: 0000-0001-5953-7310
Edlund, Jens. KTH, Superseded Departments (pre-2005), Speech, Music and Hearing. ORCID iD: 0000-0001-9327-9482
2019 (English). In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. ISCA, 2019, p. 341-345. Conference paper, published paper (refereed)
Abstract [en]

Speech data found in the wild hold many advantages over artificially constructed speech corpora in terms of ecological validity and cultural worth. Perhaps most importantly, there is a lot of it. However, the combination of great quantity, noisiness and variation poses a challenge for its access and processing. Generally speaking, automatic approaches to tackle the problem require good labels for training, while manual approaches require time. In this study, we provide further evidence for a semi-supervised, human-in-the-loop framework that previously has shown promising results for browsing and annotating large quantities of found audio data quickly. The findings of this study show that a 100-hour long subset of the Fearless Steps corpus can be annotated for speech activity in less than 45 minutes, a fraction of the time it would take traditional annotation methods, without a loss in performance.
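To make the annotation target concrete, the sketch below shows frame-level speech-activity labelling with a naive energy threshold. This is purely an illustration of what a speech-activity label is, not the semi-supervised, human-in-the-loop framework the paper evaluates; the function name and the parameters (frame_len, threshold) are assumptions for the example only.

```python
import numpy as np

def label_speech_activity(signal, frame_len=400, threshold=0.01):
    """Label each fixed-length frame as speech (True) or non-speech (False)
    by comparing its mean energy to a threshold.
    Naive baseline for illustration only, not the paper's method."""
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    energy = np.mean(frames ** 2, axis=1)
    return energy > threshold

# Synthetic example: 1 s of near-silence followed by 1 s of loud noise
# standing in for speech (16 kHz sample rate, so 40 frames per second).
rng = np.random.default_rng(0)
sr = 16000
silence = 0.001 * rng.standard_normal(sr)
speech = 0.3 * rng.standard_normal(sr)
labels = label_speech_activity(np.concatenate([silence, speech]))
```

A fixed threshold like this breaks down exactly on the noisy, highly variable found data the abstract describes, which is part of why human-in-the-loop approaches are attractive for such corpora.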

Place, publisher, year, edition, pages
ISCA, 2019, p. 341-345
National Category
Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:kth:diva-268304
DOI: 10.21437/Interspeech.2019-1648
Scopus ID: 2-s2.0-85074718085
OAI: oai:DiVA.org:kth-268304
DiVA id: diva2:1413672
Conference
Interspeech 2019, 15-19 September 2019, Graz
Note

QC 20200310

Available from: 2020-03-10. Created: 2020-03-10. Last updated: 2020-05-18. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text: https://www.isca-speech.org/archive/Interspeech_2019/abstracts/1648.html
Scopus

Authority records

Fallgren, Per; Malisz, Zofia; Edlund, Jens
