Prediction of three articulatory categories in vocal sound imitations using models for auditory receptive fields
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent Systems, Speech, Music and Hearing, TMH. ORCID iD: 0000-0003-2926-6518
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST). ORCID iD: 0000-0002-9081-2170
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent Systems, Speech, Music and Hearing, TMH.
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent Systems, Speech, Music and Hearing, TMH. ORCID iD: 0000-0002-3511-023X
2018 (English). In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 144, no. 3, pp. 1467-1483. Article in journal (Refereed). Published
Abstract [en]

Vocal sound imitations provide a new challenge for understanding the coupling between articulatory mechanisms and the resulting audio. In this study, we have modeled the classification of three articulatory categories (phonation, supraglottal myoelastic vibrations, and turbulence) from audio recordings. Two data sets were assembled, consisting of different vocal imitations by four professional imitators and four non-professional speakers in two different experiments. The audio data were manually annotated by two experienced phoneticians using a detailed articulatory description scheme. A separate set of audio features was developed specifically for each category using both time-domain and spectral methods. For all time-frequency transformations, and for some secondary processing, the recently developed Auditory Receptive Fields Toolbox was used. Three different machine learning methods were applied for predicting the final articulatory categories. The result with the best generalization was found using an ensemble of multilayer perceptrons. The cross-validated classification accuracy was 96.8% for phonation, 90.8% for supraglottal myoelastic vibrations, and 89.0% for turbulence using all 84 developed features. A final feature reduction to 22 features yielded similar results.
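
To illustrate the kind of evaluation the abstract describes, the following is a minimal sketch of an ensemble of small multilayer perceptrons scored with k-fold cross-validation. It is not the authors' implementation: the synthetic two-feature data, the class name `TinyMLP`, the helper names, the network size, and all hyperparameters are assumptions chosen only to keep the example short and dependency-free.

```python
import math
import random


def make_data(n, rng):
    """Synthetic two-feature samples with a binary label.

    Hypothetical stand-in for 'category present / absent'; not the paper's data.
    """
    data = []
    for i in range(n):
        y = i % 2
        center = 1.0 if y else -1.0
        data.append(([rng.gauss(center, 0.6), rng.gauss(center, 0.6)], y))
    rng.shuffle(data)
    return data


class TinyMLP:
    """One hidden tanh layer and a sigmoid output, trained with plain SGD."""

    def __init__(self, n_in, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        z = sum(w * hi for w, hi in zip(self.w2, h)) + self.b2
        return h, 1.0 / (1.0 + math.exp(-z))

    def fit(self, data, epochs=30, lr=0.1):
        for _ in range(epochs):
            for x, y in data:
                h, p = self.forward(x)
                d_out = p - y  # cross-entropy gradient at the output pre-activation
                for j, hj in enumerate(h):
                    # Backpropagate through tanh before updating w2[j].
                    d_hid = d_out * self.w2[j] * (1.0 - hj * hj)
                    self.w2[j] -= lr * d_out * hj
                    self.b1[j] -= lr * d_hid
                    for i, xi in enumerate(x):
                        self.w1[j][i] -= lr * d_hid * xi
                self.b2 -= lr * d_out

    def predict_proba(self, x):
        return self.forward(x)[1]


def ensemble_predict(models, x):
    """Average the member probabilities, then threshold at 0.5."""
    p = sum(m.predict_proba(x) for m in models) / len(models)
    return 1 if p >= 0.5 else 0


def cross_validated_accuracy(data, k=5, n_models=5, seed=0):
    """k-fold cross-validation; each fold trains a fresh ensemble."""
    correct = 0
    for fold in range(k):
        test = data[fold::k]
        train = [d for i, d in enumerate(data) if i % k != fold]
        models = []
        for m in range(n_models):
            # Members differ only in their random weight initialisation.
            net = TinyMLP(2, 4, random.Random(seed + 100 * fold + m))
            net.fit(train)
            models.append(net)
        correct += sum(ensemble_predict(models, x) == y for x, y in test)
    return correct / len(data)


rng = random.Random(0)
data = make_data(200, rng)
acc = cross_validated_accuracy(data)
print(f"cross-validated accuracy: {acc:.3f}")
```

Averaging the members' output probabilities is one common way to combine an ensemble; the record does not state which combination rule the study used.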

Place, publisher, year, edition, pages
Acoustical Society of America (ASA), 2018. Vol. 144, no. 3, pp. 1467-1483
Keywords [en]
vocal articulation, sound imitations, signal processing, auditory receptive fields, turbulence, phonation, supraglottal myoelastic vibration, partial least-square regression, support vector classification, ensemble learning
National category
Research subject
Speech and Music Communication
Identifiers
URN: urn:nbn:se:kth:diva-234295
DOI: 10.1121/1.5052438
ISI: 000457802200049
PubMed ID: 30424637
Scopus ID: 2-s2.0-85053873907
OAI: oai:DiVA.org:kth-234295
DiVA, id: diva2:1245861
Funder
EU, FP7, Seventh Framework Programme, 618067
Note

QC 20181003

Available from: 2018-09-06. Created: 2018-09-06. Last updated: 2023-12-05. Bibliographically approved.

Open Access in DiVA

fulltext (3327 kB), 304 downloads
File information
File: FULLTEXT01.pdf
File size: 3327 kB
Checksum (SHA-512):
1a723ef979175310f3697790e5cd2507499aae4dd63b790c0cdb90f81fd774fe2b3d6bda464b1e9933f1c0fb22ec0a4dc44f7c2c7e4839755595c0d808bb79c7
Type: fulltext
Mimetype: application/pdf

Other links

Publisher's full text, PubMed, Scopus. https://doi.org/10.1121/1.5052438

Person

Friberg, Anders; Lindeberg, Tony; Hellwagner, Martin; Helgason, Pétur; Salomão, Gláucia Laís; Elowsson, Anders; Ternström, Sten

Search in DiVA

By author/editor
Friberg, Anders; Lindeberg, Tony; Hellwagner, Martin; Helgason, Pétur; Salomão, Gláucia Laís; Elowsson, Anders; Lemaitre, Guillaume; Ternström, Sten
Total: 304 downloads
The number of downloads is the sum of all downloads of all full texts. It may include, for example, earlier versions that are no longer available.

Total: 845 hits