Syllabification of conversational speech using bidirectional long-short-term memory neural networks
KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication and Technology. ORCID iD: 0000-0001-9327-9482
KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Speech Communication and Technology.
2011 (English). In: Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on, Prague, Czech Republic, 2011, 5256-5259 p. Conference paper, Published paper (Refereed)
Abstract [en]

Segmentation of speech signals is a crucial task in many types of speech analysis. We present a novel approach to segmentation at the syllable level, using a Bidirectional Long-Short-Term Memory Neural Network. The network estimates syllable nucleus positions by regressing perceptually motivated input features onto a smooth target function. Peak selection is then performed to obtain valid nucleus positions. Performance of the model is evaluated at the level of both syllables and the vowel segments making up the syllable nuclei. The general applicability of the approach is illustrated by good results on two common databases (Switchboard and TIMIT), covering both read and spontaneous speech, and by a favourable comparison with other published results.
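
The peak-selection step mentioned in the abstract can be illustrated with a minimal sketch. The record does not describe the target function or the selection rule, so everything here is an assumption made for illustration: the Gaussian-shaped target, the function names `smooth_target` and `pick_peaks`, and the `threshold` and `min_distance` parameters are hypothetical and not taken from the paper. The network itself is not modelled; the sketch only shows how a smooth per-frame curve could be turned into discrete nucleus positions.

```python
import math


def smooth_target(nucleus_frames, n_frames, sigma=2.0):
    """Build an illustrative smooth target curve (an assumption, not the
    paper's definition): each frame takes the value of the nearest
    Gaussian bump centred on an annotated nucleus frame."""
    if not nucleus_frames:
        return [0.0] * n_frames
    return [
        max(math.exp(-0.5 * ((t - c) / sigma) ** 2) for c in nucleus_frames)
        for t in range(n_frames)
    ]


def pick_peaks(curve, threshold=0.5, min_distance=3):
    """Return indices of local maxima that reach `threshold` and lie at
    least `min_distance` frames apart (hypothetical selection rule)."""
    candidates = [
        t for t in range(1, len(curve) - 1)
        if curve[t] >= threshold
        and curve[t] > curve[t - 1]
        and curve[t] >= curve[t + 1]
    ]
    selected = []
    # Greedy suppression: accept the highest peaks first and drop any
    # candidate that falls too close to an already accepted peak.
    for t in sorted(candidates, key=lambda i: curve[i], reverse=True):
        if all(abs(t - s) >= min_distance for s in selected):
            selected.append(t)
    return sorted(selected)


if __name__ == "__main__":
    target = smooth_target([5, 15], n_frames=25)
    print(pick_peaks(target))
```

In practice the curve passed to `pick_peaks` would be the network's regression output rather than the clean target. Library routines such as SciPy's `scipy.signal.find_peaks`, with its `height` and `distance` arguments, cover similar ground.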

Place, publisher, year, edition, pages
Prague, Czech Republic, 2011. 5256-5259 p.
Series
International Conference on Acoustics Speech and Signal Processing ICASSP, ISSN 1520-6149
National Category
Computer Science; Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:kth:diva-52175
DOI: 10.1109/ICASSP.2011.5947543
ISI: 000296062405216
Scopus ID: 2-s2.0-80051628297
OAI: oai:DiVA.org:kth-52175
DiVA: diva2:465470
Conference
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Note
tmh_import_11_12_14. QC 20111228
Available from: 2011-12-14 Created: 2011-12-14 Last updated: 2011-12-28 Bibliographically approved

By author/editor
Edlund, Jens; Neiberg, Daniel
By organisation
Speech Communication and Technology
Computer Science; Language Technology (Computational Linguistics)
