Detecting a targeted voice style in an audiobook using voice quality features
2012 (English). In: Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on, 2012, p. 4593-4596. Conference paper, Published paper (Refereed)
Resource type
Text
Abstract [en]

Audiobooks are known to contain a variety of expressive speaking styles that arise when the narrator mimics a character in a story or expresses affect. Accurate modeling of this variety is essential for speech synthesis from an audiobook. Voice quality differences are important features characterizing these different speaking styles, which are realized on a gradient and are often difficult to predict from the text. The present study uses a parameter characterizing breathy to tense voice qualities using features of the wavelet transform, and a measure for identifying creaky segments in an utterance. Based on these features, a combination of supervised and unsupervised classification is used to detect the regions in an audiobook where the speaker changes his regular voice quality to a particular voice style. The target voice style candidates are selected based on the agreement of the supervised classifier ensemble output and are evaluated in a listening test.

Place, publisher, year, edition, pages
2012. 4593-4596 p.
National Category
Engineering and Technology
Identifiers
URN: urn:nbn:se:kth:diva-185520
DOI: 10.1109/ICASSP.2012.6288941
ISI: 000312381404166
Scopus ID: 2-s2.0-84867584684
ISBN: 978-1-4673-0046-9 (print)
OAI: oai:DiVA.org:kth-185520
DiVA: diva2:922794
Conference
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), March 25-30, 2012.
Note

QC 20160426

Available from: 2016-04-25. Created: 2016-04-21. Last updated: 2016-06-02. Bibliographically approved.

By author/editor
Székely, Éva