Spectro-temporal properties of the acoustic speech signal used for speech/music discrimination
KTH, Superseded Departments, Speech, Music and Hearing.
2004 (English). Licentiate thesis, comprehensive summary (Other scientific)
Place, publisher, year, edition, pages
Stockholm: Tal musik och hörsel, 2004.
Series
Trita-TMH, ISSN 1104-5787 ; 2004:5
National Category
Musicology
Identifiers
URN: urn:nbn:se:kth:diva-501
ISBN: 91-7283-827-8
OAI: oai:DiVA.org:kth-501
DiVA, id: diva2:14240
Presentation
Fantum, Inst. för tal, musik och hörsel, KTH, Lindstedtsvägen 24, Stockholm
Available from: 2005-11-23 Created: 2005-11-23 Last updated: 2012-03-20
List of papers
1. Discrimination between speech and music based on a low frequency modulation feature
2001. In: Proceedings of Eurospeech. Article in journal (Refereed), published
Identifiers
urn:nbn:se:kth:diva-8802 (URN)
Available from: 2005-11-23 Created: 2005-11-23
Bibliographically approved
2. Expanded examinations of a low frequency modulation feature for speech/music discrimination
2002 (English). In: Proceedings of ICSLP 2002, 2002. Conference paper, Published paper (Refereed)
Abstract [en]

A low frequency modulation feature, LFMAD, was examined under several conditions to assess its robustness for speech/music discrimination. The feature was tested on low frequency (LF) components from 2 Hz to 27 Hz and with different analysis window sizes. It performs best when the analysis window contains only one period of the LF component used. When the music contained a large proportion of vocals, the error rate in the speech/music discrimination task increased compared with purely instrumental music. This effect was found for LFMAD as well as for the MFCC feature, which was used for comparison. Tests were also carried out with signals in additive noise at SNRs from 30 dB down to 0 dB. LFMAD performed better than MFCC in these tests. The error rate was higher for speech signals. There was a bias towards classifying data as music when the test conditions diverged from the training conditions. This effect was less pronounced for LFMAD than for MFCC. The best results in this study were obtained by combining the two features, LFMAD and MFCC, into a mixed feature. The mixed feature appears to be more robust for speech/music discrimination and could be recommended when scanning databases of unknown quality for speech events.
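
The two test conditions mentioned in the abstract, an analysis window spanning one period of the LF component and additive noise at a given SNR, can be illustrated with a minimal sketch. It is not the thesis implementation: the 16 kHz sample rate, the white Gaussian noise type, and the function names `one_period_window` and `add_noise_at_snr` are assumptions made for illustration, and the LFMAD feature itself is not reproduced because the record does not define it.

```python
import math
import random


def one_period_window(lf_hz, sample_rate_hz):
    """Samples needed to span exactly one period of an LF component."""
    return round(sample_rate_hz / lf_hz)


def add_noise_at_snr(signal, snr_db, seed=0):
    """Add white Gaussian noise (an assumed noise type) at the requested SNR in dB."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = sum(n * n for n in noise) / len(noise)
    # Scale the noise so that 10*log10(signal_power / scaled_noise_power) == snr_db.
    scale = math.sqrt(signal_power / (noise_power * 10 ** (snr_db / 10)))
    return [x + scale * n for x, n in zip(signal, noise)]


def measured_snr_db(clean, noisy):
    """SNR in dB recovered from a clean signal and its noisy version."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum((y - x) ** 2 for x, y in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


if __name__ == "__main__":
    sample_rate_hz = 16000  # assumed; the record does not state a sample rate
    tone = [math.sin(2 * math.pi * 440 * t / sample_rate_hz) for t in range(sample_rate_hz)]
    for lf_hz in (2, 27):  # the LF range endpoints given in the abstract
        print(lf_hz, "Hz ->", one_period_window(lf_hz, sample_rate_hz), "samples")
    for snr_db in (30, 0):  # the SNR range endpoints given in the abstract
        noisy = add_noise_at_snr(tone, snr_db)
        print(snr_db, "dB requested, measured:", round(measured_snr_db(tone, noisy), 3))
```

Under these assumptions, a 2 Hz component calls for a window of half a second of audio, which shows why the choice of LF component and the analysis window size are coupled.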
Identifiers
urn:nbn:se:kth:diva-8803 (URN)
Conference
ICSLP2002
Note
QC 20111007
Available from: 2005-11-23 Created: 2005-11-23 Last updated: 2011-10-07
Bibliographically approved
3. Speech/music discrimination using discrete hidden Markov models
2004 (English). Report (Other academic)
Series
TMH Quarterly Progress and Status Report ; 46
National Category
Computer Sciences
Language Technology (Computational Linguistics)
Identifiers
urn:nbn:se:kth:diva-8804 (URN)
Note
QC 20111216
Available from: 2005-11-23 Created: 2005-11-23 Last updated: 2018-01-13
Bibliographically approved

Open Access in DiVA

No full text in DiVA

Author
Karnebäck, Stefan