Expanded examinations of a low frequency modulation feature for speech/music discrimination
KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH.
2002 (English)
In: PROC OF ICSLP2002, 2002
Conference paper, Published paper (Refereed)
Abstract [en]

A low frequency modulation feature, LFMAD, was examined under several conditions with regard to its robustness in speech/music discrimination. The feature was tested on low-frequency (LF) components from 2 Hz to 27 Hz and with different analysis window sizes. It performs best when the analysis window contains only one period of the LF component being used. When the music contained a large proportion of vocals, the error rate in the speech/music discrimination task increased compared with purely instrumental music. This effect was found for LFMAD as well as for the MFCC feature, which was used for comparison. Tests were also carried out with signals in additive noise at SNRs from 30 dB down to 0 dB. LFMAD performed better than MFCC in these tests. The error rate was higher for speech signals. There was a bias towards classifying data as music when the test conditions diverged from the training conditions. This effect is less pronounced for LFMAD than for MFCC. The best results in this study were obtained by combining LFMAD and MFCC into a mixed feature. This mixed feature appears to be more robust for speech/music discrimination and could be recommended when scanning databases of unknown quality for speech events.
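
Two of the test conditions in the abstract can be illustrated with a minimal Python sketch: an analysis window spanning one period of an LF component, and a signal mixed with additive noise at a chosen SNR. This is not the LFMAD computation, which the abstract does not define. The function names, the assumed envelope frame rate, and the choice of white Gaussian noise are all assumptions made for illustration only.

```python
import math
import random


def one_period_window(frame_rate_hz, lf_hz):
    """Samples spanning one period of an LF component.

    frame_rate_hz is a hypothetical rate at which the analysed
    envelope is sampled; the abstract does not state one.
    """
    if frame_rate_hz <= 0 or lf_hz <= 0:
        raise ValueError("frame rate and frequency must be positive")
    return round(frame_rate_hz / lf_hz)


def mean_power(samples):
    """Average squared amplitude of a non-empty sequence."""
    return sum(v * v for v in samples) / len(samples)


def add_noise_at_snr(signal, snr_db, rng=None):
    """Return signal plus white Gaussian noise scaled to the given SNR in dB.

    The noise type is an assumption; the abstract only says "additive noise".
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is repeatable
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    # SNR_dB = 10 * log10(P_signal / P_noise), solved for the noise power.
    target_noise_power = mean_power(signal) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / mean_power(noise))
    return [s + gain * n for s, n in zip(signal, noise)]


if __name__ == "__main__":
    # Window lengths at the two ends of the 2 Hz to 27 Hz range,
    # for an assumed 100 Hz envelope frame rate.
    print(one_period_window(100, 2), one_period_window(100, 27))
    tone = [math.sin(2 * math.pi * 4 * t / 100) for t in range(1000)]
    for snr_db in (30, 20, 10, 0):
        noisy = add_noise_at_snr(tone, snr_db)
        print(snr_db, round(mean_power(noisy), 3))
```

Scaling the noise rather than the signal keeps the clean signal level fixed across SNR conditions, so only the noise floor changes between test runs.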

 

Place, publisher, year, edition, pages
2002.
Identifiers
URN: urn:nbn:se:kth:diva-8803
OAI: oai:DiVA.org:kth-8803
DiVA: diva2:14238
Conference
ICSLP2002
Note
QC 20111007
Available from: 2005-11-23 Created: 2005-11-23 Last updated: 2011-10-07
Bibliographically approved
In thesis
1. Spectro-temporal properties of the acoustic speech signal used for speech/music discrimination
2004 (English)
Licentiate thesis, comprehensive summary (Other scientific)
Place, publisher, year, edition, pages
Stockholm: Tal musik och hörsel, 2004
Series
Trita-TMH, ISSN 1104-5787 ; 2004:5
National Category
Musicology
Identifiers
URN: urn:nbn:se:kth:diva-501
ISBN: 91-7283-827-8
Presentation
Fantum, Inst. för tal, musik och hörsel, KTH, Lindstedtsvägen 24, Stockholm
Available from: 2005-11-23 Created: 2005-11-23 Last updated: 2012-03-20

By author/editor
Karnebäck, Stefan
By organisation
Speech, Music and Hearing, TMH