Estimation of vocal duration in monaural mixtures
2014 (English). In: Proceedings - 40th International Computer Music Conference, ICMC 2014 and 11th Sound and Music Computing Conference, SMC 2014 - Music Technology Meets Philosophy: From Digital Echos to Virtual Ethos, National and Kapodistrian University of Athens, 2014, p. 1172-1177. Conference paper (Refereed)
In this study, the task of vocal duration estimation in monaural music mixtures is explored. We show how presently available algorithms for source separation and predominant f0 estimation can be used as a front end from which features can be extracted. A large set of features is presented, devised to connect different vocal cues to the presence of vocals. Two main cues are utilized: the voice is stable neither in pitch nor in timbre. We evaluate the performance of the model by estimating the length of the vocal regions of the mixtures. To facilitate this, a new set of annotations to a widely adopted data set is developed and made available to the community. The proposed model is able to explain about 78% of the variance in vocal region length. In a classification task, where the excerpts are classified as either vocal or non-vocal, the model has an accuracy of about 0.94.
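The two figures reported in the abstract (explained variance and classification accuracy) can be illustrated with a minimal Python sketch. It assumes that "explained variance" means the coefficient of determination, R² = 1 - SS_res / SS_tot; the abstract does not say which definition the authors used, and a squared correlation coefficient is an equally plausible reading. The function names and all input values are hypothetical and are not taken from the paper.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumed definition; the abstract does not state how explained
    variance was computed.
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def accuracy(labels_true, labels_pred):
    """Fraction of excerpts whose predicted label matches the annotation."""
    hits = sum(1 for t, p in zip(labels_true, labels_pred) if t == p)
    return hits / len(labels_true)


if __name__ == "__main__":
    # Hypothetical annotated vs. estimated vocal durations, in seconds.
    annotated = [0.0, 12.5, 30.0, 8.0]
    estimated = [1.0, 10.0, 28.0, 9.5]
    print("R^2:", r_squared(annotated, estimated))

    # Hypothetical vocal (1) / non-vocal (0) labels per excerpt.
    print("accuracy:", accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
```

Under this reading, a value of about 0.78 from `r_squared` would correspond to the reported 78%, and a value of about 0.94 from `accuracy` to the reported classification result.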
Place, publisher, year, edition, pages
National and Kapodistrian University of Athens, 2014. p. 1172-1177
Algorithms, Computer music, Mixtures, Classification tasks, Data set, F0 estimations, Front end
Other Engineering and Technologies
Identifiers
URN: urn:nbn:se:kth:diva-157969
ScopusID: 2-s2.0-84908895278
ISBN: 978-960466137-4
OAI: oai:DiVA.org:kth-157969
DiVA: diva2:774125
40th International Computer Music Conference, ICMC 2014, Joint with the 11th Sound and Music Computing Conference, SMC 2014 - Music Technology Meets Philosophy: From Digital Echos to Virtual Ethos, 14 September 2014 through 20 September 2014, Athens, Greece
QC 20141222
2014-12-22, 2014-12-18, 2014-12-22. Bibliographically approved