Word Discovery with Beta Process Factor Analysis
2012 (English). In: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012, Vol 1, 2012, 798-801 p. Conference paper (Refereed)
We propose applying a recently developed non-parametric Bayesian method for factor analysis to the problem of word discovery from continuous speech. The method, based on Beta Process priors, has several advantages over previously proposed methods such as Non-negative Matrix Factorisation (NMF). In particular, Beta Process Factor Analysis (BPFA) is able to estimate the size of the basis, and therefore the number of recurring patterns, or word candidates, found in the data. We compare the results obtained with BPFA and NMF on the TIDigits database, showing that our method finds not only the correct words but also the correct number of words. By testing on randomly generated sequences of words, we also show that the method can infer the approximate number of words for different vocabulary sizes.
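The idea that a Beta Process prior lets the model settle on a basis size can be illustrated with a minimal sketch. It assumes the commonly used finite beta-Bernoulli approximation, in which each of K candidate factors has a usage probability pi_k ~ Beta(a/K, b(K-1)/K) and each data item switches factor k on with that probability. The function name, the hyperparameters a and b, and the truncation level K are illustrative assumptions, not values from the paper, and the code only samples from the prior; it does not perform the posterior inference used in BPFA.

```python
import numpy as np

def sample_active_factors(n_items, K, a=1.0, b=1.0, seed=0):
    """Draw binary factor usage Z from a finite beta-Bernoulli prior.

    Hypothetical helper for illustration; a, b and K are assumed values.
    """
    rng = np.random.default_rng(seed)
    # pi_k ~ Beta(a/K, b(K-1)/K): most usage probabilities end up near zero.
    pi = rng.beta(a / K, b * (K - 1) / K, size=K)
    # z_ik ~ Bernoulli(pi_k): whether item i uses factor k.
    Z = rng.random((n_items, K)) < pi
    # A factor counts as active if at least one item uses it.
    n_active = int(Z.any(axis=0).sum())
    return Z, n_active

Z, n_active = sample_active_factors(n_items=200, K=100)
print(f"{n_active} of {Z.shape[1]} candidate factors are used")
```

Under this prior only a small subset of the K candidate factors is typically switched on, which is what allows the number of word candidates to be read off from the data rather than fixed in advance, as the rank must be in NMF.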
Place, publisher, year, edition, pages
2012. 798-801 p.
Keywords: word discovery, beta process factor analysis, Bayesian nonparametric method, non-negative matrix factorisation
Computer Science; Language Technology (Computational Linguistics)
Identifiers: URN: urn:nbn:se:kth:diva-109367; ISI: 000320827200200; ScopusID: 2-s2.0-84878394711; ISBN: 978-1-62276-759-5; OAI: oai:DiVA.org:kth-109367; DiVA: diva2:581750
13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012; Portland, OR; United States; 9 September 2012 through 13 September 2012
QC 20130823. 2013-01-02; 2013-01-02; 2013-08-23. Bibliographically approved