Properties of Datasets Predict the Performance of Classifiers
2013 (English). In: BMVC 2013 - Electronic Proceedings of the British Machine Vision Conference 2013, British Machine Vision Association, BMVA, 2013. Conference paper (Refereed)
It has been shown that the performance of classifiers depends not only on the number of training samples, but also on the quality of the training set [10, 12]. The purpose of this paper is to 1) provide quantitative measures that determine the quality of the training set and 2) establish the relation between test performance and the proposed measures. The measures are derived from pairwise affinities between training exemplars of the positive class and are generative in nature. We show that the test-set performance of state-of-the-art methods can be reasonably predicted from the values of the proposed measures on the training set. These measures open up a wide range of applications for the recognition community, enabling us to analyze the behavior of learning algorithms with respect to the properties of the training data. This will in turn enable us to devise rules for the automatic selection of training data that maximize the quantified quality of the training set and thereby improve recognition performance.
Place, publisher, year, edition, pages
British Machine Vision Association, BMVA, 2013.
Computer vision, Automatic selection, Performance of classifier, Quantitative measures, State-of-the-art methods, Test performance, Training data, Training sample, Training sets
Computer Vision and Robotics (Autonomous Systems)
Identifiers
URN: urn:nbn:se:kth:diva-129602
DOI: 10.5244/C.27.44
ISI: 000346352700041
ScopusID: 2-s2.0-84898453388
OAI: oai:DiVA.org:kth-129602
DiVA: diva2:653097
2013 24th British Machine Vision Conference, BMVC 2013; Bristol; United Kingdom; 9 September 2013 through 13 September 2013
QC 20140602; 2013-10-02; 2013-10-02; 2015-10-06; Bibliographically approved