Covariance structure approximation via gLasso in high-dimensional supervised classification
2012 (English). In: Journal of Applied Statistics, ISSN 0266-4763, E-ISSN 1360-0532, Vol. 39, no. 8, pp. 1643-1666. Article in journal (Refereed). Published.
Recent work has shown that Lasso-based regularization is very useful for estimating a high-dimensional inverse covariance matrix. A particularly useful scheme penalizes the l1 norm of the off-diagonal elements to encourage sparsity. We embed this type of regularization into high-dimensional classification. A two-stage estimation procedure is proposed which first recovers structural zeros of the inverse covariance matrix and then enforces block sparsity by moving non-zeros closer to the main diagonal. We show that the block-diagonal approximation of the inverse covariance matrix leads to an additive classifier, and demonstrate that accounting for the structure can yield better classification accuracy. The effect of the block size on classification is explored, and a class of asymptotically equivalent structure approximations in a high-dimensional setting is specified. We suggest variable selection at the block level and investigate properties of this procedure in growing-dimension asymptotics. We present a consistency result for the feature selection procedure, establish asymptotic lower and upper bounds for the fraction of separative blocks, and specify constraints under which reliable classification with block-wise feature selection can be performed. The relevance and benefits of the proposed approach are illustrated on both simulated and real data.
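The claim that a block-diagonal inverse covariance matrix leads to an additive classifier can be checked numerically. The sketch below is not the paper's procedure: it assumes a two-class linear discriminant with equal priors and a shared precision matrix, and the block layout, means, and helper names (`lda_score`, `blockwise_scores`) are hypothetical choices made only for illustration. It shows that the full discriminant score equals the sum of the scores computed separately on each block.

```python
import numpy as np

def lda_score(x, mu0, mu1, omega):
    """Linear discriminant score for two Gaussian classes that share the
    precision matrix omega (equal priors assumed); positive favours class 1."""
    return (x - 0.5 * (mu0 + mu1)) @ omega @ (mu1 - mu0)

def blockwise_scores(x, mu0, mu1, omega, blocks):
    """Score each block separately, using only its own diagonal block of omega."""
    return [lda_score(x[b], mu0[b], mu1[b], omega[np.ix_(b, b)]) for b in blocks]

rng = np.random.default_rng(0)

# Hypothetical block layout: 9 variables split into blocks of size 3, 2 and 4.
blocks = [np.arange(0, 3), np.arange(3, 5), np.arange(5, 9)]
p = 9

# Build a block-diagonal, positive-definite precision matrix.
omega = np.zeros((p, p))
for b in blocks:
    a = rng.normal(size=(len(b), len(b)))
    omega[np.ix_(b, b)] = a @ a.T + len(b) * np.eye(len(b))

# Illustrative class means and a single observation.
mu0 = np.zeros(p)
mu1 = rng.normal(size=p)
x = rng.normal(size=p)

full = lda_score(x, mu0, mu1, omega)
parts = blockwise_scores(x, mu0, mu1, omega, blocks)

# Additivity: the full score is the sum of the per-block scores.
print(np.isclose(full, sum(parts)))
```

Because each block contributes its own term to the score, blocks can be kept or dropped individually, which is the setting in which block-level variable selection makes sense.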
Keywords: high dimensionality; classification accuracy; sparsity; block-diagonal covariance structure; graphical Lasso; separation strength
Probability Theory and Statistics
Identifiers: URN: urn:nbn:se:kth:diva-75454; DOI: 10.1080/02664763.2012.663346; ISI: 000305486300002; ScopusID: 2-s2.0-84862594316; OAI: oai:DiVA.org:kth-75454; DiVA: diva2:490500
Funder: Swedish Research Council, 421-2008-1966
QC 20120717. 2012-02-05; 2012-02-05; 2012-07-17. Bibliographically approved.