Gene selection in time-series gene expression data
2011 (English). In: 6th IAPR International Conference on Pattern Recognition in Bioinformatics, PRIB 2011, 2011, 145-156 p. Conference paper (Refereed)
The dimensionality of biological data is often very high. Feature selection can be used to tackle the problem of high dimensionality. However, the majority of the work in feature selection consists of supervised methods, which require class labels. The problem escalates further when the data are time-series gene expression measurements that capture the effect of external stimuli on a biological system. In this paper we propose an unsupervised method for gene selection from time-series gene expression data, founded on statistical significance testing and swap randomization. We perform experiments with a publicly available mouse gene expression dataset and with a human gene expression dataset describing exposure to asbestos. The results on both datasets show a considerable decrease in the number of genes.
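As a rough illustration of the two ingredients named in the abstract, the sketch below applies swap randomization to a small 0/1 matrix (rows as genes, columns as time points) and derives an empirical p-value per gene. It is not the paper's procedure: the binary discretization, the `transitions` statistic, the direction of the test, and all function names and parameters are assumptions made only for this example.

```python
import random


def swap_randomize(matrix, n_attempts, rng):
    """Return a shuffled copy of a 0/1 matrix using checkerboard swaps.

    Every successful swap leaves all row sums and column sums unchanged.
    """
    m = [row[:] for row in matrix]
    n_rows, n_cols = len(m), len(m[0])
    for _ in range(n_attempts):
        i, k = rng.randrange(n_rows), rng.randrange(n_rows)
        j, l = rng.randrange(n_cols), rng.randrange(n_cols)
        # Swap only when the four cells form the pattern 1 0 / 0 1.
        if m[i][j] == 1 and m[k][l] == 1 and m[i][l] == 0 and m[k][j] == 0:
            m[i][j], m[k][l] = 0, 0
            m[i][l], m[k][j] = 1, 1
    return m


def transitions(row):
    """Hypothetical statistic: number of changes between adjacent time points."""
    return sum(a != b for a, b in zip(row, row[1:]))


def empirical_p_values(matrix, n_samples=200, n_attempts=1000, seed=0):
    """Per-gene empirical p-values against swap-randomized matrices.

    Assumption: a gene is 'interesting' if it has few transitions, so the
    p-value counts randomized rows with at most as many transitions.
    """
    rng = random.Random(seed)
    observed = [transitions(row) for row in matrix]
    counts = [0] * len(matrix)
    for _ in range(n_samples):
        sample = swap_randomize(matrix, n_attempts, rng)
        for g, row in enumerate(sample):
            if transitions(row) <= observed[g]:
                counts[g] += 1
    # Add-one correction keeps every p-value strictly positive.
    return [(c + 1) / (n_samples + 1) for c in counts]
```

Under these assumptions, genes whose p-value falls below a chosen significance level would be retained and the rest discarded; the statistic and threshold actually used in the paper may differ.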
Place, publisher, year, edition, pages
2011. 145-156 p.
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), ISSN 0302-9743
Feature Selection, Randomization, Statistical Significance, Time-series, Biological data, Class labels, Data sets, Expression measurements, External stimulus, Feature selection methods, Gene selection, High dimensionality, Time-series gene expression data, Unsupervised method, Bioinformatics, Feature extraction, Mammals, Gene expression
Bioinformatics and Systems Biology
Identifiers
URN: urn:nbn:se:kth:diva-150681, DOI: 10.1007/978-3-642-24855-9_13, ISI: 000308508800013, ScopusID: 2-s2.0-80455140577, ISBN: 9783642248542, OAI: oai:DiVA.org:kth-150681, DiVA: diva2:744920
2-4 November 2011, Delft, Netherlands
QC 20140909. 2014-09-09, 2014-09-08, 2014-09-09. Bibliographically approved