A New Gene Selection Algorithm using Fuzzy-Rough Set Theory for Tumor Classification

Seyedeh Faezeh Farahbakhshian, Milad Taleby Ahvanooey

Abstract


In statistics and machine learning, feature selection is the process of picking a subset of relevant attributes for utilizing in a predictive model. Recently, rough set-based feature selection techniques, that employ feature dependency to perform selection process, have been drawn attention. Classification of tumors based on gene expression is utilized to diagnose proper treatment and prognosis of the disease in bioinformatics applications. Microarray gene expression data includes superfluous feature genes of high dimensionality and smaller training instances. Since exact supervised classification of gene expression instances in such high-dimensional problems is very complex, the selection of appropriate genes is a crucial task for tumor classification. In this study, we present a new technique for gene selection using a discernibility matrix of fuzzy-rough sets. The proposed technique takes into account the similarity of those instances that have the same class labels to improve the gene selection results, while the state-of-the art previous approaches only address the similarity of instances with different class labels. To meet that requirement, we extend the Johnson reducer technique into the fuzzy case. Experimental results demonstrate that this technique provides better efficiency compared to the state-of-the-art approaches.

Keywords


tumor classification; gene selection; fuzzy-rough theory; discernibility matrix; Johnson reducer

Full Text: PDF