Le Thi Hoai An, Le Hoai Minh, Nguyen Van Vinh, Pham Dinh Tao: "A DC Programming approach for Feature Selection in Support Vector Machines learning".

Abstract: Feature selection consists of choosing a subset of the available features that captures the relevant properties of the data. In supervised pattern classification, a good choice of features is fundamental for building compact and accurate classifiers. In this paper, we develop an efficient feature selection method using the zero-norm (l0) in the context of support vector machines (SVMs). Because the zero-norm is discontinuous at the origin, the corresponding optimization problem is difficult to solve. To overcome this drawback, we use a robust DC (difference of convex functions) programming approach, a general framework for non-convex continuous optimization. We consider an appropriate continuous approximation of the zero-norm such that the resulting problem can be formulated as a DC program. Our DC algorithm (DCA) converges after finitely many iterations and requires solving only one linear program per iteration. Computational experiments on standard datasets, including challenging problems from the NIPS 2003 feature selection challenge and gene selection for cancer classification, show that the proposed method is promising: while it suppresses more than 99% of the features in some cases, it still provides good classification. Moreover, the comparative results illustrate the superiority of the proposed approach over standard methods such as classical SVMs and the feature selection concave (FSV) method.
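
The sketch below illustrates the kind of iteration the abstract describes; it is not the paper's exact formulation. It assumes the exponential approximation of the zero-norm, sum_i (1 - exp(-alpha*|w_i|)), familiar from the feature selection concave (FSV) literature, a hinge-loss linear SVM, and scipy for the linear programs. At each DCA-style iteration the concave approximation is linearized at the current point, which reduces the subproblem to a reweighted l1-type linear program. The function name and the parameters alpha and C are illustrative only.

```python
import numpy as np
from scipy.optimize import linprog

def dca_feature_selection_svm(X, y, C=1.0, alpha=5.0, max_iter=20, tol=1e-5):
    """Sketch of a DCA-style loop: linearize the concave zero-norm
    approximation and solve one linear program per iteration."""
    m, n = X.shape
    # variable layout: [w (n) | b (1) | v (n) | xi (m)], with v_i >= |w_i|

    def solve_lp(grad):
        # objective: sum_i grad_i * v_i + C * sum_j xi_j
        c = np.concatenate([np.zeros(n + 1), grad, C * np.ones(m)])

        # classification constraints: y_j (x_j.w + b) >= 1 - xi_j,
        # rewritten as  -y_j x_j.w - y_j b - xi_j <= -1
        A1 = np.hstack([-y[:, None] * X, -y[:, None],
                        np.zeros((m, n)), -np.eye(m)])
        b1 = -np.ones(m)

        # |w_i| <= v_i, rewritten as  w - v <= 0  and  -w - v <= 0
        A2 = np.hstack([np.eye(n), np.zeros((n, 1)), -np.eye(n), np.zeros((n, m))])
        A3 = np.hstack([-np.eye(n), np.zeros((n, 1)), -np.eye(n), np.zeros((n, m))])

        A_ub = np.vstack([A1, A2, A3])
        b_ub = np.concatenate([b1, np.zeros(2 * n)])

        bounds = ([(None, None)] * (n + 1)        # w and b are free
                  + [(0, None)] * (n + m))        # v and xi are nonnegative
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
        z = res.x
        return z[:n], z[n], z[n + 1:2 * n + 1]    # w, b, v

    v = np.ones(n)
    for _ in range(max_iter):
        # gradient of sum_i (1 - exp(-alpha * v_i)) at the current point
        grad = alpha * np.exp(-alpha * v)
        w, b, v_new = solve_lp(grad)
        if np.linalg.norm(v_new - v) < tol:
            v = v_new
            break
        v = v_new

    selected = np.flatnonzero(np.abs(w) > 1e-6)
    return w, b, selected
```

For example, on a small dataset one might call `w, b, selected = dca_feature_selection_svm(X, y)` with labels y in {-1, +1} and inspect `selected` to see which features survive; the actual approximation, stopping rule, and linear program used in the paper may differ.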

 

Keywords: Feature selection, SVM, Nonconvex optimization, DC programming, DCA.

 

Citation: Le Thi Hoai An, Le Hoai Minh, Nguyen Van Vinh, Pham Dinh Tao, "A DC Programming approach for Feature Selection in Support Vector Machines learning", Advances in Data Analysis and Classification, Vol. 2, No. 3, pp. 259-278, 2008.

 

Download link