TY - JOUR
T1 - DWFS: A Wrapper Feature Selection Tool Based on a Parallel Genetic Algorithm
AU - Soufan, Othman
AU - Kleftogiannis, Dimitrios A.
AU - Kalnis, Panos
AU - Bajic, Vladimir B.
N1 - KAUST Repository Item: Exported on 2020-10-01
PY - 2015/2/26
Y1 - 2015/2/26
N2 - Many scientific problems can be formulated as classification tasks. Data that harbor relevant information are usually described by a large number of features. Frequently, many of these features are irrelevant for the class prediction. The efficient implementation of classification models requires identification of suitable combinations of features. A smaller number of features reduces the problem's dimensionality and may result in higher classification performance. We developed DWFS, a web-based tool that allows for efficient selection of features for a variety of problems. DWFS follows the wrapper paradigm and applies a search strategy based on Genetic Algorithms (GAs). A parallel GA implementation simultaneously examines and evaluates a large number of candidate collections of features. DWFS also integrates various filtering methods that may be applied as a pre-processing step in the feature selection process. Furthermore, weights and parameters in the fitness function of the GA can be adjusted according to the application requirements. Experiments using heterogeneous datasets from different biomedical applications demonstrate that DWFS is fast and leads to a significant reduction of the number of features without sacrificing performance as compared to several widely used existing methods. DWFS can be accessed online at www.cbrc.kaust.edu.sa/dwfs.
AB - Many scientific problems can be formulated as classification tasks. Data that harbor relevant information are usually described by a large number of features. Frequently, many of these features are irrelevant for the class prediction. The efficient implementation of classification models requires identification of suitable combinations of features. A smaller number of features reduces the problem's dimensionality and may result in higher classification performance. We developed DWFS, a web-based tool that allows for efficient selection of features for a variety of problems. DWFS follows the wrapper paradigm and applies a search strategy based on Genetic Algorithms (GAs). A parallel GA implementation simultaneously examines and evaluates a large number of candidate collections of features. DWFS also integrates various filtering methods that may be applied as a pre-processing step in the feature selection process. Furthermore, weights and parameters in the fitness function of the GA can be adjusted according to the application requirements. Experiments using heterogeneous datasets from different biomedical applications demonstrate that DWFS is fast and leads to a significant reduction of the number of features without sacrificing performance as compared to several widely used existing methods. DWFS can be accessed online at www.cbrc.kaust.edu.sa/dwfs.
UR - http://hdl.handle.net/10754/346688
UR - http://dx.plos.org/10.1371/journal.pone.0117988
UR - http://www.scopus.com/inward/record.url?scp=84923828257&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0117988
DO - 10.1371/journal.pone.0117988
M3 - Article
C2 - 25719748
SN - 1932-6203
VL - 10
SP - e0117988
JO - PLoS ONE
JF - PLoS ONE
IS - 2
ER -