Computers & Chemical Engineering, Vol. 121, 99-110, 2019
Wide spectrum feature selection (WiSe) for regression model building
Developing predictive models from industrial datasets requires considering many candidate predictor variables (features). Using all available features for data-driven modelling is not recommended, because most of them are expected to be irrelevant and including them in the model may compromise its robustness and accuracy. In this work, we present, test and compare a new two-stage feature selection method called wide spectrum feature selection for regression (WiSe). In the first stage, a combination of efficient bivariate filters analyzes linear and non-linear association patterns between predictors and responses, screening out clearly noisy features. In the second stage, the reduced set of retained features is subjected to further selection within the scope of the predictive methods considered, in order to optimize their predictive performance. Three simulated datasets and an industrial case study illustrate the effectiveness and benefits of applying WiSe to support model development in a wide range of high-dimensional regression problems. (C) 2018 Elsevier Ltd. All rights reserved.
Keywords: Feature selection; Filtering methods; Predictive analytics; Effect sparsity; Symmetrical uncertainty
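
The abstract does not say which bivariate filters WiSe combines, what thresholds it uses, or how the second stage searches the retained features. As a rough illustration of the two-stage idea only, the Python sketch below makes its own assumptions: stage 1 keeps a feature if either the absolute Pearson correlation (linear association) or a histogram-based symmetrical uncertainty (non-linear association) exceeds a threshold, and stage 2 runs a greedy forward selection for an ordinary least-squares model scored on a holdout set. The function names, the thresholds `r_min` and `su_min`, the bin count, and the synthetic data are all hypothetical and are not taken from the paper.

```python
import numpy as np


def symmetrical_uncertainty(x, y, bins=8):
    """SU = 2 * I(X;Y) / (H(X) + H(Y)), estimated from equal-width bins."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)

    def entropy(p):
        p = p[p > 0]  # treat 0 * log(0) as 0
        return -np.sum(p * np.log2(p))

    hx, hy, hxy = entropy(px), entropy(py), entropy(pxy.ravel())
    denom = hx + hy
    return 0.0 if denom == 0 else 2.0 * (hx + hy - hxy) / denom


def stage1_filter(X, y, r_min=0.2, su_min=0.05):
    """Assumed stage 1: keep a feature if either bivariate filter flags it."""
    keep = []
    for j in range(X.shape[1]):
        r = abs(np.corrcoef(X[:, j], y)[0, 1])    # linear association
        su = symmetrical_uncertainty(X[:, j], y)  # non-linear association
        if r >= r_min or su >= su_min:
            keep.append(j)
    return keep


def holdout_mse(X, y, cols, n_train):
    """Fit OLS (with intercept) on the first n_train rows, score on the rest."""
    A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    coef, *_ = np.linalg.lstsq(A[:n_train], y[:n_train], rcond=None)
    resid = y[n_train:] - A[n_train:] @ coef
    return float(np.mean(resid ** 2))


def stage2_forward(X, y, candidates, n_train):
    """Assumed stage 2: greedy forward selection driven by holdout MSE."""
    selected, remaining = [], list(candidates)
    best = holdout_mse(X, y, selected, n_train)
    while remaining:
        mse, c = min((holdout_mse(X, y, selected + [c], n_train), c)
                     for c in remaining)
        if mse >= best:  # stop when no candidate improves the holdout error
            break
        selected.append(c)
        remaining.remove(c)
        best = mse
    return selected


# Synthetic data (hypothetical): feature 0 acts linearly, feature 1
# quadratically, and the remaining 28 features are pure noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 30))
y = 2.0 * X[:, 0] + 3.0 * X[:, 1] ** 2 + 0.1 * rng.normal(size=1000)

kept = stage1_filter(X, y)
selected = stage2_forward(X, y, kept, n_train=750)
print("retained after stage 1:", kept)
print("selected in stage 2:", selected)
```

In this toy setup the quadratic feature has essentially no linear correlation with the response, so only the symmetrical uncertainty filter can retain it in stage 1. Whether stage 2 then keeps it depends on the predictive method: a purely linear model gains little from it, which is one way to read the abstract's remark that the second stage selects features in the scope of the predictive methods considered.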