Automatica, Vol. 41, No. 4, 693-700, 2005
Regressor selection with the analysis of variance method - Brief paper
Identification of non-linear dynamical models of a black box nature involves structure decisions, i.e., which regressors to use and the selection of a regressor function, as well as the estimation of the parameters involved. The typical approach in system identification seems to be to mix all these steps, which for example means that the selection of regressors is based on the fit that is achieved for different choices. Alternatively, one could then interpret the regressor selection as based on hypothesis tests (F-tests) at a certain confidence level that depends on the data. It would in many cases be desirable to decide which regressors to use independently of the other steps. In this paper we investigate what the well-known method of analysis of variance (ANOVA) can offer for this problem. System identification applications violate many of the ideal conditions for which ANOVA was designed, and we study how the method performs under such non-ideal conditions. ANOVA is much faster than a typical parametric estimation method using, e.g., neural networks. It is actually also more reliable, in our tests, in picking the correct structure even under non-ideal conditions. One reason for this may be that ANOVA requires the data set to be balanced, that is, all parts of the regressor space are weighted equally. Just applying tests of fit to the recorded data may give, for structure identification, improper weight to areas with many, or few, samples. © 2004 Elsevier Ltd. All rights reserved.
Keywords: structural properties; time lag; structure identification; non-linear models; analysis of variance
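
To make the idea in the abstract concrete, the following is a minimal sketch of how a balanced two-way ANOVA could rank two candidate regressors. It is not the authors' implementation. The function name `balanced_two_way_anova`, the candidate regressors `u(t-1)` and `u(t-2)`, the three quantization levels, the test system `y = u(t-1)^3 + u(t-1) + noise`, and the noise level are all assumptions chosen for illustration. The data are generated directly on a balanced grid, whereas recorded data would first have to be quantized and subsampled to equal cell counts.

```python
import random
from itertools import product
from statistics import mean


def balanced_two_way_anova(cells, names=("A", "B")):
    """Return F statistics for a balanced two-factor design.

    cells maps (level_a, level_b) -> list of n output samples recorded
    while the two candidate regressors were at those (quantized) levels.
    """
    a_levels = sorted({i for i, _ in cells})
    b_levels = sorted({j for _, j in cells})
    a, b = len(a_levels), len(b_levels)
    n = len(next(iter(cells.values())))
    # Balance check: every cell of the grid exists and holds n samples.
    if len(cells) != a * b or any(len(v) != n for v in cells.values()):
        raise ValueError("design must be balanced")
    if a < 2 or b < 2 or n < 2:
        raise ValueError("need >= 2 levels per factor and >= 2 samples per cell")

    grand = mean(y for v in cells.values() for y in v)
    cell_mean = {k: mean(v) for k, v in cells.items()}
    a_mean = {i: mean(cell_mean[i, j] for j in b_levels) for i in a_levels}
    b_mean = {j: mean(cell_mean[i, j] for i in a_levels) for j in b_levels}

    # Sums of squares for main effects, interaction, and residual.
    ss_a = b * n * sum((a_mean[i] - grand) ** 2 for i in a_levels)
    ss_b = a * n * sum((b_mean[j] - grand) ** 2 for j in b_levels)
    ss_ab = n * sum(
        (cell_mean[i, j] - a_mean[i] - b_mean[j] + grand) ** 2 for i, j in cells
    )
    ss_e = sum((y - cell_mean[k]) ** 2 for k, v in cells.items() for y in v)

    ms_e = ss_e / (a * b * (n - 1))  # residual mean square
    return {
        names[0]: ss_a / (a - 1) / ms_e,
        names[1]: ss_b / (b - 1) / ms_e,
        "interaction": ss_ab / ((a - 1) * (b - 1)) / ms_e,
    }


# Hypothetical test system: the output depends on u(t-1) only.
rng = random.Random(0)
levels = [-1.0, 0.0, 1.0]
cells = {
    (x1, x2): [x1 ** 3 + x1 + rng.gauss(0.0, 0.1) for _ in range(4)]
    for x1, x2 in product(levels, levels)
}
f_stats = balanced_two_way_anova(cells, names=("u(t-1)", "u(t-2)"))
for name, f_value in f_stats.items():
    print(f"{name}: F = {f_value:.1f}")
```

A large F statistic for `u(t-1)` together with small values for `u(t-2)` and the interaction would suggest keeping only `u(t-1)`. Turning the F statistics into a decision at a given confidence level requires the F distribution, for instance via SciPy's `scipy.stats.f.sf`, which is left out here to keep the sketch dependency-free.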