Computers & Chemical Engineering, Vol. 20, No. 2, pp. 175-186, 1996
Cross-Validated Structure Selection for Neural Networks
Model design involves the generation and selection of a model that accurately describes the process, as well as a reliable assessment of the quality of this model. Using neural networks as models simplifies the model-generation step but complicates the selection step, owing to the large number of candidate models and the large number of parameters in each of them. In process-engineering applications, where measured data are expensive and therefore limited, the design of neural-network models is even more difficult, because selection becomes harder as the amount of available data decreases. This work is concerned with selecting the network model that best represents the process on the basis of the limited available data. Three methods for model selection are compared on a simulation example, using feedback neural networks as models. The first method uses a static split of the available data into a training set and a test set; the second, cross-validation, uses a dynamic split of the data; and the third uses a statistical evaluation without splitting the data. The comparison reveals that, in the realistic situation of limited data size typical of process-engineering applications, only the cross-validation approach accurately selects the best model and reliably assesses its quality. The statistical evaluation without data splitting yields misleading results, choosing a severely suboptimal network as the best one. None of the various possible static splits reliably selects the best model, and the frequently used static split of the available data into two approximately equal-sized sets for training and testing performs worst among the investigated splits. Furthermore, the results show that a partially trained, over-complex network model, which is often employed as a shortcut to optimal network selection, may yield a significantly inferior process representation compared with the optimally complex network trained without premature termination.
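To make the compared selection strategies concrete, the sketch below contrasts a single static train/test split with k-fold cross-validation for choosing the number of hidden units of a small network. It is an illustrative example only, not the authors' implementation: the scikit-learn calls, the synthetic data, the candidate hidden-layer sizes, and the 5-fold setting are all assumptions introduced here.

```python
# Illustrative sketch (not from the paper): selecting the number of hidden
# units by a single static split versus k-fold cross-validation.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import KFold, train_test_split, cross_val_score

rng = np.random.default_rng(0)

# Synthetic "process" data with a deliberately small sample size,
# mimicking the limited-data setting discussed in the abstract.
X = rng.uniform(-1.0, 1.0, size=(60, 2))
y = np.sin(3.0 * X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.05 * rng.standard_normal(60)

candidate_sizes = [1, 2, 4, 8, 16]  # assumed candidate hidden-layer sizes

# Method 1: static split (one fixed train/test partition, here 50/50).
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)
static_scores = {}
for n in candidate_sizes:
    net = MLPRegressor(hidden_layer_sizes=(n,), max_iter=5000, random_state=0)
    net.fit(X_tr, y_tr)
    static_scores[n] = net.score(X_te, y_te)  # R^2 on the held-out half

# Method 2: k-fold cross-validation (dynamic split; every point is tested once).
cv = KFold(n_splits=5, shuffle=True, random_state=0)
cv_scores = {}
for n in candidate_sizes:
    net = MLPRegressor(hidden_layer_sizes=(n,), max_iter=5000, random_state=0)
    cv_scores[n] = cross_val_score(net, X, y, cv=cv).mean()

print("static split picks    :", max(static_scores, key=static_scores.get))
print("cross-validation picks:", max(cv_scores, key=cv_scores.get))
```

With data sets this small, the ranking produced by the static split can depend strongly on which points happen to fall into the test set, whereas averaging the error over all cross-validation folds gives a more stable basis for selecting the network structure, in line with the comparison reported in the paper.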