Computers & Chemical Engineering, Vol. 20, No. 9, pp. 1133-1140, 1996
Modified Quasi-Newton Methods for Training Neural Networks
The backpropagation algorithm is the most popular procedure for training self-learning feedforward neural networks. However, its convergence is slow, since it is essentially a steepest descent method. Several researchers have proposed other approaches to improve convergence: conjugate gradient methods, dynamic modification of the learning parameters, quasi-Newton or Newton methods, stochastic methods, etc. Quasi-Newton methods have been criticized because they require significant computation time and memory to update the Hessian matrix, limiting their use to medium-sized problems. This paper proposes three variants of the classical quasi-Newton approach that take the structure of the network into account. By neglecting some second-order interactions, the sizes of the resulting approximate Hessian matrices are no longer proportional to the square of the total number of weights in the network but depend instead on the number of neurons in each layer. The modified quasi-Newton methods are tested on two examples and compared with classical approaches such as regular quasi-Newton methods, backpropagation and conjugate gradient methods. The numerical results show that one of these approaches, named BFGS-N, yields a clear gain in computational time over the traditional methods on large-scale problems, without requiring large memory space.
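To illustrate the general idea of a structured quasi-Newton update, the following minimal sketch maintains one small BFGS inverse-Hessian approximation per neuron (over that neuron's incoming weights) rather than one large matrix over all weights, so storage scales with the number of neurons per layer instead of the square of the total weight count. This is only an assumed illustration of the block-structured principle, not the paper's exact BFGS-N algorithm; the toy regression data, network sizes, fixed step length and all variable names are assumptions introduced here.

```python
# Illustrative sketch only (assumed formulation, not the paper's exact BFGS-N).
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy regression task: approximate sin(x).
X = rng.uniform(-2.0, 2.0, size=(64, 1))
Y = np.sin(X)

n_in, n_hid, n_out = 1, 8, 1
W1 = rng.normal(scale=0.5, size=(n_in, n_hid))   # incoming weights of hidden neurons
W2 = rng.normal(scale=0.5, size=(n_hid, n_out))  # incoming weights of output neuron

def loss_and_grads(W1, W2):
    """Forward pass plus backpropagated gradients, column j = neuron j of the layer."""
    H = np.tanh(X @ W1)
    err = H @ W2 - Y
    loss = 0.5 * np.mean(err ** 2)
    d_out = err / len(X)
    g2 = H.T @ d_out                       # gradient w.r.t. W2
    d_h = (d_out @ W2.T) * (1.0 - H ** 2)
    g1 = X.T @ d_h                         # gradient w.r.t. W1
    return loss, g1, g2

def bfgs_update(Hinv, s, y):
    """Standard BFGS update of one neuron's small inverse-Hessian approximation."""
    sy = float(s @ y)
    if sy <= 1e-10:                        # skip if the curvature condition fails
        return Hinv
    rho = 1.0 / sy
    I = np.eye(len(s))
    V = I - rho * np.outer(s, y)
    return V @ Hinv @ V.T + rho * np.outer(s, s)

# One (fan_in x fan_in) matrix per neuron: storage grows with the number of
# neurons in each layer, not with the square of the total number of weights.
Hinv1 = [np.eye(n_in) for _ in range(n_hid)]
Hinv2 = [np.eye(n_hid) for _ in range(n_out)]

loss, g1, g2 = loss_and_grads(W1, W2)
lr = 0.5                                   # fixed step for brevity; a line search is more usual
for step in range(300):
    # Quasi-Newton search directions, neglecting cross-neuron second-order terms.
    P1 = np.column_stack([-Hinv1[j] @ g1[:, j] for j in range(n_hid)])
    P2 = np.column_stack([-Hinv2[j] @ g2[:, j] for j in range(n_out)])
    W1n, W2n = W1 + lr * P1, W2 + lr * P2
    new_loss, ng1, ng2 = loss_and_grads(W1n, W2n)
    for j in range(n_hid):                 # per-neuron BFGS updates, hidden layer
        Hinv1[j] = bfgs_update(Hinv1[j], lr * P1[:, j], ng1[:, j] - g1[:, j])
    for j in range(n_out):                 # per-neuron BFGS updates, output layer
        Hinv2[j] = bfgs_update(Hinv2[j], lr * P2[:, j], ng2[:, j] - g2[:, j])
    W1, W2, g1, g2, loss = W1n, W2n, ng1, ng2, new_loss

print(f"final mean-squared error: {2 * loss:.5f}")
```

Keeping the second-order information blocked per neuron is what makes the memory footprint depend on layer widths rather than on the full weight count; the trade-off is that curvature interactions between neurons (and between layers) are ignored, which is the approximation the paper's variants exploit.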