Journal of Process Control, Vol.67, 160-175, 2018
Data mining and clustering in chemical process databases for monitoring and knowledge discovery
Modern chemical plants maintain large historical databases of past sensor measurements. Advanced process monitoring techniques analyze these databases to help plant operators and engineers interpret live process trends. However, many of the best process monitoring methods require data organized into groups before training is possible. In practice, such organization rarely exists, and the time required to create classified training data is an obstacle to the use of advanced process monitoring strategies. Data mining and knowledge discovery techniques drawn from the computer science literature can help engineers find fault states in historical databases and group them together with little detailed knowledge of the process. This study evaluates how several data clustering and feature extraction techniques work together to reveal useful trends in industrial chemical process data. Two studies, one on an industrial-scale separation tower and one on the Tennessee Eastman process simulation, demonstrate that data clustering and feature extraction effectively reveal significant process trends in high-dimensional, multivariate data. Process knowledge and supervised clustering metrics are used to compare the cluster results against the true labels in the data and so assess the performance of different combinations of dimensionality reduction and data clustering approaches. (C) 2017 Elsevier Ltd. All rights reserved.
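
The workflow summarized in the abstract (reduce the dimensionality of the sensor data, cluster the reduced data, then score the clusters against known labels) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name the specific techniques, so PCA, k-means, and cluster purity stand in for "feature extraction", "data clustering", and a "supervised clustering metric". The data are synthetic, not from the separation tower or the Tennessee Eastman simulation, and all function names are hypothetical.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Standardize each variable, then project onto the leading principal components."""
    Xc = (X - X.mean(axis=0)) / X.std(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def farthest_point_init(X, k):
    """Deterministic seeding: start at the first sample, then repeatedly
    pick the sample farthest from all centers chosen so far."""
    idx = [0]
    d = ((X - X[0]) ** 2).sum(axis=1)
    for _ in range(1, k):
        nxt = int(d.argmax())
        idx.append(nxt)
        d = np.minimum(d, ((X - X[nxt]) ** 2).sum(axis=1))
    return X[idx].copy()

def kmeans(X, k, n_iter=100):
    """Plain Lloyd's algorithm; returns one cluster label per sample."""
    centers = farthest_point_init(X, k)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Keep the old center if a cluster happens to become empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def purity(true_labels, cluster_labels):
    """Fraction of samples that belong to the majority true class of their cluster."""
    matched = 0
    for c in np.unique(cluster_labels):
        _, counts = np.unique(true_labels[cluster_labels == c], return_counts=True)
        matched += counts.max()
    return matched / len(true_labels)

# Synthetic stand-in for a historical database (assumption): 8 "sensors",
# one normal state and two fault states that each shift a different sensor group.
rng = np.random.default_rng(0)
shifts = np.zeros((3, 8))
shifts[1, :4] = 10.0
shifts[2, 4:] = 10.0
true_states = np.repeat(np.arange(3), 60)
X = shifts[true_states] + rng.normal(size=(180, 8))

scores = pca_reduce(X, n_components=2)    # feature extraction
labels = kmeans(scores, k=3)              # data clustering
score = purity(true_states, labels)       # supervised clustering metric
print(f"purity = {score:.3f}")
```

Purity needs true labels, so a metric of this kind is only suited to benchmarking combinations of methods on data whose states are already known, which is the role the abstract gives to supervised clustering metrics. Swapping `pca_reduce` or `kmeans` for another technique and re-scoring is how different combinations would be compared in this sketch.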