Industrial & Engineering Chemistry Research, Vol. 59, No. 44, pp. 19623-19632, 2020
Process Data Visualization Using Bikernel t-Distributed Stochastic Neighbor Embedding
This paper describes a visualization approach for process fault detection based on bikernel t-distributed stochastic neighbor embedding (bikernel t-SNE). Bikernel t-SNE retains the dimension-reduction ability of basic t-SNE while enabling explicit out-of-sample extension. First, the explicit projections between high-dimensional data and low-dimensional features are approximated by a linear combination of their Gaussian functions. Second, the feature kernel matrix is transformed into two latent variables through entropy component analysis (ECA). With the bikernel mapping and ECA, outliers are plotted away from the inliers in two-dimensional (2D) scatter plots. The squared Mahalanobis distance then serves as the fault detection index, and its confidence boundary appears as an ellipse in the 2D map. The performance of the proposed method is demonstrated through two case studies on the Tennessee Eastman process as well as a real recombinant human granulocyte colony-stimulating factor production process.
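The last step of the abstract, a squared Mahalanobis distance index with an elliptical confidence boundary in the 2D map, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the two latent variables are already available as an `(n, 2)` array, and it assumes a chi-square control limit with two degrees of freedom, which the abstract does not specify. The names `fit_detector`, `squared_mahalanobis`, and `confidence_ellipse` are hypothetical.

```python
import numpy as np


def fit_detector(features_2d, confidence=0.99):
    """Estimate mean, covariance and control limit from normal-operation 2D features."""
    mean = features_2d.mean(axis=0)
    cov = np.cov(features_2d, rowvar=False)
    cov_inv = np.linalg.inv(cov)
    # Assumption: the index follows a chi-square distribution with 2 degrees
    # of freedom, whose quantile has the closed form -2 * ln(1 - confidence).
    limit = -2.0 * np.log(1.0 - confidence)
    return mean, cov, cov_inv, limit


def squared_mahalanobis(points, mean, cov_inv):
    """Squared Mahalanobis distance of each row of `points` from `mean`."""
    centered = np.atleast_2d(points) - mean
    return np.einsum("ij,jk,ik->i", centered, cov_inv, centered)


def confidence_ellipse(mean, cov, limit, n_points=200):
    """Points on the ellipse where the squared Mahalanobis distance equals `limit`."""
    eigvals, eigvecs = np.linalg.eigh(cov)
    theta = np.linspace(0.0, 2.0 * np.pi, n_points)
    unit_circle = np.stack([np.cos(theta), np.sin(theta)])  # shape (2, n_points)
    # Stretch the unit circle along the principal axes, then rotate and shift.
    boundary = eigvecs @ (np.sqrt(limit * eigvals)[:, None] * unit_circle)
    return boundary.T + mean
```

A sample would be flagged as faulty when `squared_mahalanobis(sample, mean, cov_inv)` exceeds `limit`, which is equivalent to the sample falling outside the curve returned by `confidence_ellipse` in the scatter plot.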