Energy & Fuels, Vol.21, No.6, 3406-3409, 2007
Classification of gasoline grades using compositional data and expectation-maximization algorithm
This work demonstrates the application of an expectation-maximization (EM) algorithm to the classification of gasoline samples of different commercial grades, based on gas chromatography (GC) and gas chromatography-mass spectrometry (GC-MS) compositional data. Classification was based on an "optimal" subset of compositional variables, identified by a variable reduction method that preserved the multivariate structure of the data. The EM algorithm was then applied to this variable subset to determine the parameters of the Gaussian model that best described the data. The methodology was first evaluated on published GC-MS data for 88 Canadian gasoline samples, and the results were compared with those previously reported in the literature. It was subsequently tested on GC data from 74 Greek gasoline samples analyzed in our laboratory. The combination of variable reduction with the EM algorithm proved to be a successful and reliable classification tool for gasoline samples of different commercial grades (premium, regular, winter, and summer) in both data sets.
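As a rough illustration of the EM step described in the abstract, the sketch below fits a two-component Gaussian mixture to a single variable and assigns each sample to the component with the highest posterior score. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract describes a multivariate model on a reduced variable subset, whereas this example is univariate, and the function names (`gaussian_pdf`, `em_gmm_1d`, `classify`), the initial means, and the toy values in `xs` are all hypothetical and are not taken from the Canadian or Greek data sets. A real application would likely work in log space to avoid numerical underflow.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a univariate normal distribution at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_gmm_1d(xs, means, n_iter=50, min_var=1e-6):
    """Fit a univariate Gaussian mixture by EM, starting from the given means."""
    k = len(means)
    means = list(means)
    variances = [1.0] * k      # assumed starting variances
    weights = [1.0 / k] * k    # assumed equal starting mixing weights
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each sample.
        resp = []
        for x in xs:
            dens = [w * gaussian_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances from the posteriors.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(xs)
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, min_var)  # floor keeps the density finite
    return weights, means, variances

def classify(x, weights, means, variances):
    """Return the index of the component with the largest weighted density."""
    scores = [w * gaussian_pdf(x, m, v)
              for w, m, v in zip(weights, means, variances)]
    return scores.index(max(scores))

# Synthetic toy values for one hypothetical compositional variable
# (two well-separated groups; NOT data from the paper).
xs = [1.0, 1.2, 0.8, 1.1, 0.9, 5.0, 5.2, 4.8, 5.1, 4.9]
weights, means, variances = em_gmm_1d(xs, means=[0.0, 6.0])
labels = [classify(x, weights, means, variances) for x in xs]
```

In this toy setting the two fitted components play the role of two grades. With more variables, each component would carry a mean vector and a covariance matrix in place of the scalar mean and variance used here.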