Electrophoresis, Vol.31, No.14, 2311-2318, 2010
Prediction of metabolite identity from accurate mass, migration time prediction and isotopic pattern information in CE-TOFMS data
CE-TOFMS is a powerful method for profiling charged metabolites. However, the limited availability of metabolite standards hinders the identification of compounds from features detected in CE-TOFMS data sets. To overcome this problem, we developed a method to identify unknown peaks based on the predicted migration time (t(m)) and accurate m/z values. We built a predictive model using 375 standard cationic metabolites and support vector regression. The model yielded good correlations between the predicted and measured t(m) (R = 0.952 and 0.905 using complete and cross-validation data sets, respectively). Using the trained model, we subsequently predicted the t(m) for 2938 metabolites available from public databases and assigned tentative identities to noise-filtered features in human urine samples. While 38.9% of the peaks were assigned metabolite names by matching with the standard library alone, the proportion increased to 52.2% when candidates with predicted t(m) were also considered. The proposed methodology increases the value of metabolomic data sets obtained from CE-TOFMS profiling.
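The matching step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tolerance values (`ppm_tol`, `tm_tol`), the `Candidate` type, the function names, and the candidate entries are all hypothetical, chosen only to show how a detected feature could be compared against database entries by accurate m/z and predicted t(m).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str            # database entry (hypothetical placeholder names below)
    mz: float            # theoretical m/z of the ion
    predicted_tm: float  # migration time predicted by a regression model, in minutes


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def match_feature(mz, tm, candidates, ppm_tol=10.0, tm_tol=0.5):
    """Return candidates within both tolerances, closest migration time first.

    ppm_tol and tm_tol are assumed values for illustration, not taken
    from the paper.
    """
    hits = [
        c for c in candidates
        if abs(ppm_error(mz, c.mz)) <= ppm_tol
        and abs(tm - c.predicted_tm) <= tm_tol
    ]
    return sorted(hits, key=lambda c: abs(tm - c.predicted_tm))


# Hypothetical candidates: A and B share an m/z, so accurate mass alone
# cannot separate them; the predicted migration time can.
CANDIDATES = [
    Candidate("candidate_A", 132.1019, 9.8),
    Candidate("candidate_B", 132.1019, 12.4),
    Candidate("candidate_C", 150.0583, 10.0),
]

if __name__ == "__main__":
    for hit in match_feature(132.1021, 10.0, CANDIDATES):
        print(hit.name, round(ppm_error(132.1021, hit.mz), 2), "ppm")
```

In this sketch, two candidates with the same m/z are separated only by their predicted t(m), which is the kind of ambiguity a migration time model is meant to reduce. A real pipeline would also need the isotopic pattern information mentioned in the title, which is omitted here.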