Journal of the American Chemical Society, Vol.127, No.6, 1665-1674, 2005
Protein NMR recall, precision, and F-measure scores (RPF scores): Structure quality assessment measures based on information retrieval statistics
One of the most important challenges in modern protein NMR is the development of fast and sensitive structure quality assessment measures that can be used to evaluate the "goodness-of-fit" of the 3D structure with NOESY data, and to indicate the correctness of the fold and the accuracy of the resulting structure. Quality assessment is especially critical for automated NOESY interpretation and structure determination approaches. This paper describes new NMR quality assessment scores, including Recall, Precision, and F-measure scores (referred to here as "NMR RPF" scores), which quickly provide global measures of the goodness-of-fit of the 3D structures with NOESY peak lists using methods from information retrieval statistics. The sensitivity of the F-measure is improved using a scaled Fold Discriminating Power (DP) score. These statistical RPF scores are rapid to compute because neither NOE assignments nor complete relaxation matrix calculations are required. A graphical method for site-specific assessment of structure quality based on the Precision statistic is also described. These statistical measures are demonstrated to be valuable for assessing protein NMR structure accuracy. Their relationships to other proposed NMR "R-factors" and structure quality assessment scores are also discussed.
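As a rough illustration of the information retrieval statistics named in the abstract, the following Python sketch computes recall, precision, and the F-measure for two sets of proton-pair contacts. It rests on simplifying assumptions that are not taken from the paper: the NOESY peak list is assumed to have already been reduced to a set of observed proton pairs, and the 3D structure is assumed to yield a set of predicted close pairs (for example, pairs within some distance cutoff). The paper's own scores work from NOESY peak lists without explicit NOE assignments, which this sketch does not attempt to reproduce. The function name `rpf_scores` and the atom labels are hypothetical, and the DP score is not covered.

```python
def rpf_scores(observed, predicted):
    """Return (recall, precision, f_measure) for two collections of contacts.

    observed  -- proton pairs supported by the NOESY data (assumed input)
    predicted -- proton pairs that are close in the 3D structure (assumed input)
    """
    observed = set(observed)
    predicted = set(predicted)

    # Contacts that appear in both the data and the structure.
    true_positives = len(observed & predicted)

    # Recall: fraction of observed contacts explained by the structure.
    recall = true_positives / len(observed) if observed else 0.0
    # Precision: fraction of structure-derived contacts supported by the data.
    precision = true_positives / len(predicted) if predicted else 0.0

    # F-measure: harmonic mean of recall and precision.
    total = recall + precision
    f_measure = 2.0 * recall * precision / total if total else 0.0
    return recall, precision, f_measure


# Hypothetical example data: each contact is an ordered pair of atom labels.
observed_pairs = {
    ("A1.HA", "L5.HB2"),
    ("G2.HN", "V7.HG1"),
    ("K3.HA", "F9.HD1"),
    ("E4.HN", "I8.HB"),
}
predicted_pairs = {
    ("A1.HA", "L5.HB2"),
    ("G2.HN", "V7.HG1"),
    ("K3.HA", "F9.HD1"),
    ("S6.HA", "T10.HB"),
    ("D11.HN", "Y12.HE1"),
}

recall, precision, f_measure = rpf_scores(observed_pairs, predicted_pairs)
print(f"recall={recall:.3f} precision={precision:.3f} F={f_measure:.3f}")
```

In this toy case, three of the four observed contacts are matched by the structure and three of the five predicted contacts are supported by the data, so recall and precision differ and the F-measure falls between them.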