Journal of Physical Chemistry B, Vol.116, No.10, 3331-3343, 2012
Identification of Domains in Protein Structures from the Analysis of Intramolecular Interactions
The subdivision of protein structures into smaller, independent structural domains is of fundamental importance for understanding protein evolution and function, for developing protein classification methods, and for interpreting experimental data. Given the rapid growth in the number of solved protein structures, the need for new, accurate algorithmic methods has become increasingly urgent. In this paper, we propose a new computational approach based on the concept of a domain as a compact and independent folding unit and on the analysis of the residue-residue interaction energies obtainable through classical all-atom force field calculations. In particular, starting from the nonbonded interaction energy matrix associated with a protein, our method filters out and selects only those specific subsets of interactions that define possible independent folding nuclei within a complex protein structure. This allows different protein fragments to be grouped into energy clusters that are found to correspond to structural domains. The strategy has been tested on appropriate benchmark data sets, and the results show that the new approach is fast and reliable in determining the number of domains in a fully ab initio manner, without using any training set or prior knowledge of the systems under examination. Moreover, by identifying the residues most relevant to the stabilization of each domain, our method may complement the results of other classification techniques and may provide useful information to design and guide new experiments.
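
The general idea of filtering a residue-residue interaction energy matrix and grouping residues into energy clusters can be illustrated with a minimal Python sketch. It is not the procedure used in the paper, whose filtering step is not specified in this abstract. As a stand-in, the sketch keeps only pairs whose energy lies below a hypothetical cutoff and takes the connected components of the resulting graph as clusters. The function name energy_clusters, the cutoff parameter, and the toy matrix values are all assumptions made for illustration.

```python
from collections import deque


def energy_clusters(energy, cutoff):
    """Group residues linked by strongly stabilizing pair interactions.

    energy: symmetric square matrix (list of lists) of pairwise nonbonded
            interaction energies; negative values are stabilizing.
    cutoff: hypothetical threshold (an assumption, not from the paper);
            only pairs with energy <= cutoff are kept.
    """
    n = len(energy)
    # Filtering step: keep only the strongly stabilizing pairs.
    neighbors = [
        [j for j in range(n) if j != i and energy[i][j] <= cutoff]
        for i in range(n)
    ]
    seen = [False] * n
    clusters = []
    for start in range(n):
        if seen[start]:
            continue
        # Breadth-first search collects one connected component.
        seen[start] = True
        queue = deque([start])
        component = []
        while queue:
            i = queue.popleft()
            component.append(i)
            for j in neighbors[i]:
                if not seen[j]:
                    seen[j] = True
                    queue.append(j)
        clusters.append(sorted(component))
    return clusters


# Toy 6-residue matrix with made-up values: residues 0-2 and 3-5 interact
# strongly within each group and only weakly across groups.
toy = [
    [0.0, -5.0, -4.0, -0.2, 0.0, 0.1],
    [-5.0, 0.0, -6.0, 0.0, -0.1, 0.0],
    [-4.0, -6.0, 0.0, -0.3, 0.0, 0.0],
    [-0.2, 0.0, -0.3, 0.0, -5.5, -4.5],
    [0.0, -0.1, 0.0, -5.5, 0.0, -6.5],
    [0.1, 0.0, 0.0, -4.5, -6.5, 0.0],
]
domains = energy_clusters(toy, cutoff=-1.0)
print(domains)  # [[0, 1, 2], [3, 4, 5]]
```

In this toy case the weak cross-group couplings are discarded by the cutoff, so two clusters emerge. A looser cutoff merges them, which shows how sensitive such a simple scheme is to the choice of threshold.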