Journal of Physical Chemistry B, Vol.124, No.37, 8032-8041, 2020
Rosetta Machine Learning Models Accurately Classify Positional Effects of Thioamides on Proteolysis
Thioamide substitutions of the peptide backbone have been shown to stabilize therapeutic and imaging peptides toward proteolysis. To rationally design thioamide modifications, we have developed a novel Rosetta custom score function to classify thioamide positional effects on proteolysis in substrates of serine and cysteine proteases. Peptides of interest were docked into proteases using the FlexPepDock application in Rosetta. Docked complexes were modified to contain thioamides, which were parametrized by creating custom atom types in Rosetta based on ab initio simulations. Thioamide complexes were simulated, and the decomposed values of the Rosetta score function for the resulting structural complexes served as features for machine learning classification. An ensemble majority-voting model was developed to be a robust predictor of previously unpublished thioamide proteolysis holdout data. Theoretical control simulations with pseudo-atoms that modulate only one physical characteristic of the thioamide show differential effects on prediction accuracy by the optimized voting classification model. These pseudo-atom model simulations, as well as statistical analyses of the full thioamide simulations, implicate steric effects on peptide binding as being primarily responsible for thioamide positional effects on proteolytic resistance.
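The ensemble majority-voting step described above can be illustrated with a minimal sketch. It is not the authors' model: the base classifiers here are simple threshold rules, the cutoffs and feature values are invented for illustration, and the helper names (`threshold_rule`, `majority_vote`) are hypothetical. The feature keys borrow the names of standard Rosetta score terms only to suggest what decomposed score values might look like as inputs.

```python
from collections import Counter
from typing import Callable, Dict, Sequence

# A feature vector maps a score-term name to its value for one docked complex.
Features = Dict[str, float]
# A base classifier maps features to a binary label
# (assumed here: 1 = thioamide at this position impairs proteolysis, 0 = it does not).
Classifier = Callable[[Features], int]


def threshold_rule(term: str, cutoff: float) -> Classifier:
    """Build a toy base classifier that predicts 1 when `term` exceeds `cutoff`."""
    def predict(features: Features) -> int:
        return 1 if features[term] > cutoff else 0
    return predict


def majority_vote(classifiers: Sequence[Classifier], features: Features) -> int:
    """Return the label predicted by most base classifiers (hard voting)."""
    votes = Counter(clf(features) for clf in classifiers)
    return votes.most_common(1)[0][0]


# Hypothetical ensemble: an odd number of voters avoids ties for binary labels.
ensemble = [
    threshold_rule("fa_rep", 5.0),        # steric repulsion (illustrative cutoff)
    threshold_rule("fa_atr", -20.0),      # attractive van der Waals (illustrative cutoff)
    threshold_rule("hbond_bb_sc", -1.0),  # backbone-sidechain H-bonds (illustrative cutoff)
]

# Invented score-term values for two simulated complexes.
complex_a = {"fa_rep": 8.2, "fa_atr": -25.0, "hbond_bb_sc": -0.4}
complex_b = {"fa_rep": 2.1, "fa_atr": -30.0, "hbond_bb_sc": -2.5}

label_a = majority_vote(ensemble, complex_a)
label_b = majority_vote(ensemble, complex_b)
```

In practice the base classifiers would be trained models rather than fixed thresholds; the voting logic itself stays the same.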