Bayesian Model Selection Helps To Choose Objectively between Thermodynamic Models: A Demonstration of Selecting a Viscosity Model Based on Entropy Scaling

Lotgering-Lin O; Schoniger A; Nowak W; Gross J

Industrial & Engineering Chemistry Research, Vol.55, No.38, 10191-10207, 2016

DOI: 10.1021/acs.iecr.6b02671

Objective measures to compare the adequacy of models can be very useful to guide the development of thermodynamic models. Thermodynamicists are frequently faced with the so-called bias-variance dilemma, where one model may be less accurate in correlating experimental data but more robust in extrapolations than another model. In this work, we use Bayesian model selection (BMS) to identify the optimal balance between bias and variance. BMS is a statistically rigorous procedure that elegantly accounts for experimental errors and implicitly performs a bias-variance tradeoff in model selection. We present a first-time application of BMS to thermodynamic model selection. As an example, we consider modeling approaches to predict viscosities using Rosenfeld's entropy scaling approach [Y. Rosenfeld, Phys. Rev. A 1977, 15, 2545-2549]. Our goal is to objectively rank the adequacy of three competing model variants that all describe the functional dependence of viscosity on residual entropy: the well-established linear regression approach, a recently introduced polynomial approach [O. Lotgering-Lin, J. Gross, Ind. Eng. Chem. Res. 2015, 54, 7942-7952], and a sinusoidal approach. We investigate the suitability of the models for extrapolating viscosities to different pressures and to different carbon chain lengths. Technically, we implement, in a first step, a Markov chain Monte Carlo algorithm to train the competing models on a common dataset. In a second step, we determine the statistical evidence for each model in light of an evaluation dataset with brute-force Monte Carlo integration. We call this method of implementation "two-step BMS". Results show that, generally, both nonlinear models outperform the linear model, with the polynomial approach being much more reliable for carbon chain length extrapolation. However, assumptions about experimental error influence the choice of the most appropriate model candidate. Hence, we point out the benefits of applying two-step BMS based on specific datasets and specific error assumptions as a situation-specific guide to model selection.
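
The two-step procedure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes synthetic data, two hypothetical toy candidates (a straight line and a quadratic polynomial) in place of the entropy-scaling viscosity models, a flat prior, a known Gaussian measurement error `SIGMA`, and a plain random-walk Metropolis sampler. Step 1 samples the parameter posterior from a training set; step 2 averages the likelihood of an evaluation set over those samples, which is a brute-force Monte Carlo estimate of the evidence with the step-1 posterior acting as the prior.

```python
import numpy as np

rng = np.random.default_rng(0)
SIGMA = 0.05  # assumed known measurement error (hypothetical value)

# Hypothetical toy candidates, NOT the entropy-scaling models of the paper:
# name -> (number of parameters, model function)
MODELS = {
    "linear": (2, lambda x, t: t[0] + t[1] * x),
    "polynomial": (3, lambda x, t: t[0] + t[1] * x + t[2] * x**2),
}

def truth(x):
    """Synthetic 'true' relation used only to generate illustrative data."""
    return 1.0 + 0.5 * x + 2.0 * x**2

# Training data on [0, 1]; evaluation data probe extrapolation beyond it.
x_train = np.linspace(0.0, 1.0, 30)
y_train = truth(x_train) + SIGMA * rng.standard_normal(x_train.size)
x_eval = np.linspace(1.1, 1.5, 10)
y_eval = truth(x_eval) + SIGMA * rng.standard_normal(x_eval.size)

def log_likelihood(theta, x, y, model):
    """Gaussian log-likelihood with known error SIGMA."""
    resid = (y - model(x, theta)) / SIGMA
    return -0.5 * np.sum(resid**2) - y.size * np.log(SIGMA * np.sqrt(2 * np.pi))

def metropolis(x, y, model, n_par, n_steps=20000, step=0.02):
    """Step 1: random-walk Metropolis sampling under a flat prior."""
    theta = np.zeros(n_par)
    logp = log_likelihood(theta, x, y, model)
    chain = np.empty((n_steps, n_par))
    for i in range(n_steps):
        proposal = theta + step * rng.standard_normal(n_par)
        logp_new = log_likelihood(proposal, x, y, model)
        if np.log(rng.uniform()) < logp_new - logp:
            theta, logp = proposal, logp_new
        chain[i] = theta
    return chain[n_steps // 2:]  # discard the first half as burn-in

def log_evidence(samples, x, y, model):
    """Step 2: Monte Carlo average of the evaluation-data likelihood."""
    logl = np.array([log_likelihood(t, x, y, model) for t in samples])
    m = logl.max()  # log-sum-exp shift for numerical stability
    return m + np.log(np.mean(np.exp(logl - m)))

log_z = {}
for name, (n_par, model) in MODELS.items():
    posterior = metropolis(x_train, y_train, model, n_par)
    log_z[name] = log_evidence(posterior, x_eval, y_eval, model)

# Posterior model weights, assuming equal prior model probabilities.
z = np.array(list(log_z.values()))
w = np.exp(z - z.max())
weights = dict(zip(log_z, w / w.sum()))

for name in MODELS:
    print(f"{name}: log evidence = {log_z[name]:.1f}, weight = {weights[name]:.3f}")
```

Because the synthetic data are generated from a quadratic relation, the polynomial candidate should receive the higher evidence in this toy setting. Changing `SIGMA` changes the evidences, which mirrors the abstract's point that error assumptions influence the ranking.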