When: January 29, 1-2 pm
Where: Room B705, Department of Statistics, Stockholm University


Bayesian model comparison can be based on the posterior distribution over the set of compared models. This distribution is often observed to concentrate on a single model even when other measures of model fit or forecasting ability indicate no strong preference. Furthermore, a moderate change in the data sample can easily shift the posterior model probabilities so that they concentrate on another model. To shed more light on the sources of this overconfidence, we derive the sampling variance of the Bayes factor in linear regression. The results show that overconfidence is likely to occur when (i) the compared models give very different approximations of the data-generating process, and (ii) the models are very flexible, with large degrees of freedom that are not shared between the models. In the multivariate setting, overconfidence may be driven by differences in the models' ability to approximate the data-generating process in linear combinations of the response variables that may not be particularly important to the investigator.
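
The sampling variability described above can be illustrated with a small simulation. The sketch below is not the derivation presented in the talk. It assumes Zellner's g-prior with g = n, equal prior model probabilities, and a hypothetical data-generating process in which two non-nested single-predictor models are equally good approximations. The function names and parameter values are chosen purely for illustration.

```python
import numpy as np


def log_bf_vs_null(y, X, g):
    """Log Bayes factor of the regression on X (with intercept) against the
    intercept-only model, using the closed form under Zellner's g-prior."""
    n, p = X.shape
    yc = y - y.mean()
    Xc = X - X.mean(axis=0)
    beta, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
    resid = yc - Xc @ beta
    r2 = 1.0 - (resid @ resid) / (yc @ yc)
    return (0.5 * (n - 1 - p) * np.log1p(g)
            - 0.5 * (n - 1) * np.log1p(g * (1.0 - r2)))


def simulate_log_bf(n=200, reps=500, seed=0):
    """Log Bayes factor of model 1 (y on x1) against model 2 (y on x2)
    over repeated samples from an assumed, symmetric data-generating process."""
    rng = np.random.default_rng(seed)
    out = np.empty(reps)
    for r in range(reps):
        x1 = rng.standard_normal(n)
        x2 = rng.standard_normal(n)
        # Assumed truth: both predictors matter equally, so neither
        # single-predictor model is better on average.
        y = 0.5 * x1 + 0.5 * x2 + rng.standard_normal(n)
        out[r] = (log_bf_vs_null(y, x1[:, None], g=n)
                  - log_bf_vs_null(y, x2[:, None], g=n))
    return out


log_bf = simulate_log_bf()
# Posterior probability of model 1 under equal prior model probabilities.
post_m1 = 1.0 / (1.0 + np.exp(-log_bf))

print("std. dev. of log Bayes factor:", log_bf.std())
print("share of samples with P(M1 | data) outside [0.05, 0.95]:",
      np.mean((post_m1 < 0.05) | (post_m1 > 0.95)))
```

The share printed on the last line is the fraction of simulated samples in which the posterior puts at least 95% of its mass on one of the two models, even though the assumed setup gives no reason to prefer either.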