Walsh Linda
Institute of Radiation Protection, GSF National Center for Environment and Health, Ingolstädter Landstrasse 1, 85764, Neuherberg, Germany.
Radiat Environ Biophys. 2007 Aug;46(3):205-13. doi: 10.1007/s00411-007-0109-0. Epub 2007 Apr 28.
A common statistical challenge, widespread across many areas of research, is the selection of a preferred model to describe the main features and trends in a particular data set. The objective of model selection is to balance the quality of fit to the data against the complexity and predictive ability of the model achieving that fit. Several model selection techniques are reviewed here, including two information criteria that aim to determine which set of model parameters the data best support. The techniques rely on computing the probabilities of the different models, given the data, rather than considering the allowed values of the fitted parameters. Such information criteria have only recently been applied in radiation epidemiology, even though they have a longer tradition of application in other areas of research. The purpose of this review is to make two information criteria more accessible by detailing how to calculate them in practice and how to interpret the resulting values. This aim is supported by examples involving risk models for radiation-induced solid cancer mortality fitted to the epidemiological data from the Japanese A-bomb survivors. These examples illustrate that the Bayesian information criterion is particularly useful in concluding that the weight of evidence is in favour of excess relative risk models that depend on age at exposure and excess relative risk models that depend on attained age.
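As a rough illustration of how such criteria are calculated and compared, the sketch below uses the textbook definitions of the Bayesian information criterion (BIC) and, as an assumption, the Akaike information criterion (AIC); the abstract names only the former. The model names, log-likelihoods, parameter counts and sample size are hypothetical values chosen for illustration and are not results from the A-bomb survivor data.

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion: k ln(n) - 2 ln(L)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical candidate models: (maximised log-likelihood, number of fitted parameters).
# These numbers are invented for illustration only.
candidates = {
    "model_A": (-1000.0, 3),
    "model_B": (-998.5, 4),  # one extra parameter, slightly better fit
}
n_obs = 500  # hypothetical number of observations

scores = {
    name: {"aic": aic(ll, k), "bic": bic(ll, k, n_obs)}
    for name, (ll, k) in candidates.items()
}

# For both criteria, the model with the lower value is preferred.
best_by_aic = min(scores, key=lambda name: scores[name]["aic"])
best_by_bic = min(scores, key=lambda name: scores[name]["bic"])

for name, s in scores.items():
    print(f"{name}: AIC = {s['aic']:.2f}, BIC = {s['bic']:.2f}")
print("Preferred by AIC:", best_by_aic)
print("Preferred by BIC:", best_by_bic)
```

With these invented inputs the two criteria disagree: the extra parameter in model_B improves the fit enough to lower the AIC, but not enough to offset the heavier ln(n) penalty per parameter in the BIC, which therefore prefers the simpler model_A.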