Hanley J A, McNeil B J
J Chronic Dis. 1982;35(8):601-11. doi: 10.1016/0021-9681(82)90012-1.
Discriminant analysis and related statistical techniques are frequently used to sort patients into those most likely and those least likely to benefit from a given intervention. Considerable data analysis and computation are often required to arrive at the best-fitting mathematical model, which translates discriminating variables or indicants into probability predictions about the presence or absence of disease or the likelihood of a favourable outcome. Attempts to judge how well discriminant analysis performs, or to determine why it does not perform better, are hampered by not knowing the greatest degree of discrimination theoretically possible in a data set. In this paper we describe a method of calculating the maximum discrimination attainable in a data set and show how it can be used (1) to decide whether further model building is worthwhile and (2) if so, to judge the discriminatory performance of any such models. We apply this tool to two previously published studies of radiologic utilization; the results provide reassurance that, at least on the basis of the presenting indicants, the patients were being adequately selected for the studies in question.
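The abstract does not give the calculation itself. As a minimal sketch of the idea, and not necessarily the authors' procedure, assume that the indicants are categorical and that discrimination is measured by the area under the ROC curve (both are assumptions, not stated above). Patients with identical indicant profiles cannot be separated by any rule based on those indicants alone, so scoring each profile by its observed outcome proportion gives the largest in-sample ROC area that any such rule can reach. The function names and the toy data below are hypothetical.

```python
from collections import defaultdict


def auc(scores, outcomes):
    """ROC area: P(positive outscores negative) + 0.5 * P(tie), over all pairs."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative outcome")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def max_attainable_auc(profiles, outcomes):
    """Upper bound on in-sample ROC area for any score that depends only on the profile.

    Assumption: profiles are hashable (e.g. tuples of categorical indicants)
    and outcomes are coded 0/1.
    """
    counts = defaultdict(lambda: [0, 0])  # profile -> [n_patients, n_positive]
    for profile, y in zip(profiles, outcomes):
        counts[profile][0] += 1
        counts[profile][1] += y
    # Score every patient by the observed outcome proportion in their profile.
    best_scores = [counts[p][1] / counts[p][0] for p in profiles]
    return auc(best_scores, outcomes)


# Hypothetical toy data: six patients, three distinct indicant profiles.
toy_profiles = [("a",), ("a",), ("a",), ("b",), ("b",), ("c",)]
toy_outcomes = [1, 1, 0, 1, 0, 0]

ceiling = max_attainable_auc(toy_profiles, toy_outcomes)

# An arbitrary fitted model's scores (hypothetical) for the same patients.
model_scores = [1, 1, 1, 2, 2, 0]
achieved = auc(model_scores, toy_outcomes)
print(f"ceiling={ceiling:.3f} achieved={achieved:.3f}")
```

Comparing `achieved` with `ceiling` is the kind of check the abstract describes: a model whose ROC area is already close to the ceiling leaves little to gain from further model building on the same indicants. The ceiling is computed in-sample, so with many sparse profiles it will be optimistic.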