Benish W A
Department of Internal Medicine, Case Western Reserve University, Cleveland, OH 44106, USA.
Methods Inf Med. 2003;42(3):260-4.
This paper demonstrates that diagnostic test performance can be quantified as the average amount of information the test result (R) provides about the disease state (D).
A fundamental concept of information theory, mutual information, is directly applicable to this problem. This statistic quantifies the amount of information that one random variable contains about another random variable. Prior to performing a diagnostic test, R and D are random variables. Hence, their mutual information, I(D;R), is the amount of information that R provides about D.
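For reference, the standard information-theoretic definition of mutual information can be written in terms of the two ingredients a diagnostic test provides. The notation here is an assumption, not taken from the abstract: P(d) denotes the pretest probability of disease state d, and P(r | d) the conditional probability of test result r given d. Using base-2 logarithms gives the result in bits.

\[
I(D;R) \;=\; \sum_{d}\sum_{r} P(d)\,P(r \mid d)\,\log_2 \frac{P(r \mid d)}{\sum_{d'} P(d')\,P(r \mid d')}
\]

The denominator is the marginal probability of result r, so each term compares how likely r is under state d with how likely it is overall.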
I(D;R) is a function of both (1) the pretest probabilities of the disease states and (2) the set of conditional probabilities relating each possible test result to each possible disease state. The area under the receiver operating characteristic curve (AUC) is a popular measure of diagnostic test performance that, in contrast to I(D;R), is independent of the pretest probabilities; it is a function of only the set of conditional probabilities. The AUC is not a measure of diagnostic information.
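The dependence on pretest probabilities can be illustrated with a minimal sketch that computes mutual information from the standard definition. The function name `mutual_information` and the example test (sensitivity and specificity both 0.9) are hypothetical choices made for illustration and do not come from the paper.

```python
import math

def mutual_information(pretest, cond):
    """Mutual information I(D;R) in bits.

    pretest[d]  -- pretest probability of disease state d
    cond[d][r]  -- conditional probability of test result r given state d
    (Both names are illustrative assumptions, not the paper's notation.)
    """
    n_states = len(pretest)
    n_results = len(cond[0])
    # Marginal probability of each test result, P(r) = sum_d P(d) P(r|d).
    p_r = [sum(pretest[d] * cond[d][r] for d in range(n_states))
           for r in range(n_results)]
    bits = 0.0
    for d in range(n_states):
        for r in range(n_results):
            joint = pretest[d] * cond[d][r]
            if joint > 0:  # zero-probability cells contribute nothing
                bits += joint * math.log2(cond[d][r] / p_r[r])
    return bits

# Hypothetical binary test: rows are (disease present, disease absent),
# columns are (positive result, negative result).
test = [[0.9, 0.1],
        [0.1, 0.9]]

# Same conditional probabilities, two different pretest settings.
common_disease = mutual_information([0.5, 0.5], test)
rare_disease = mutual_information([0.01, 0.99], test)
print(f"pretest 0.50: {common_disease:.3f} bits")
print(f"pretest 0.01: {rare_disease:.3f} bits")
```

Because the conditional probabilities are identical in both calls, any measure that depends only on them would be unchanged between the two settings, while the computed information differs. The same function accepts more than two disease states or test results, since it only sums over the rows and columns it is given.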
Because I(D;R) depends on the pretest probabilities, knowledge of the setting in which a diagnostic test is employed is a necessary condition for quantifying the amount of information it provides. Advantages of I(D;R) over the AUC are that it can be calculated without invoking an arbitrary curve-fitting routine, it is applicable to situations in which multiple diagnoses are under consideration, and it quantifies test performance in meaningful units (bits of information).