Department of Electrical Engineering ESAT/SCD, Katholieke Universiteit Leuven, Leuven, Belgium.
Ultrasound Obstet Gynecol. 2011 Jan;37(1):100-6. doi: 10.1002/uog.8813.
The aim of this study was to establish when a second-stage diagnostic test may be of value in cases where a primary diagnostic test has given an uncertain diagnosis of the benign or malignant nature of an adnexal mass.
Diagnostic performance in discriminating between benign and malignant adnexal masses was expressed as the area under the receiver-operating characteristics curve (AUC), sensitivity and specificity, both for mathematical models that include ultrasound variables and for subjective evaluation of ultrasound findings by an experienced ultrasound examiner. These measures were calculated for the total study population of 1938 patients with an adnexal mass and for subpopulations defined by the certainty with which the diagnosis of benignity or malignancy was made. The effect of applying a second-stage test to the tumors for which risk estimation was uncertain was then determined.
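As an illustration of the three performance measures named above, the following is a minimal sketch and not the study's actual analysis code. The function names, the toy data and the cut-off of 0.5 are all assumptions made for the example; the abstract does not state the cut-off of any model.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    malignant: Sequence[bool], predicted: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary predictions."""
    pairs = list(zip(malignant, predicted))
    tp = sum(1 for m, p in pairs if m and p)          # malignant, called malignant
    fn = sum(1 for m, p in pairs if m and not p)      # malignant, called benign
    tn = sum(1 for m, p in pairs if not m and not p)  # benign, called benign
    fp = sum(1 for m, p in pairs if not m and p)      # benign, called malignant
    return tp / (tp + fn), tn / (tn + fp)


def auc(malignant: Sequence[bool], risk: Sequence[float]) -> float:
    """AUC as the probability that a malignant tumor receives a higher
    risk than a benign one, with ties counted as one half."""
    pos = [r for m, r in zip(malignant, risk) if m]
    neg = [r for m, r in zip(malignant, risk) if not m]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (not from the study).
toy_malignant = [True, True, True, False, False, False, False]
toy_risk = [0.9, 0.6, 0.2, 0.7, 0.3, 0.1, 0.05]
HYPOTHETICAL_CUTOFF = 0.5  # assumption; not the cut-off of LR1

toy_predicted = [r >= HYPOTHETICAL_CUTOFF for r in toy_risk]
toy_sens, toy_spec = sensitivity_specificity(toy_malignant, toy_predicted)
toy_auc = auc(toy_malignant, toy_risk)
```

Sensitivity and specificity depend on the chosen cut-off, whereas the AUC summarizes discrimination across all possible cut-offs, which is why the abstract reports all three.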
The best mathematical model (LR1) had an AUC of 0.95, sensitivity of 92% and specificity of 84% when applied to all tumors. When model LR1 was applied to the 10% of tumors in which the calculated risk fell closest to the risk cut-off of the model, the AUC was 0.59, sensitivity 90% and specificity 21%. A strategy in which subjective evaluation was used to classify these 10% of tumors, for which LR1 performed poorly, and LR1 was used in the other 90% of tumors resulted in a sensitivity of 91% and specificity of 90%. Applying subjective evaluation to all tumors yielded an AUC of 0.95, sensitivity of 90% and specificity of 93%. Among the tumors for which the ultrasound examiner was uncertain about the diagnosis (n = 115; 5.9%), subjective evaluation had a sensitivity of 81% and specificity of 47%, and no mathematical model performed better than subjective evaluation in this subgroup.
When model LR1 is used as a primary test for discriminating between benign and malignant adnexal masses, the use of subjective evaluation of ultrasound findings by an experienced examiner as a second-stage test in the 10% of cases for which the model yields a risk of malignancy closest to its risk cut-off will improve specificity without substantially decreasing sensitivity. However, none of the models tested proved suitable as a second-stage test in tumors where subjective evaluation yielded an uncertain result.
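The two-stage strategy described above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function name `two_stage_predictions`, the toy data and the cut-off of 0.5 are hypothetical, and "closest to the cut-off" is interpreted here as the smallest absolute difference between the calculated risk and the cut-off, which the abstract does not specify.

```python
from typing import List, Sequence


def two_stage_predictions(
    risk: Sequence[float],
    subjective_malignant: Sequence[bool],
    cutoff: float,
    uncertain_fraction: float = 0.10,
) -> List[bool]:
    """Classify each tumor with the model, except for the fraction of
    tumors whose risk lies closest to the cut-off, which are classified
    by the examiner's subjective evaluation instead."""
    n = len(risk)
    n_uncertain = round(uncertain_fraction * n)
    # Rank tumors by distance of the model risk from the cut-off
    # (assumption: absolute difference on the probability scale).
    by_distance = sorted(range(n), key=lambda i: abs(risk[i] - cutoff))
    uncertain = set(by_distance[:n_uncertain])
    return [
        subjective_malignant[i] if i in uncertain else risk[i] >= cutoff
        for i in range(n)
    ]


# Toy data, invented for illustration only (not from the study).
toy_model_risk = [0.05, 0.10, 0.15, 0.20, 0.30, 0.52, 0.70, 0.80, 0.90, 0.95]
toy_subjective = [False] * 6 + [True] * 4
HYPOTHETICAL_CUTOFF = 0.5  # assumption; not the cut-off of LR1

toy_result = two_stage_predictions(
    toy_model_risk, toy_subjective, HYPOTHETICAL_CUTOFF
)
```

In this toy example the tumor with a model risk of 0.52 is the one closest to the cut-off, so its classification is taken from the subjective evaluation rather than from the model.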