

Lung Cancer Detection Using Bayesian Networks: A Retrospective Development and Validation Study on a Danish Population of High-Risk Individuals.

Author information

Henriksen Margrethe Bang, Van Daalen Florian, Wee Leonard, Hansen Torben Frøstrup, Jensen Lars Henrik, Brasen Claus Lohman, Hilberg Ole, Bermejo Inigo

Affiliations

Department of Oncology, Vejle University Hospital, Vejle, Denmark.

Institute of Regional Health Research, University of Southern Denmark, Odense, Denmark.

Publication information

Cancer Med. 2025 Feb;14(3):e70458. doi: 10.1002/cam4.70458.

Abstract

BACKGROUND

Lung cancer (LC) is the top cause of cancer deaths globally, prompting many countries to adopt LC screening programs. While screening typically relies on age and smoking intensity, more efficient risk models exist. We devised a Bayesian network (BN) for LC detection, testing its resilience with varying degrees of missing data and comparing it to a prior machine learning (ML) model.

METHODS

We analyzed data from 9940 patients referred for LC assessment in Southern Denmark from 2009 to 2018. Variables included age, sex, smoking, and lab results. Our experiments varied the proportion of missing data (0%-30%), the BN structure (expert-based vs. data-driven), and the discretization method (standard vs. data-driven).
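To illustrate why a BN can still produce a prediction when some inputs are missing, here is a minimal sketch of inference by enumeration, in which any unobserved variable is summed out. The three-node structure (Smoker -> Cancer -> AbnormalLab), the variable names, and all probabilities are hypothetical and invented for illustration; they are not the network or the estimates from the study.

```python
from itertools import product

# Hypothetical toy network: Smoker -> Cancer -> AbnormalLab.
# Every probability below is invented for illustration only.
P_SMOKER = {True: 0.5, False: 0.5}                 # P(smoker)
P_CANCER_GIVEN_SMOKER = {True: 0.2, False: 0.05}   # P(cancer=True | smoker)
P_LAB_GIVEN_CANCER = {True: 0.7, False: 0.2}       # P(lab=True | cancer)


def joint(smoker, cancer, lab):
    """Joint probability of one full assignment, via the chain rule of the BN."""
    p_cancer = P_CANCER_GIVEN_SMOKER[smoker]
    p_lab = P_LAB_GIVEN_CANCER[cancer]
    return (
        P_SMOKER[smoker]
        * (p_cancer if cancer else 1 - p_cancer)
        * (p_lab if lab else 1 - p_lab)
    )


def posterior_cancer(smoker=None, lab=None):
    """P(cancer=True | evidence). A value of None means 'missing':
    that variable is marginalized (summed out) instead of imputed."""
    smoker_vals = [True, False] if smoker is None else [smoker]
    lab_vals = [True, False] if lab is None else [lab]
    numerator = sum(
        joint(s, True, l) for s, l in product(smoker_vals, lab_vals)
    )
    denominator = sum(
        joint(s, c, l)
        for s, c, l in product(smoker_vals, [True, False], lab_vals)
    )
    return numerator / denominator


if __name__ == "__main__":
    print(posterior_cancer(smoker=True, lab=True))  # all evidence observed
    print(posterior_cancer(smoker=True))            # lab result missing
    print(posterior_cancer())                       # nothing observed: prior
```

With the lab result missing, the query falls back to the risk implied by the remaining evidence rather than failing, which is the property the missing-data experiments probe. Enumeration is only practical for very small networks; larger ones need a dedicated inference algorithm.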

RESULTS

Across all missing data levels, area under the curve (AUC) remained steady, ranging from 0.737 to 0.757, compared to the ML model's AUC of 0.77. BN structure and discretization method had minimal impact on performance. BNs were well calibrated overall, with a net benefit in decision curve analysis when predicted risk exceeded 5%.
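For readers unfamiliar with decision curve analysis, the sketch below computes net benefit at a single threshold probability using the standard definition, TP/N - (FP/N) * pt / (1 - pt), and compares it with the "refer everyone" strategy. The function names and the ten-patient data set are hypothetical and serve only to show the arithmetic; they are not the study's data or code.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on patients whose predicted risk >= threshold.

    Standard decision-curve definition:
        TP/N - (FP/N) * threshold / (1 - threshold)
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that acts on every patient."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


if __name__ == "__main__":
    # Hypothetical outcomes (1 = cancer) and predicted risks for ten patients.
    y_true = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
    y_prob = [0.60, 0.10, 0.02, 0.30, 0.04, 0.01, 0.20, 0.03, 0.02, 0.01]
    pt = 0.05  # threshold probability of 5%
    print(round(net_benefit(y_true, y_prob, pt), 4))
    print(round(net_benefit_treat_all(y_true, pt), 4))
```

A model shows clinical utility at a given threshold when its net benefit exceeds both the "refer everyone" value and zero (the "refer no one" strategy); a full decision curve repeats this calculation over a range of thresholds.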

CONCLUSION

BN models showed resilience with up to 30% missing values. Moreover, these BNs exhibited similar performance, calibration, and clinical utility compared to the machine learning model developed using the same dataset. Considering their effectiveness in handling missing data, BNs emerge as a relevant method for the development of future lung cancer detection models.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b80a/11783238/b6223b39942f/CAM4-14-e70458-g004.jpg
