Stacked random forest model for colorectal cancer detection using complete blood counts.

Author information

Luo Junfeng, Tan Weiwei, Chen Shaobo, Chen Yijing, Fu Ya, Jing Xiaojuan, Kang Lingling, Li Qingyun, Ma Zhenjian, Sun Tingji, Xiao Peng, Xue Shigui, Wang Xiaozhi, Zhang Houde

Affiliations

Department of Gastroenterology, Nanshan Hospital, Guangdong Medical University, Shenzhen, China.

Department of Neurology, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, China.

Publication information

Digit Health. 2025 Jul 29;11:20552076251362072. doi: 10.1177/20552076251362072. eCollection 2025 Jan-Dec.

Abstract

BACKGROUND

In China, adherence to screening colonoscopy among eligible individuals remains suboptimal, primarily due to cost concerns and potential adverse effects. A machine learning model utilizing complete blood count (CBC) data could help prioritize colonoscopy referrals and improve screening participation.

METHOD

This multicenter study included participants who underwent CBC testing within three months before colonoscopy. CBC data were classified into three types (A, B, and C) based on hematology analyzer capabilities, with Type C excluded from analysis. Using Types A and B, we developed a stacking machine learning model incorporating 24 CBC features and 5 combined CBC components to predict colorectal cancer (CRC). Model performance was evaluated using the area under the curve (AUC), specificity, and sensitivity.
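The three evaluation metrics named above can be illustrated with a minimal sketch. It is not the authors' code: the labels, scores, and the 0.5 threshold are made-up toy values, and the function names are hypothetical. AUC is computed here as the fraction of (case, non-case) pairs in which the case receives the higher score, with ties counted as half.

```python
def sensitivity_specificity(y_true, y_score, threshold):
    """Return (sensitivity, specificity) when scores >= threshold are called positive."""
    tp = fn = tn = fp = 0
    for label, score in zip(y_true, y_score):
        predicted_positive = score >= threshold
        if label == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, y_score):
    """Area under the ROC curve via pairwise comparison of case and non-case scores."""
    pos = [s for label, s in zip(y_true, y_score) if label == 1]
    neg = [s for label, s in zip(y_true, y_score) if label == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (assumed for illustration only): 1 = CRC, 0 = cancer-free.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_score = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]

sens, spec = sensitivity_specificity(y_true, y_score, threshold=0.5)
area = auc(y_true, y_score)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={area:.3f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is presumably why the study reports both kinds of measure.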

RESULTS

The study included 1795 CRC cases and 26,380 cancer-free individuals with CBC data. On external validation, the model achieved 80.3% specificity and 65.2% sensitivity. Notably, it demonstrated 41% sensitivity for Stage I CRC and 57.6% sensitivity for Stages I-III combined.
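To give a feel for what these rates mean for referral decisions, the sketch below applies the reported external-validation sensitivity and specificity to a hypothetical group of 10,000 people. The group size is an assumption, and so is the case fraction: it is taken from the overall study counts (1795 cases among 28,175 participants), which describes a colonoscopy cohort rather than a general screening population. The function name is hypothetical and the numbers it produces are illustrative, not results from the paper.

```python
def expected_screen_outcomes(n_people, case_fraction, sensitivity, specificity):
    """Expected counts when a test with the given rates is applied to n_people."""
    cases = n_people * case_fraction
    non_cases = n_people - cases
    true_pos = cases * sensitivity            # cases correctly flagged
    false_pos = non_cases * (1 - specificity) # cancer-free people flagged
    flagged = true_pos + false_pos
    return {
        "flagged": flagged,
        "true_pos": true_pos,
        "false_pos": false_pos,
        "missed": cases - true_pos,
        "ppv": true_pos / flagged,  # share of flagged people who have CRC
    }


# Case fraction in the study data (assumption: used only as an illustrative input).
cohort_fraction = 1795 / (1795 + 26380)

result = expected_screen_outcomes(
    n_people=10_000,
    case_fraction=cohort_fraction,
    sensitivity=0.652,
    specificity=0.803,
)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Because positive predictive value falls as the case fraction falls, rerunning the sketch with a lower assumed fraction shows how the share of true cases among flagged individuals would shrink in a lower-risk population.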

CONCLUSIONS

CBC testing, combined with electronic medical record data, is a low-cost and widely accessible tool. Our robust CRC risk prediction model can serve as a preliminary screening method, aiding in colonoscopy referral decisions and improving CRC screening efficiency.
