Pre-operative T-stage discrimination in gallbladder cancer using machine learning and DeepSeek-R1.

Author information

Chae Joongwon, Wang Zhenyu, Wu Duanpo, Zhang Lian, Tuzikov Alexander, Madiyevich Magrupov Talat, Xu Min, Yu Dongmei, Qin Peiwu

Affiliations

Institute of Biopharmaceutical and Health Engineering, Shenzhen International Graduate School, Tsinghua University, Shenzhen, Guangdong, China.

School of Communication Engineering and the Artificial Intelligence Institute, Hangzhou Dianzi University, Hangzhou, Zhejiang, China.

Publication information

Front Oncol. 2025 Aug 1;15:1613462. doi: 10.3389/fonc.2025.1613462. eCollection 2025.

DOI:10.3389/fonc.2025.1613462

PMID:40823085

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12355213/

Abstract

BACKGROUND

Gallbladder cancer (GBC) frequently exhibits non-specific early symptoms, delaying diagnosis. This study (i) assessed whether routine blood biomarkers can distinguish early T stages via machine learning and (ii) compared the T-stage discrimination performance of a large language model (DeepSeek-R1) when supplied with (a) radiology-report text alone versus (b) radiology-report text plus blood-biomarker values.

METHODS

We retrospectively analyzed 232 patients with pathologically confirmed GBC treated at Lishui Central Hospital between 2023 and 2024 (T1, n = 51; T2, n = 181). Seven blood variables were used to train random forest, support vector machine (SVC), XGBoost, and LightGBM models: neutrophil-to-lymphocyte ratio (NLR), monocyte-to-lymphocyte ratio (MLR), platelet-to-lymphocyte ratio (PLR), carcinoembryonic antigen (CEA), carbohydrate antigen 19-9 (CA19-9), carbohydrate antigen 125 (CA125), and alpha-fetoprotein (AFP). The Synthetic Minority Over-sampling Technique (SMOTE) was applied only to the training folds in one setting and omitted in another. Model performance was evaluated on an independent test set (n = 47) by the area under the receiver-operating-characteristic curve (AUROC), with 95% confidence intervals (CI) from 1,000 bootstrap samples; cross-validation (CV) accuracy served as a supplementary metric. DeepSeek-R1 was prompted in a zero-shot, chain-of-thought manner to classify T1 versus T2 using (a) the radiology report alone or (b) the report plus the patient's biomarker profile.
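The AUROC with a bootstrap confidence interval mentioned above can be illustrated with a minimal, standard-library sketch. It is not the authors' code: the function names `auroc` and `bootstrap_auroc_ci`, the percentile method, the fixed seed, and the synthetic labels and scores are all assumptions made for illustration. The abstract states only that 1,000 bootstrap samples were used on a 47-patient test set.

```python
import random


def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auroc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI (assumed method): resample cases with replacement."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUROC is undefined when a resample holds a single class
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return auroc(labels, scores), lower, upper


# Synthetic stand-in for a 47-patient test set (1 = T2, 0 = T1); NOT study data.
# The 10/37 split and the uniform random scores are invented for the demo.
demo_rng = random.Random(42)
demo_labels = [0] * 10 + [1] * 37
demo_scores = [demo_rng.random() for _ in demo_labels]

point, ci_low, ci_high = bootstrap_auroc_ci(demo_labels, demo_scores)
print(f"AUROC = {point:.3f} (95% CI {ci_low:.3f} to {ci_high:.3f})")
```

With a test set this small, the interval is typically wide, which is one reason a point estimate near 0.5 is hard to distinguish from chance.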

RESULTS

Biomarker-based machine-learning models yielded uniformly poor T-stage discrimination. Without SMOTE, individual models such as XGBoost achieved an AUROC of 0.508 on the independent test set, while recall for the T1 class remained low (e.g., 14.3% for some models), indicating performance near random chance. Applying SMOTE to the training data produced statistically significant gains in cross-validation (CV) accuracy for several models (e.g., XGBoost CV Acc. 0.71 → 0.80, p = 0.005; LightGBM CV Acc. [] → [], p = 0.004). However, these improvements did not translate to better discrimination on the independent test set; for instance, XGBoost's AUROC decreased from 0.508 to 0.473 after SMOTE application. Overall, the biomarker models failed to provide clinically meaningful T-stage differentiation. DeepSeek-R1 analyzing radiology text alone reached 89.6% accuracy on the full 232-patient cohort and consistently flagged T2 cases on phrases such as "gallbladder wall thickening." Supplying biomarker values did not change accuracy (89.6%).
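Because T2 cases outnumber T1 cases 181 to 51, raw accuracy can look respectable even when T1 recall is poor. The short calculation below shows what a trivial "always predict T2" rule would score on the full cohort. This baseline is a hypothetical reference point and is not reported in the study; only the class counts come from the Methods.

```python
# Class counts taken from the Methods; the baseline rule itself is hypothetical.
n_t1, n_t2 = 51, 181
total = n_t1 + n_t2

# "Always predict T2": every T2 case is correct, every T1 case is missed.
accuracy = n_t2 / total
recall_t1 = 0 / n_t1
recall_t2 = n_t2 / n_t2
balanced_accuracy = (recall_t1 + recall_t2) / 2

print(f"accuracy = {accuracy:.3f}")                    # accuracy = 0.780
print(f"T1 recall = {recall_t1:.3f}")                  # T1 recall = 0.000
print(f"balanced accuracy = {balanced_accuracy:.3f}")  # balanced accuracy = 0.500
```

Set against this roughly 78% floor, a CV accuracy between 0.71 and 0.80 says little about T1 detection. That is consistent with the near-chance AUROC values above, and it puts the 89.6% accuracy of the language model in context.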

CONCLUSIONS

The evaluated blood biomarkers did not independently aid early T-stage discrimination, and SMOTE offered no meaningful performance gain. Conversely, a large language model driven by radiology-report text delivered high accuracy with an interpretable rationale, highlighting its potential to guide surgical strategy in GBC. Prospective multi-center studies with larger cohorts are warranted to confirm these findings.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/32e4/12355213/b23a42a5ee04/fonc-15-1613462-g001.jpg
