Qu Limeng, Zhu Jinfeng, Mei Xilong, Yi Zixi, Luo Na, Yuan Songlin, Liu Xuan, Liu Mingwen, Xie Haiqing, Hu Xiongqiang, Pan Liangrui, Liang Qingchun, Li Yanhui, Zou Qiongyan, Zhou Qin, Zhang Danhua, Zhou Meirong, Pei Lei, Qian Ke, Long Qian, Chen Qitong, Chen Xi, Plichta Jennifer K, Shang Qingyao, Ouyang Meishuo, Xu Jiachi, Yi Wenjun
Department of General Surgery, The Second Xiangya Hospital, Central South University, Changsha, Hunan, 410011, China.
Clinical Research Center For Breast Disease In Hunan Province, Changsha, Hunan, 410011, China.
EClinicalMedicine. 2025 Jun 24;85:103311. doi: 10.1016/j.eclinm.2025.103311. eCollection 2025 Jul.
Accurately evaluating axillary lymph nodes (ALNs) is essential for guiding both staging and treatment strategies in breast cancer (BC) patients. Currently, traditional pathological staging methods still rely on invasive biopsies or surgeries. This study aimed to construct, evaluate, and validate a semisupervised classifier utilizing radiomic and machine learning (ML) techniques to noninvasively identify axillary nodal disease.
Data from 4191 ALNs in 494 patients with invasive BC, treated at the Second Xiangya Hospital of Central South University between January 31, 2016, and July 31, 2024, were retrospectively analyzed. The data comprised a labeled cohort (214 patients, 1769 ALNs, divided into ultra-low and ultra-high risk groups) and an unlabeled cohort (280 patients, 2422 ALNs). Regions of interest (ROIs) were segmented, and CT radiomic features were extracted. Eleven supervised learning models were built on the labeled ALNs, and pseudolabels (low-risk or high-risk) were assigned to the unlabeled ALNs. Semisupervised multiclassifiers were then developed with seven ML algorithms on the basis of the predicted probabilities for all 4191 ALNs. For multicenter validation, additional data were collected from the First People's Hospital of Chenzhou City, the First People's Hospital of Changde City, and the First People's Hospital of Xiangtan City. The best-performing multiclassifier was evaluated in two independent multicenter cohorts: 212 clinically node-positive (cN+) patients who underwent core needle biopsy (CNB) or fine needle aspiration (FNA), and 450 clinically node-negative (cN0) patients. The study was registered at www.isrctn.com (registration number ISRCTN54288903).
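The pseudolabeling step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the 0.5 probability cutoff, the node identifiers, and the function names `pseudolabel` and `build_training_labels` are assumptions made for illustration, and the predicted probabilities are supplied by hand rather than produced by a trained model.

```python
from typing import Dict

# Group names follow the abstract; everything else here is hypothetical.
ULTRA_LOW, LOW, HIGH, ULTRA_HIGH = "ultra-low", "low", "high", "ultra-high"


def pseudolabel(prob_metastasis: float, threshold: float = 0.5) -> str:
    """Map a supervised model's predicted probability to a pseudolabel.

    The 0.5 cutoff is an assumption; the abstract does not state one.
    """
    if not 0.0 <= prob_metastasis <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return HIGH if prob_metastasis >= threshold else LOW


def build_training_labels(
    labeled: Dict[str, str],
    unlabeled_probs: Dict[str, float],
    threshold: float = 0.5,
) -> Dict[str, str]:
    """Merge confirmed labels with pseudolabels into one four-class label set."""
    combined = dict(labeled)  # copy so the caller's dict is not modified
    for node_id, prob in unlabeled_probs.items():
        if node_id in combined:
            raise ValueError(f"{node_id} appears in both cohorts")
        combined[node_id] = pseudolabel(prob, threshold)
    return combined


# Hypothetical nodes: two with confirmed labels, two with model probabilities.
labeled_nodes = {"aln_001": ULTRA_LOW, "aln_002": ULTRA_HIGH}
unlabeled_node_probs = {"aln_101": 0.12, "aln_102": 0.87}

labels = build_training_labels(labeled_nodes, unlabeled_node_probs)
print(labels)
```

The merged label set is what a four-class classifier would then be trained on; in the study that role is played by the semisupervised multiclassifiers.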
The supervised multilayer perceptron (MLP) model, built from labeled ALNs, exhibited excellent classification performance, with an area under the curve (AUC) of 0.959 (95% CI: 0.937-0.981), a sensitivity of 0.899, and a specificity of 0.932. Pseudolabels for the unlabeled ALNs were generated via this model, and the semisupervised MLP multiclassifier (Semi-ALNP) was constructed by combining the labeled and unlabeled data. The AUCs for predicting nodal metastases were 0.906 (95% CI: 0.894-0.917), 0.936 (95% CI: 0.928-0.945), 0.948 (95% CI: 0.940-0.956), and 0.955 (95% CI: 0.946-0.965) for the ultra-low risk, low-risk, high-risk, and ultra-high risk groups, respectively. Validation in both the biopsy and cN0 cohorts revealed strong diagnostic performance: in the biopsy cohort, the model achieved a false negative rate (FNR) of 1.21%, a false positive rate (FPR) of 14.89%, a sensitivity of 98.79%, and a specificity of 85.11%; in the cN0 cohort, the FNR was 8.33%, the FPR was 9.94%, the sensitivity was 91.67%, and the specificity was 90.06%.
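The four validation metrics reported above are linked in pairs: FNR is the complement of sensitivity and FPR is the complement of specificity, which is why 1.21% and 98.79%, or 14.89% and 85.11%, each sum to 100%. The short Python sketch below shows the relationship. The function name `diagnostic_rates` and the confusion-matrix counts are hypothetical and are not taken from the study.

```python
from typing import Dict


def diagnostic_rates(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Compute sensitivity, specificity, FNR, and FPR from confusion counts."""
    positives = tp + fn  # all truly metastatic cases
    negatives = tn + fp  # all truly non-metastatic cases
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "sensitivity": tp / positives,
        "specificity": tn / negatives,
        "fnr": fn / positives,  # equals 1 - sensitivity
        "fpr": fp / negatives,  # equals 1 - specificity
    }


# Hypothetical counts chosen only to illustrate the arithmetic.
rates = diagnostic_rates(tp=90, fn=10, tn=80, fp=20)
for name, value in rates.items():
    print(f"{name}: {value:.2%}")
```

Because of these identities, a low FNR and a high sensitivity are one finding stated two ways, not two independent findings.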
Semi-ALNP, which is based on the MLP algorithm, showed high accuracy in assessing ALN status across all types of BC patients. It is particularly effective at identifying patients at high risk of ALN metastasis, which can help guide personalized treatment decisions. Prospective studies are planned to further validate the clinical utility of this approach in real-world settings.
This study was funded by the Science and Technology Innovation Program of Hunan Province (Grant No. 2021SK2026) and the Innovation Platform and Talent Plan of Hunan Province (2023SK4019). Funding sources were not involved in the study design, data collection, analysis and interpretation, writing of the report, or decision to submit the article for publication.