Niu Limin, Zhou Weicheng, Li Xiao, Zhao Jinming, Li Lei, Song Xingguo
Department of Clinical Laboratory, Shandong Cancer Hospital and Institute, Shandong First Medical University, Shandong Academy of Medical Sciences, Jiyan Road 440#, Jinan, 250117, Shandong, PR China.
Core & Molecular Lab (CML), Roche Diagnostics (Shanghai) Limited, Shanghai, PR China.
Sci Rep. 2025 Aug 20;15(1):30586. doi: 10.1038/s41598-025-15431-9.
PIWI-interacting RNAs (piRNAs) have been implicated in the biological processes of various cancers. This study investigated the diagnostic potential of circulating piRNAs in breast cancer (BC) using machine learning (ML) frameworks. A serum tri-piRNA signature (piR-139966, piR-2572505, piR-2570061) was selected via piRNA sequencing, validated by qPCR, and then analyzed in combination with related clinical factors. Predictive ML models for the early diagnosis of BC, combining piRNA expression with CA153, were constructed using 10 ML algorithms and evaluated with 8 performance metrics. Serum levels of piR-139966, piR-2572505, and piR-2570061 were significantly upregulated in early-stage BC patients compared with matched healthy controls. Used alone or in combination, the tri-piRNA panel showed enhanced diagnostic precision for BC detection and complementary value to CA153 measurement. Through systematic ML optimization, we developed a stratified diagnostic model in which the XGBoost algorithm performed best in both the training and validation cohorts for early-stage BC identification. In summary, applying XGBoost to piRNA expression together with CA153 yielded a validated predictive ML model with superior diagnostic accuracy compared to conventional approaches.
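The abstract reports that the models were scored on 8 performance metrics but does not name them. As a rough illustration only, the sketch below computes seven metrics that are commonly reported for binary diagnostic classifiers (sensitivity, specificity, PPV, NPV, accuracy, F1, and AUC) from true labels and predicted probabilities. The choice of metrics, the function names `auc_score` and `diagnostic_metrics`, the 0.5 threshold, and the toy scores are all assumptions made for this example; none of them come from the study.

```python
from typing import Dict, Sequence


def auc_score(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def diagnostic_metrics(
    y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Threshold the scores and derive confusion-matrix metrics plus AUC."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1

    def ratio(num: int, den: int) -> float:
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
        "auc": auc_score(y_true, scores),
    }


# Toy example: 1 = BC, 0 = healthy control; scores are invented
# model probabilities, not data from the study.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
probs = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
metrics = diagnostic_metrics(labels, probs)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In a comparison of several algorithms, a function like this would typically be applied to each model's predictions on the same held-out cohort, so that the models are ranked on identical samples.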