cfMethylPre: deep transfer learning enhances cancer detection based on circulating cell-free DNA methylation profiling.

Author information

Zhang Xuchao, Chen Jing, Wang Yongtian, Wang Xiaofeng, Hu Jialu, Peng Jiajie, Shang Xuequn, Wang Yanpu, Wang Tao

Affiliations

School of Computer Science, Northwestern Polytechnical University, 1 Dongxiang Rd., Xi'an 710072, Shaanxi, China.

Key Laboratory of Big Data Storage and Management, Ministry of Industry and Information Technology, Northwestern Polytechnical University, 1 Dongxiang Rd., Xi'an 710072, Shaanxi, China.

Publication information

Brief Bioinform. 2025 May 1;26(3). doi: 10.1093/bib/bbaf303.

Abstract

Cancer remains a significant global health burden, underscoring the need for innovative diagnostic tools to enable early detection and improve patient outcomes. While circulating cell-free DNA (cfDNA) methylation has emerged as a promising biomarker for noninvasive cancer diagnostics, existing methods are often limited by the high dimensionality of methylation data, small sample sizes, and a lack of biological interpretability. To address these challenges, we propose cfMethylPre, a novel deep transfer learning framework tailored for cancer detection using cfDNA methylation data. cfMethylPre leverages embeddings from a large language model pretrained on DNA sequence information and integrates them with methylation profiles to enhance feature representation. The deep transfer learning process involves pretraining on bulk DNA methylation data encompassing 2801 samples across 82 cancer types and normal controls, followed by fine-tuning with cfDNA methylation data. This approach ensures robust adaptation to cfDNA's unique characteristics while improving predictive accuracy. Our model achieved superior predictive accuracy compared with state-of-the-art methods, with a weighted Matthews Correlation Coefficient of 0.926 and a weighted F1-score of 0.942. Through model interpretation and biological experimental validation, we identified three novel breast cancer genes (PCDHA10, PRICKLE2, and PRTG) and demonstrated their inhibitory effects on cell proliferation and migration in breast cancer cell lines. These findings establish cfMethylPre as a powerful and interpretable tool for cancer diagnostics and biological discovery, paving the way for its application in precision oncology.
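The abstract reports a weighted Matthews Correlation Coefficient and a weighted F1-score but does not state how the weighting is done. The following is a minimal sketch of one common reading, assuming each class is scored one-vs-rest and the per-class scores are averaged with weights proportional to class support. The function names and the sample labels are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import sqrt


def one_vs_rest_counts(y_true, y_pred, label):
    """Return (tp, fp, fn, tn) treating `label` as the positive class."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    tn = len(y_true) - tp - fp - fn
    return tp, fp, fn, tn


def f1(tp, fp, fn, tn):
    """Binary F1-score; defined as 0 when there are no positives at all."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, fp, fn, tn):
    """Binary Matthews correlation; defined as 0 when the denominator is 0."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def support_weighted(metric, y_true, y_pred):
    """Average a one-vs-rest metric over classes, weighted by class support.

    ASSUMPTION: this weighting scheme is illustrative; the paper's exact
    definition of "weighted" is not given in the abstract.
    """
    support = Counter(y_true)
    total = len(y_true)
    return sum(
        support[label] / total * metric(*one_vs_rest_counts(y_true, y_pred, label))
        for label in support
    )


# Hypothetical toy labels, for illustration only (not data from the study).
y_true = ["breast", "breast", "lung", "normal", "normal", "normal"]
y_pred = ["breast", "lung", "lung", "normal", "normal", "normal"]

weighted_f1 = support_weighted(f1, y_true, y_pred)
weighted_mcc = support_weighted(mcc, y_true, y_pred)
print(f"weighted F1:  {weighted_f1:.3f}")
print(f"weighted MCC: {weighted_mcc:.3f}")
```

Weighting by support keeps rare cancer types from dominating the summary score, but it also lets frequent classes mask poor performance on rare ones, so per-class values are worth inspecting alongside the weighted average.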

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0138/12206449/02ca1bcd03a5/bbaf303f1.jpg
