cfMethylPre: deep transfer learning enhances cancer detection based on circulating cell-free DNA methylation profiling.

Authors

Zhang Xuchao, Chen Jing, Wang Yongtian, Wang Xiaofeng, Hu Jialu, Peng Jiajie, Shang Xuequn, Wang Yanpu, Wang Tao

Affiliations

School of Computer Science, Northwestern Polytechnical University, 1 Dongxiang Rd., Xi'an 710072, Shaanxi, China.

Key Laboratory of Big Data Storage and Management, Ministry of Industry and Information Technology, Northwestern Polytechnical University, 1 Dongxiang Rd., Xi'an 710072, Shaanxi, China.

Publication information

Brief Bioinform. 2025 May 1;26(3). doi: 10.1093/bib/bbaf303.

DOI:10.1093/bib/bbaf303
PMID:40581983
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12206449/
Abstract

Cancer remains a significant global health burden, underscoring the need for innovative diagnostic tools that enable early detection and improve patient outcomes. Circulating cell-free DNA (cfDNA) methylation has emerged as a promising biomarker for noninvasive cancer diagnostics, but existing methods are often limited by the high dimensionality of methylation data, small sample sizes, and a lack of biological interpretability. To address these challenges, we propose cfMethylPre, a novel deep transfer learning framework tailored for cancer detection using cfDNA methylation data. cfMethylPre takes embeddings derived from DNA sequence information by a pretrained large language model and integrates them with methylation profiles to enhance feature representation. The deep transfer learning process involves pretraining on bulk DNA methylation data encompassing 2801 samples across 82 cancer types and normal controls, followed by fine-tuning on cfDNA methylation data. This approach is designed to adapt robustly to the unique characteristics of cfDNA while improving predictive accuracy. Our model achieved higher predictive accuracy than state-of-the-art methods, with a weighted Matthews Correlation Coefficient of 0.926 and a weighted F1-score of 0.942. Through model interpretation and biological experimental validation, we identified three novel breast cancer genes (PCDHA10, PRICKLE2, and PRTG) and demonstrated their inhibitory effects on cell proliferation and migration in breast cancer cell lines. These findings establish cfMethylPre as a powerful and interpretable tool for cancer diagnostics and biological discovery, paving the way for its application in precision oncology.
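
The abstract reports a weighted Matthews Correlation Coefficient and a weighted F1-score but does not spell out how the weighting is done. The sketch below shows one common reading, assumed here rather than taken from the paper: compute each metric per class in a one-vs-rest fashion, then average with weights proportional to the number of true samples in each class. The function names and the toy labels are hypothetical and exist only for illustration.

```python
from collections import Counter
from math import sqrt


def one_vs_rest_counts(y_true, y_pred, label):
    """Return (tp, fp, fn, tn), treating `label` as the positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == label and pred == label:
            tp += 1
        elif truth != label and pred == label:
            fp += 1
        elif truth == label and pred != label:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def f1_binary(tp, fp, fn, tn):
    """F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc_binary(tp, fp, fn, tn):
    """Binary MCC; defined as 0 when any marginal count is 0."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def support_weighted(y_true, y_pred, metric):
    """Average a one-vs-rest metric over classes, weighted by class support.

    Assumption: this is one plausible definition of a "weighted" score,
    not necessarily the one used by the cfMethylPre authors.
    """
    support = Counter(y_true)
    total = len(y_true)
    return sum(
        support[label] / total
        * metric(*one_vs_rest_counts(y_true, y_pred, label))
        for label in support
    )


# Toy labels, invented for illustration only.
y_true = ["breast", "breast", "lung", "lung", "normal", "normal"]
y_pred = ["breast", "breast", "lung", "normal", "normal", "normal"]

weighted_f1 = support_weighted(y_true, y_pred, f1_binary)
weighted_mcc = support_weighted(y_true, y_pred, mcc_binary)
print(f"weighted F1  = {weighted_f1:.3f}")
print(f"weighted MCC = {weighted_mcc:.3f}")
```

Support weighting keeps a large class from being drowned out by many small ones, which matters when a dataset spans dozens of cancer types with uneven sample counts. Other definitions exist: scikit-learn's `matthews_corrcoef`, for instance, uses a single multiclass generalization instead of a per-class average, so scores computed this way are not directly comparable to it.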


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0138/12206449/c8e66bcf1e07/bbaf303f8.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0138/12206449/02ca1bcd03a5/bbaf303f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0138/12206449/2a3fd39b5548/bbaf303f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0138/12206449/950ce202a337/bbaf303f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0138/12206449/58a2fee29a57/bbaf303f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0138/12206449/782edb90b1e5/bbaf303f5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0138/12206449/3263d4526ca2/bbaf303f6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0138/12206449/554b0c4bde1b/bbaf303f7.jpg

Similar articles

1
cfMethylPre: deep transfer learning enhances cancer detection based on circulating cell-free DNA methylation profiling.
Brief Bioinform. 2025 May 1;26(3). doi: 10.1093/bib/bbaf303.
2
Cell-free epigenomes enhanced fragmentomics-based model for early detection of lung cancer.
Clin Transl Med. 2025 Feb;15(2):e70225. doi: 10.1002/ctm2.70225.
3
A deep learning approach to direct immunofluorescence pattern recognition in autoimmune bullous diseases.
Br J Dermatol. 2024 Jul 16;191(2):261-266. doi: 10.1093/bjd/ljae142.
4
Diagnostic accuracy of cfDNA levels in gallbladder cancer: A study using qPCR & threshold evaluation.
Indian J Med Res. 2025 Feb;161(2):143-151. doi: 10.25259/IJMR_1651_2024.
5
Leveraging a foundation model zoo for cell similarity search in oncological microscopy across devices.
Front Oncol. 2025 Jun 18;15:1480384. doi: 10.3389/fonc.2025.1480384. eCollection 2025.
6
Stabilizing machine learning for reproducible and explainable results: A novel validation approach to subject-specific insights.
Comput Methods Programs Biomed. 2025 Jun 21;269:108899. doi: 10.1016/j.cmpb.2025.108899.
7
Discovery and validation of cell-free DNA methylation markers for specific diagnosis, differentiation from benign tumors, and prognosis of breast cancer.
Breast Cancer Res. 2025 Jun 16;27(1):108. doi: 10.1186/s13058-025-02066-x.
8
Multicenter Histology Image Integration and Multiscale Deep Learning for Machine Learning-Enabled Pediatric Sarcoma Classification.
medRxiv. 2025 Jun 11:2025.06.10.25328700. doi: 10.1101/2025.06.10.25328700.
9
Predicting cognitive decline: Deep-learning reveals subtle brain changes in pre-MCI stage.
J Prev Alzheimers Dis. 2025 May;12(5):100079. doi: 10.1016/j.tjpad.2025.100079. Epub 2025 Feb 6.
10
Enhancing Pulmonary Disease Prediction Using Large Language Models With Feature Summarization and Hybrid Retrieval-Augmented Generation: Multicenter Methodological Study Based on Radiology Report.
J Med Internet Res. 2025 Jun 11;27:e72638. doi: 10.2196/72638.

Cited by

1
From Detection to Prediction: Advances in m6A Methylation Analysis Through Machine Learning and Deep Learning with Implications in Cancer.
Int J Mol Sci. 2025 Jul 12;26(14):6701. doi: 10.3390/ijms26146701.
