Wang Yadong, Guo Qiang, Huang Zhicheng, Song Liyang, Zhao Fei, Gu Tiantian, Feng Zhe, Wang Haibo, Li Bowen, Wang Daoyun, Zhou Bin, Guo Chao, Xu Yuan, Song Yang, Zheng Zhibo, Bing Zhongxing, Li Haochen, Yu Xiaoqing, Fung Ka Luk, Xu Heqing, Shi Jianhong, Chen Meng, Hong Shuai, Jin Haoxuan, Tong Shiyuan, Zhu Sibo, Zhu Chen, Song Jinlei, Liu Jing, Li Shanqing, Li Hefei, Sun Xueguang, Liang Naixin
Department of Thoracic Surgery, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China.
Department of Thoracic Surgery, Affiliated Hospital of Hebei University, Baoding, China.
Clin Transl Med. 2025 Feb;15(2):e70225. doi: 10.1002/ctm2.70225.
Lung cancer is a leading cause of cancer mortality, highlighting the need for innovative non-invasive early detection methods. Although cell-free DNA (cfDNA) analysis shows promise, its sensitivity in early-stage lung cancer patients remains a challenge. This study aimed to integrate insights from epigenetic modifications and fragmentomic features of cfDNA using machine learning to develop a more accurate lung cancer detection model.
A multi-centre prospective cohort study was conducted, recruiting participants with lung nodules suspicious for malignancy and healthy volunteers from two clinical centres. Plasma cfDNA was profiled for epigenetic and fragmentomic features using chromatin immunoprecipitation sequencing, reduced representation bisulphite sequencing and low-pass whole-genome sequencing. Machine learning algorithms were then used to integrate the multi-omics data into a lung cancer detection model.
Cancer-related changes in cfDNA fragmentomics were significantly enriched in specific genes marked by cell-free epigenomes. A total of 609 genes were identified, and the corresponding cfDNA fragmentomic features were utilised to construct the ensemble model. This model achieved a sensitivity of 90.4% and a specificity of 83.1%, with an AUC of 0.94 in the independent validation set. Notably, the model demonstrated exceptional sensitivity for stage I lung cancer cases, achieving 95.1%. It also showed remarkable performance in detecting minimally invasive adenocarcinoma, with a sensitivity of 96.2%, highlighting its potential for early detection in clinical settings.
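For readers less familiar with the reported metrics, the following is a minimal sketch of how sensitivity, specificity and AUC can be computed from model scores. It is not the authors' code: the function names, the 0.5 threshold and the toy labels and scores are all assumptions made purely for illustration, and none of the numbers come from the study.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when scores >= threshold are called positive."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random case outscores a random control.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = cancer, 0 = healthy; scores are made-up model outputs.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

sens, spec = sensitivity_specificity(toy_labels, toy_scores, threshold=0.5)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc(toy_labels, toy_scores):.4f}")
```

Sensitivity and specificity depend on the chosen decision threshold, whereas AUC summarises ranking performance across all thresholds, which is why the three figures are reported together.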
With feature selection guided by multiple epigenetic sequencing approaches, the cfDNA fragmentomics-based machine learning model demonstrated outstanding performance in the independent validation cohort. These findings highlight its potential as an effective non-invasive strategy for the early detection of lung cancer.
Our study characterised the regulatory relationship between epigenetic modifications and cfDNA fragmentomic features. Identifying epigenetically regulated genes provided the foundation for developing the cfDNA fragmentomics-based machine learning model. The model performed strongly in clinical validation, supporting its potential for translational application in clinical practice.