A multiomics dataset of paired CT image and plasma cell-free DNA end motif for patients with pulmonary nodules.

Author information

Zhao Mengmeng, Xue Gang, He Bingxi, Deng Jiajun, Wang Tingting, Zhong Yifan, Li Shenghui, Wang Yang, He Yiming, Chen Tao, Zhang Jun, Yan Ziyue, Hu Xinlei, Guo Liuning, Qu Wendong, Song Yongxiang, Yang Minglei, Zhao Guofang, Yu Bentong, Ma Minjie, Liu Lunxu, Sun Xiwen, Zhao Deping, Xie Dan, Chen Chang, She Yunlang

Affiliations

Department of Thoracic Surgery, Shanghai Pulmonary Hospital, Tongji University School of Medicine, Shanghai, China.

Laboratory of Omics Technology and Bioinformatics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, Sichuan, China.

Publication information

Sci Data. 2025 Apr 1;12(1):545. doi: 10.1038/s41597-025-04912-1.

Abstract

Diagnosing lung cancer at a curable stage offers the opportunity for a favorable prognosis. Epigenomic analysis of plasma cell-free DNA (cfDNA), including 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC) modifications, has emerged as a promising approach to identifying lung cancer. Moreover, integrating 5mC biomarkers with chest computed tomography (CT) image features could optimize the diagnosis of lung cancer, exceeding the performance of models built on a single type of feature. However, the clinical applicability of integrated markers might be limited by the risk of overfitting due to small sample sizes. Hence, we prospectively collected peripheral blood samples and the paired chest CT images of 2032 patients with indeterminate pulmonary nodules across 5 centers, and constructed a large-scale, multi-institutional, multiomics database that encompasses CT imaging data and plasma cfDNA fragmentomics in 5mC- and 5hmC-enriched regions. To the best of our knowledge, this is the first and largest radio-epigenomic dataset; it provides multi-dimensional insights for the early diagnosis of lung cancer and facilitates individualized management of the disease.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9fba/11961589/2c53d80b78fd/41597_2025_4912_Fig1_HTML.jpg
