Dynamical Systems, Signal Processing and Data Analytics (STADIUS), Department of Electrical Engineering, KU Leuven, Leuven, 3001, Belgium.
Laboratory for Cytogenetics and Genome Research, Department of Human Genetics, KU Leuven, Leuven, 3000, Belgium.
Bioinformatics. 2024 Sep 2;40(9). doi: 10.1093/bioinformatics/btae522.
Circulating cell-free DNA (cfDNA) is widely explored as a noninvasive biomarker for cancer screening and diagnosis. The ability to decode the cells of origin in cfDNA would provide biological insights into pathophysiological mechanisms, aiding in cancer characterization and directing clinical management and follow-up.
We developed a DNA methylation signature-based deconvolution algorithm, MetDecode, for cancer tissue-of-origin identification. We built a reference atlas from de novo and published whole-genome methylation sequencing data for colorectal, breast, ovarian, and cervical cancer, and for blood-cell-derived entities. MetDecode models contributors that are absent from the atlas using methylation patterns learned on the fly from the input cfDNA methylation profiles. In addition, our model accounts for the coverage of each marker region to alleviate potential sources of noise. In silico experiments showed a limit of detection down to 2.88% of tumor tissue contribution in cfDNA. MetDecode produced Pearson correlation coefficients above 0.95 and outperformed other methods in simulations (P < 0.001; one-sided t-test). In plasma cfDNA profiles from cancer patients, MetDecode assigned the correct tissue of origin in 84.2% of cases. In conclusion, MetDecode can unravel alterations in the cfDNA pool components by accurately estimating the contribution of multiple tissues, even when supplied with an imperfect reference atlas.
MetDecode is available at https://github.com/JorisVermeeschLab/MetDecode.
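To make the idea of reference-based deconvolution concrete, the following is a minimal sketch and not MetDecode's actual implementation. It assumes a generic coverage-weighted non-negative least squares (NNLS) formulation: the cfDNA methylation ratio at each marker region is treated as a mixture of the atlas profiles, and each region is weighted by its read depth. The function name `deconvolve`, the tissue labels, and the toy data are hypothetical, and the modelling of contributors absent from the atlas is deliberately left out.

```python
import numpy as np
from scipy.optimize import nnls


def deconvolve(atlas, cfdna_beta, coverage):
    """Estimate tissue proportions by coverage-weighted NNLS.

    Illustrative only: a generic formulation, not the MetDecode model.

    atlas      : (n_markers, n_tissues) reference methylation ratios in [0, 1]
    cfdna_beta : (n_markers,) observed cfDNA methylation ratios in [0, 1]
    coverage   : (n_markers,) read depth of each marker region
    """
    atlas = np.asarray(atlas, dtype=float)
    cfdna_beta = np.asarray(cfdna_beta, dtype=float)
    # Scaling rows by sqrt(coverage) multiplies each squared residual by
    # the coverage, so deeply sequenced regions influence the fit more.
    w = np.sqrt(np.asarray(coverage, dtype=float))
    contributions, _ = nnls(atlas * w[:, None], cfdna_beta * w)
    total = contributions.sum()
    if total == 0:
        raise ValueError("all estimated contributions are zero")
    # Normalise so the estimated contributions sum to 1.
    return contributions / total


# Hypothetical toy data: 50 marker regions, 3 reference entities.
rng = np.random.default_rng(0)
tissues = ["blood", "colorectal", "breast"]
atlas = rng.uniform(0.0, 1.0, size=(50, len(tissues)))
true_props = np.array([0.90, 0.07, 0.03])
coverage = rng.integers(5, 100, size=50)
cfdna_beta = atlas @ true_props  # noise-free mixture of the atlas columns

estimated = deconvolve(atlas, cfdna_beta, coverage)
for name, proportion in zip(tissues, estimated):
    print(f"{name}: {proportion:.3f}")
```

On noise-free toy data like this the estimate should match the simulated proportions closely. With real cfDNA profiles, noise and tissues missing from the atlas would bias such a plain NNLS fit, which is the situation the abstract says MetDecode is designed to handle.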