Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA, USA.
Department of Neurology, University of California Los Angeles, Los Angeles, CA, USA.
Nat Commun. 2021 May 11;12(1):2717. doi: 10.1038/s41467-021-22901-x.
Circulating cell-free DNA (cfDNA) in the bloodstream originates from dying cells and is a promising noninvasive biomarker for cell death. Here, we propose an algorithm, CelFiE, to accurately estimate the relative abundances of cell types and tissues contributing to cfDNA from epigenetic cfDNA sequencing. In contrast to previous work, CelFiE accommodates low coverage data, does not require CpG site curation, and estimates contributions from multiple unknown cell types that are not available in external reference data. In simulations, CelFiE accurately estimates known and unknown cell type proportions from low coverage and noisy cfDNA mixtures, including from cell types composing less than 1% of the total mixture. When used in two clinically relevant situations, CelFiE correctly estimates a large placenta component in pregnant women and an elevated skeletal muscle component in amyotrophic lateral sclerosis (ALS) patients, consistent with the muscle wasting typical of these patients. Together, these results show that CelFiE could be a useful tool for biomarker discovery and for monitoring the progression of degenerative disease.
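The idea of estimating cell type proportions from methylation read counts can be illustrated with a minimal sketch. This is not the CelFiE implementation. It is a simplified expectation-maximization (EM) routine that assumes the reference methylation fractions are known exactly and that every contributing cell type is present in the reference, whereas the abstract describes CelFiE as also handling noisy data and unknown cell types. The function names `estimate_proportions` and `simulate`, the number of sites, the read depth, and the simulated proportions are all hypothetical choices made for illustration.

```python
import random


def estimate_proportions(meth, depth, ref, n_iter=300):
    """EM estimate of mixing proportions (illustrative, not CelFiE itself).

    meth[j]   : methylated read count at CpG site j in the cfDNA sample
    depth[j]  : total read count at site j
    ref[t][j] : assumed-known methylation fraction of cell type t at site j
    """
    n_types = len(ref)
    n_sites = len(depth)
    total_reads = sum(depth)
    eps = 1e-12  # guards against division by zero
    alpha = [1.0 / n_types] * n_types  # start from uniform proportions

    for _ in range(n_iter):
        expected = [0.0] * n_types  # expected reads assigned to each type
        for j in range(n_sites):
            # Probability that a read at site j is methylated / unmethylated
            # under the current mixture.
            p_m = sum(alpha[t] * ref[t][j] for t in range(n_types)) + eps
            p_u = sum(alpha[t] * (1.0 - ref[t][j]) for t in range(n_types)) + eps
            for t in range(n_types):
                # E-step: posterior that a read of each kind came from type t.
                r_m = alpha[t] * ref[t][j] / p_m
                r_u = alpha[t] * (1.0 - ref[t][j]) / p_u
                expected[t] += meth[j] * r_m + (depth[j] - meth[j]) * r_u
        # M-step: proportions are the expected share of reads per type.
        alpha = [e / total_reads for e in expected]
    return alpha


def simulate(true_alpha, ref, depth_per_site, rng):
    """Draw methylated read counts from a mixture of reference cell types."""
    meth, depth = [], []
    for j in range(len(ref[0])):
        p = sum(a * ref[t][j] for t, a in enumerate(true_alpha))
        meth.append(sum(1 for _ in range(depth_per_site) if rng.random() < p))
        depth.append(depth_per_site)
    return meth, depth


rng = random.Random(0)
n_sites = 500
# Hypothetical reference: each site is mostly unmethylated or mostly methylated.
ref = [[rng.choice([0.05, 0.95]) for _ in range(n_sites)] for _ in range(3)]
true_alpha = [0.70, 0.25, 0.05]
meth, depth = simulate(true_alpha, ref, 10, rng)
alpha_hat = estimate_proportions(meth, depth, ref)
print([round(a, 3) for a in alpha_hat])
```

In this toy setting the estimated proportions should land close to the simulated ones, with accuracy depending on the number of sites, the read depth, and how distinct the reference profiles are.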