Systematic evaluation and validation of reference and library selection methods for deconvolution of cord blood DNA methylation data.

Affiliations

Pharmacoepidemiology and Drug Safety Research Group, Department of Pharmacy, School of Pharmacy, University of Oslo, Oslo, Norway.

PharmaTox Strategic Research Initiative, Faculty of Mathematics and Natural Sciences, University of Oslo, Oslo, Norway.

Publication information

Clin Epigenetics. 2019 Aug 27;11(1):125. doi: 10.1186/s13148-019-0717-y.

Abstract

BACKGROUND

Umbilical cord blood (UCB) is commonly used in epigenome-wide association studies of prenatal exposures. Accounting for cell type composition is critical in such studies, as it reduces confounding due to the cell specificity of DNA methylation (DNAm). In the absence of cell sorting information, statistical methods can be applied to deconvolve heterogeneous cell mixtures. Among these methods, reference-based approaches leverage age-appropriate, cell-specific DNAm profiles to estimate cellular composition. In UCB, four reference datasets comprising DNAm signatures profiled in purified cell populations have been published using the Illumina 450K and EPIC arrays. These datasets are biologically and technically different, and there is currently no consensus on how best to apply them. Here, we systematically evaluate and compare these datasets and provide recommendations for reference-based UCB deconvolution.
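To make the idea of reference-based deconvolution concrete, here is a minimal sketch in Python. It is not the procedure used in the study: the cell type names, the beta values in `REFERENCE`, and the helper functions `mix` and `deconvolve` are all invented for illustration, and the solver is a plain projected gradient descent for non-negative least squares rather than the constrained projection used by established tools. The sketch models a bulk methylation profile as a weighted sum of cell-specific profiles and recovers the weights.

```python
# Hypothetical reference matrix: one row per CpG probe, one column per cell type.
# All beta values below are invented for illustration only.
CELL_TYPES = ["CD4T", "Bcell", "nRBC"]
REFERENCE = [
    [0.90, 0.10, 0.15],
    [0.10, 0.85, 0.20],
    [0.20, 0.15, 0.90],
    [0.80, 0.75, 0.10],
    [0.05, 0.60, 0.55],
]


def mix(reference, proportions):
    """Simulate a bulk profile as the proportion-weighted sum of cell-type profiles."""
    return [sum(r * p for r, p in zip(row, proportions)) for row in reference]


def deconvolve(reference, bulk, steps=5000, learning_rate=0.05):
    """Estimate cell proportions by minimising ||R w - b||^2 subject to w >= 0.

    Uses projected gradient descent (an assumption of this sketch), then
    rescales the weights so that they sum to 1.
    """
    n_probes = len(reference)
    n_types = len(reference[0])
    weights = [1.0 / n_types] * n_types
    for _ in range(steps):
        # Residual between the predicted and the observed bulk profile.
        residual = [
            sum(r * w for r, w in zip(row, weights)) - b
            for row, b in zip(reference, bulk)
        ]
        # Gradient of the squared error with respect to each weight.
        gradient = [
            2.0 * sum(reference[i][j] * residual[i] for i in range(n_probes))
            for j in range(n_types)
        ]
        # Gradient step followed by projection onto the non-negative orthant.
        weights = [max(0.0, w - learning_rate * g) for w, g in zip(weights, gradient)]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    bulk_profile = mix(REFERENCE, [0.5, 0.3, 0.2])
    for name, value in zip(CELL_TYPES, deconvolve(REFERENCE, bulk_profile)):
        print(name, round(value, 3))
```

With noise-free simulated input, the estimated proportions should closely match the ones used to build the mixture. Real data add measurement noise and technical differences between reference datasets, which is why the choice of reference and probe library matters.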

RESULTS

We first evaluated the four reference datasets to ascertain both the purity of the samples and the potential cell cross-contamination. We filtered samples and combined datasets to obtain a joint UCB reference. We selected deconvolution libraries using two different approaches: automatic selection of the top differentially methylated probes with the function pickCompProbes in minfi, and a standardized library selected using the IDOL (Identifying Optimal Libraries) iterative algorithm. We compared the performance of each reference separately and in combination, using the two approaches for reference library selection, and validated the results in an independent cohort (Generation R Study, n = 191) with matched cell counts measured by fluorescence-activated cell sorting (FACS). Strict filtering and combination of the references significantly improved the accuracy and efficiency of cell type estimates. Ultimately, the IDOL library outperformed the library from the automatic selection method implemented in pickCompProbes.
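The notion of a deconvolution library, a subset of probes that best discriminates between cell types, can be illustrated with a small Python sketch. It only loosely mirrors the idea of picking the top differentially methylated probes per cell type. It does not reproduce pickCompProbes or the IDOL algorithm, it uses a simple absolute mean difference instead of a statistical test, and the probe identifiers, beta values, and the function `select_library` are hypothetical.

```python
from statistics import mean

# Hypothetical purified-cell reference: probe -> cell type -> replicate beta values.
# Probe names and values are invented for illustration only.
REFERENCE_SAMPLES = {
    "cg_A": {"CD4T": [0.91, 0.89], "Bcell": [0.12, 0.10], "nRBC": [0.14, 0.16]},
    "cg_B": {"CD4T": [0.11, 0.09], "Bcell": [0.86, 0.84], "nRBC": [0.20, 0.22]},
    "cg_C": {"CD4T": [0.21, 0.19], "Bcell": [0.15, 0.17], "nRBC": [0.90, 0.88]},
    "cg_D": {"CD4T": [0.50, 0.52], "Bcell": [0.51, 0.49], "nRBC": [0.50, 0.51]},
    "cg_E": {"CD4T": [0.40, 0.42], "Bcell": [0.70, 0.72], "nRBC": [0.45, 0.43]},
}


def select_library(samples, probes_per_cell_type=1):
    """Pick, for each cell type, the probes that separate it most from the rest.

    A probe's score for a cell type is the absolute difference between the mean
    beta value in that cell type and the mean over all other cell types.
    """
    cell_types = sorted({ct for by_type in samples.values() for ct in by_type})
    library = []
    for target in cell_types:

        def score(probe):
            own = mean(samples[probe][target])
            rest = mean(
                value
                for cell_type, values in samples[probe].items()
                if cell_type != target
                for value in values
            )
            return abs(own - rest)

        ranked = sorted(samples, key=score, reverse=True)
        for probe in ranked[:probes_per_cell_type]:
            if probe not in library:  # keep each probe once
                library.append(probe)
    return library


if __name__ == "__main__":
    print(select_library(REFERENCE_SAMPLES))
```

In this toy data the uninformative probe `cg_D`, whose methylation is nearly identical across cell types, is never selected. An iterative approach such as IDOL instead tunes the library against samples of known composition, which this sketch does not attempt.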

CONCLUSION

These results have important implications for epigenetic studies in UCB: applying the strictly filtered, combined reference together with the IDOL-selected library reduces confounding due to cellular heterogeneity more effectively than the alternatives evaluated here. This work provides guidelines for future reference-based UCB deconvolution and establishes a framework for combining reference datasets in other tissues.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a447/6712867/631adbc421a5/13148_2019_717_Fig1_HTML.jpg
