Improving cell mixture deconvolution by identifying optimal DNA methylation libraries (IDOL).

Author information

Koestler Devin C, Jones Meaghan J, Usset Joseph, Christensen Brock C, Butler Rondi A, Kobor Michael S, Wiencke John K, Kelsey Karl T

Affiliations

Department of Biostatistics, University of Kansas Medical Center, 3901 Rainbow Blvd., Kansas City, 66160, KS, USA.

Centre for Molecular Medicine and Therapeutics, Child and Family Research Institute, Department of Medical Genetics, The University of British Columbia, 950 West 28th Ave., Vancouver, V5Z 4H4, BC, Canada.

Publication information

BMC Bioinformatics. 2016 Mar 8;17:120. doi: 10.1186/s12859-016-0943-7.

DOI: 10.1186/s12859-016-0943-7

PMID: 26956433

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4782368/

Abstract

BACKGROUND

Confounding due to cellular heterogeneity represents one of the foremost challenges currently facing Epigenome-Wide Association Studies (EWAS). Statistical methods that leverage the tissue-specificity of DNA methylation to deconvolute the cellular mixture of heterogeneous biospecimens offer a promising solution; however, the performance of such methods depends entirely on the library of methylation markers used for deconvolution. Here, we introduce a novel algorithm for Identifying Optimal Libraries (IDOL) that dynamically scans a candidate set of cell-specific methylation markers to find libraries that optimize the accuracy of cell fraction estimates obtained from cell mixture deconvolution.
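The idea of scoring candidate marker libraries by the accuracy of the resulting cell fraction estimates can be illustrated with a small sketch. It is not the published IDOL procedure: the abstract does not give the algorithm's details, so the sketch swaps in a brute-force search over all marker subsets of a fixed size, which is feasible only for toy inputs. The function names (`deconvolve`, `library_rmse`, `best_library`), the projected gradient solver, and the toy reference values are all assumptions made for illustration.

```python
from itertools import combinations

def deconvolve(mixture, reference, steps=1000):
    """Estimate non-negative cell fractions w so that reference @ w ~ mixture.

    mixture:   beta values of one sample, one per CpG.
    reference: one row per CpG; each row holds the mean beta value of that
               CpG in every cell type.
    Uses projected gradient descent on the squared error (an assumed solver,
    run for a fixed number of steps).
    """
    n, k = len(reference), len(reference[0])
    step = 1.0 / (n * k)  # safe step size because beta values lie in [0, 1]
    w = [1.0 / k] * k     # start from equal fractions
    for _ in range(steps):
        resid = [sum(r * x for r, x in zip(row, w)) - y
                 for row, y in zip(reference, mixture)]
        grad = [sum(row[j] * e for row, e in zip(reference, resid))
                for j in range(k)]
        # Gradient step followed by projection onto w >= 0.
        w = [max(0.0, wj - step * g) for wj, g in zip(w, grad)]
    return w

def library_rmse(cpgs, reference, mixtures, truth):
    """RMSE between estimated and measured fractions using only `cpgs`."""
    sub_ref = [reference[i] for i in cpgs]
    sq_err, count = 0.0, 0
    for sample, fractions in zip(mixtures, truth):
        est = deconvolve([sample[i] for i in cpgs], sub_ref)
        sq_err += sum((e - f) ** 2 for e, f in zip(est, fractions))
        count += len(fractions)
    return (sq_err / count) ** 0.5

def best_library(reference, mixtures, truth, size):
    """Brute-force stand-in for library optimization (toy sizes only)."""
    best_cpgs, best_err = None, float("inf")
    for cpgs in combinations(range(len(reference)), size):
        err = library_rmse(cpgs, reference, mixtures, truth)
        if err < best_err:
            best_cpgs, best_err = list(cpgs), err
    return best_cpgs, best_err

# Hypothetical toy data: 6 candidate CpGs, 2 cell types.
# The last two CpGs do not distinguish the cell types at all.
REFERENCE = [[0.9, 0.1], [0.1, 0.9], [0.8, 0.2],
             [0.2, 0.8], [0.5, 0.5], [0.5, 0.5]]
TRUTH = [[0.7, 0.3], [0.4, 0.6], [0.2, 0.8]]  # "measured" fractions
# Noiseless mixtures built directly from the reference and true fractions.
MIXTURES = [[sum(r * f for r, f in zip(row, fr)) for row in REFERENCE]
            for fr in TRUTH]

best_cpgs, best_err = best_library(REFERENCE, MIXTURES, TRUTH, size=2)
```

In this noiseless toy setting the library made of the two uninformative CpGs cannot recover the fractions, while a library of informative CpGs recovers them almost exactly. With a fixed number of solver steps, poorly conditioned libraries may also score worse simply because the solver has not fully converged.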

RESULTS

Application of IDOL to a training set consisting of samples with both whole-blood DNA methylation data (Illumina HumanMethylation450 BeadArray (HM450)) and flow cytometry measurements of cell composition revealed an optimized library comprising 300 CpG sites. When compared with existing libraries, the library identified by IDOL demonstrated significantly better overall discrimination of the entire immune cell landscape (p = 0.038) and improved discrimination of 14 of the 15 pairs of leukocyte subtypes. Estimates of cell composition across the samples in the training set using the IDOL library were highly correlated with their respective flow cytometry measurements, with all cell-specific R² > 0.99 and root mean square errors (RMSEs) ranging from 0.97% to 1.33% across leukocyte subtypes. Independent validation of the optimized IDOL library using two additional HM450 data sets showed similarly strong prediction performance, with all cell-specific R² > 0.90 and RMSE < 4.00%. In simulation studies, adjustments for cell composition using the IDOL library resulted in uniformly lower false positive rates compared to competing libraries, while also demonstrating an improved capacity to explain epigenome-wide variation in DNA methylation within two large publicly available HM450 data sets.
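For readers who want to reproduce this kind of comparison, the two accuracy measures reported above can be computed per cell type as in the following minimal sketch. It assumes that R² means the squared Pearson correlation between estimated and flow-cytometry percentages, which the abstract does not spell out, and the sample values are hypothetical, not taken from the study.

```python
def rmse(estimated, measured):
    """Root mean square error, in the same units as the inputs."""
    n = len(measured)
    return (sum((e - m) ** 2 for e, m in zip(estimated, measured)) / n) ** 0.5

def r_squared(estimated, measured):
    """Squared Pearson correlation (assumed definition of R^2)."""
    n = len(measured)
    mean_e = sum(estimated) / n
    mean_m = sum(measured) / n
    cov = sum((e - mean_e) * (m - mean_m)
              for e, m in zip(estimated, measured))
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    return cov * cov / (var_e * var_m)

# Hypothetical percentages of one cell type across four samples.
flow = [10.0, 20.0, 30.0, 40.0]       # flow cytometry measurements
estimated = [11.0, 19.0, 31.0, 39.0]  # deconvolution estimates

cell_rmse = rmse(estimated, flow)     # 1.0 percentage point
cell_r2 = r_squared(estimated, flow)  # about 0.993
```

Reporting both measures is useful because a high R² alone does not rule out a systematic offset between estimates and measurements, whereas RMSE captures it.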

CONCLUSIONS

Despite containing half as many CpGs as existing libraries for whole blood mixture deconvolution, the optimized IDOL library identified herein resulted in outstanding prediction performance across all considered data sets and demonstrated potential to improve the operating characteristics of EWAS involving adjustments for cell distribution. In addition to providing the EWAS community with an optimized library for whole blood mixture deconvolution, our work establishes a systematic and generalizable framework for the assembly of libraries that improve the accuracy of cell mixture deconvolution.




