Arockiaraj Annie I, Liu Dongjing, Shaffer John R, Koleck Theresa A, Crago Elizabeth A, Weeks Daniel E, Conley Yvette P
Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, United States.
Center for Craniofacial and Dental Genetics, Department of Oral Biology, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, United States.
Front Genet. 2020 Jun 26;11:671. doi: 10.3389/fgene.2020.00671. eCollection 2020.
One challenge in conducting a DNA methylation-based epigenome-wide association study (EWAS) is the appropriate cleaning and quality-checking of data to minimize biases and experimental artifacts while retaining potential biological signals. These issues are compounded in studies that include multiple tissue types and/or tissues for which no reference data are available to assist in adjusting for cell-type mixture, for example cerebrospinal fluid (CSF). For our study, which evaluated blood and CSF taken from aneurysmal subarachnoid hemorrhage (aSAH) patients, we developed a protocol to clean and quality-check genome-wide methylation levels and compared the methylomic profiles of the two tissues to determine whether blood is a suitable surrogate for CSF. CSF samples were collected longitudinally from 279 aSAH patients during the first 14 days of hospitalization, and a subset of 88 of these patients also provided blood samples within the first 2 days. Quality control (QC) procedures included identification and exclusion of poorly performing samples and low-quality probes, functional normalization, and correction for cell-type heterogeneity via surrogate variable analysis (SVA). A significant difference in the rate of poor sample performance was observed between blood (1.1% failing QC) and CSF (9.12% failing QC; P = 0.003). Functional normalization increased the concordance of methylation values among technical replicates in both CSF and blood. SVA improved the asymptotic behavior of the test of association in a simulated EWAS under the null hypothesis. To determine the suitability of blood as a surrogate for CSF, we calculated the correlations of adjusted methylation values at each CpG between blood and CSF, both globally and by genomic region. Overall, mean within-CpG correlation was low (< 0.26), suggesting that blood is not a suitable surrogate for global methylation in CSF. However, the magnitude of the correlation differed by genomic region (CpG island, shore, shelf, open sea; P < 0.001 for all) and by orientation with respect to nearby genes (3' UTR, transcription start site, exon, body, 5' UTR; P < 0.01 for all). In conclusion, the correlation analysis and QC pipelines indicated that DNA extracted from blood was not, overall, a suitable surrogate for DNA from CSF in aSAH methylomic studies.
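The within-CpG correlation analysis described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes Pearson correlation (the abstract does not name the coefficient), uses simulated values in place of adjusted methylation values, and the function names, CpG identifiers, sample size of 80, and region assignments are all hypothetical.

```python
import math
import random
from collections import defaultdict


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return math.nan  # correlation is undefined without variation
    return sxy / math.sqrt(sxx * syy)


def within_cpg_correlations(blood, csf):
    """Correlate blood and CSF values at each CpG across paired subjects.

    blood, csf: dict mapping CpG id -> list of values, one per subject,
    with subjects in the same order in both tissues (assumed layout).
    """
    return {cpg: pearson_r(blood[cpg], csf[cpg]) for cpg in blood if cpg in csf}


def mean_by_region(r_by_cpg, region_by_cpg):
    """Average the per-CpG correlations within each region, skipping NaN."""
    groups = defaultdict(list)
    for cpg, r in r_by_cpg.items():
        if not math.isnan(r):
            groups[region_by_cpg[cpg]].append(r)
    return {region: sum(vals) / len(vals) for region, vals in groups.items()}


# Simulated stand-in data: 200 hypothetical CpGs, 80 hypothetical paired subjects.
random.seed(0)
regions = ["island", "shore", "shelf", "open_sea"]
blood, csf, region_by_cpg = {}, {}, {}
for i in range(200):
    cpg = f"cpg{i}"
    b = [random.gauss(0, 1) for _ in range(80)]
    blood[cpg] = b
    # CSF shares only a weak component with blood, plus independent noise.
    csf[cpg] = [0.2 * v + random.gauss(0, 1) for v in b]
    region_by_cpg[cpg] = regions[i % 4]

r_by_cpg = within_cpg_correlations(blood, csf)
valid = [r for r in r_by_cpg.values() if not math.isnan(r)]
print("mean within-CpG r:", sum(valid) / len(valid))
print("mean r by region:", mean_by_region(r_by_cpg, region_by_cpg))
```

On real data, the inputs would be the QC-passed, normalized, and adjusted values for subjects with both tissues, and the region labels would come from the array annotation.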