Rahmani Elior, Shenhav Liat, Schweiger Regev, Yousefi Paul, Huen Karen, Eskenazi Brenda, Eng Celeste, Huntsman Scott, Hu Donglei, Galanter Joshua, Oh Sam S, Waldenberger Melanie, Strauch Konstantin, Grallert Harald, Meitinger Thomas, Gieger Christian, Holland Nina, Burchard Esteban G, Zaitlen Noah, Halperin Eran
Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel.
Department of Statistics, Tel Aviv University, Tel Aviv, Israel.
Epigenetics Chromatin. 2017 Jan 3;10:1. doi: 10.1186/s13072-016-0108-y. eCollection 2017.
Genetic data are known to harbor information about human demographics, and genotyping data are commonly used for capturing ancestry information by leveraging genome-wide differences between populations. In contrast, it is not clear to what extent population structure is captured by whole-genome DNA methylation data.
We demonstrate, using three large-cohort 450K methylation array data sets, that ancestry information signal is mirrored in genome-wide DNA methylation data and that it can be further isolated more effectively by leveraging the correlation structure of CpGs with cis-located SNPs. Based on these insights, we propose a method, EPISTRUCTURE, for the inference of ancestry from methylation data, without the need for genotype data.
EPISTRUCTURE can be used to infer ancestry information of individuals based on their methylation data in the absence of corresponding genetic data. Although genetic data are often collected in epigenetic studies of large cohorts, these are typically not made publicly available, making the application of EPISTRUCTURE especially useful for anyone working on public data. Implementation of EPISTRUCTURE is available in GLINT, our recently released toolset for DNA methylation analysis at: http://glint-epigenetics.readthedocs.io.
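The general idea of recovering population structure from a subset of CpGs can be illustrated with a minimal sketch. This is not the EPISTRUCTURE implementation shipped in GLINT, and the abstract does not specify the procedure. The sketch assumes that a set of ancestry-informative CpG indices is already known (the hypothetical `informative_idx`), uses plain principal component analysis as a stand-in technique, and runs on simulated beta values rather than real 450K data. All function and variable names are illustrative.

```python
import numpy as np


def ancestry_pcs(meth, informative_idx, n_components=2):
    """Return principal component scores computed on selected CpGs.

    meth: (samples x CpGs) array of methylation levels (beta values).
    informative_idx: indices of CpGs assumed to carry ancestry signal
                     (hypothetical input; how to choose them is not shown).
    """
    X = meth[:, informative_idx]
    X = X - X.mean(axis=0)  # center each CpG across samples
    # SVD of the centered matrix; U * S gives the PC scores per sample
    U, S, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :n_components] * S[:n_components]


def simulate_demo(seed=0):
    """Simulate two groups whose first 30 CpGs differ in mean methylation."""
    rng = np.random.default_rng(seed)
    n_per_group, n_cpgs, n_informative = 20, 200, 30
    labels = np.repeat([0, 1], n_per_group)
    meth = 0.5 + rng.normal(0.0, 0.05, size=(2 * n_per_group, n_cpgs))
    # Assumed toy effect: group 1 is shifted at the informative sites only
    meth[labels == 1, :n_informative] += 0.2
    meth = np.clip(meth, 0.0, 1.0)  # beta values lie in [0, 1]
    informative_idx = np.arange(n_informative)
    scores = ancestry_pcs(meth, informative_idx)
    return scores, labels


if __name__ == "__main__":
    scores, labels = simulate_demo()
    r = np.corrcoef(scores[:, 0], labels)[0, 1]
    print("abs. correlation of PC1 with group label:", round(abs(r), 3))
```

In this toy setting the first component tracks the simulated group label. With real data, the informative CpGs would have to come from a reference resource, which the sketch deliberately leaves out.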