MethylGenotyper: Accurate Estimation of SNP Genotypes and Genetic Relatedness from DNA Methylation Data.

Author information

Ministry of Education Key Laboratory of Environment and Health, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China.

Department of Epidemiology and Biostatistics, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China.

Publication information

Genomics Proteomics Bioinformatics. 2024 Sep 13;22(3). doi: 10.1093/gpbjnl/qzae044.

Abstract

Epigenome-wide association studies (EWAS) are susceptible to widespread confounding caused by population structure and genetic relatedness. Nevertheless, kinship estimation is challenging in EWAS without genotyping data. Here, we proposed MethylGenotyper, a method that for the first time enables accurate genotyping at thousands of single nucleotide polymorphisms (SNPs) directly from commercial DNA methylation microarrays. We modeled the intensities of methylation probes near SNPs with a mixture of three beta distributions corresponding to different genotypes and estimated parameters with an expectation-maximization algorithm. We conducted extensive simulations to demonstrate the performance of the method. When applying MethylGenotyper to the Infinium EPIC array data of 4662 Chinese samples, we obtained genotypes at 4319 SNPs with a concordance rate of 98.26%, enabling the identification of 255 pairs of close relatedness. Furthermore, we showed that MethylGenotyper allows for the estimation of both population structure and cryptic relatedness among 702 Australians of diverse ancestry. We also implemented MethylGenotyper in a publicly available R package (https://github.com/Yi-Jiang/MethylGenotyper) to facilitate future large-scale EWAS.
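The modeling step described in the abstract (a mixture of three beta distributions fitted with an expectation-maximization algorithm) can be illustrated with a minimal sketch. This is not the MethylGenotyper implementation. The function name `fit_beta_mixture`, the starting values, the simulated data, and the weighted method-of-moments M-step are all assumptions made for brevity; the package itself may estimate parameters differently. Component indices 0, 1, and 2 stand in for the three genotype clusters.

```python
import math
import random


def beta_logpdf(x, a, b):
    """Log density of Beta(a, b) at x, for 0 < x < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x)


def fit_beta_mixture(x, n_iter=100, eps=1e-6):
    """Fit a three-component beta mixture to values in (0, 1) with EM.

    Hypothetical helper: the M-step uses weighted method-of-moments
    updates, an approximation chosen for brevity.
    """
    x = [min(max(v, eps), 1 - eps) for v in x]  # keep logs finite
    # Assumed starting shapes, centred near 0.1, 0.5 and 0.9.
    params = [(2.0, 18.0), (10.0, 10.0), (18.0, 2.0)]
    weights = [1 / 3, 1 / 3, 1 / 3]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component per value.
        resp = []
        for v in x:
            logs = [math.log(w) + beta_logpdf(v, a, b)
                    for w, (a, b) in zip(weights, params)]
            top = max(logs)
            probs = [math.exp(l - top) for l in logs]
            total = sum(probs)
            resp.append([p / total for p in probs])
        # M-step: update mixing weights and shape parameters.
        for k in range(3):
            r_k = [r[k] for r in resp]
            n_k = sum(r_k)
            weights[k] = max(n_k / len(x), 1e-12)
            if n_k < 1e-8:
                continue  # empty component: keep previous shapes
            mean = sum(r * v for r, v in zip(r_k, x)) / n_k
            var = sum(r * (v - mean) ** 2 for r, v in zip(r_k, x)) / n_k
            common = mean * (1 - mean) / max(var, 1e-8) - 1
            if common > 0:
                params[k] = (mean * common, (1 - mean) * common)
    # Call each value as the component with the highest posterior.
    calls = [max(range(3), key=lambda k: r[k]) for r in resp]
    return weights, params, calls


# Simulated intensities for three clusters (illustrative values only).
random.seed(0)
simulated = ([random.betavariate(2, 40) for _ in range(300)]
             + [random.betavariate(30, 30) for _ in range(300)]
             + [random.betavariate(40, 2) for _ in range(300)])
est_weights, est_params, est_calls = fit_beta_mixture(simulated)
print("mixing weights:", [round(w, 3) for w in est_weights])
print("shape parameters:", [(round(a, 1), round(b, 1)) for a, b in est_params])
```

The E-step works in log space and subtracts the largest log term before exponentiating, which avoids underflow when a value lies far from a component. A production fit would normally add a convergence check on the log-likelihood rather than running a fixed number of iterations.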
