An unbiased kinship estimation method for genetic data analysis.

Affiliations

Department of Biostatistics, School of Public Health, Yale University, New Haven, USA.

Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, Hanover, USA.

Publication information

BMC Bioinformatics. 2022 Dec 6;23(1):525. doi: 10.1186/s12859-022-05082-2.

Abstract

Accurate estimation of relatedness is important for genetic data analyses such as heritability estimation and association mapping based on data from genome-wide association studies. Inaccurate relatedness estimates may lead to biased heritability estimates and spurious associations. Individual-level genotype data are often used to estimate kinship coefficients between individuals. The commonly used sample correlation-based genomic relationship matrix (scGRM) method estimates the kinship coefficient by averaging the sample correlation coefficient across all single nucleotide polymorphisms (SNPs), with the observed allele frequencies used to calculate both the expectations and the variances of genotypes. Although this method is widely used, a substantial proportion of the estimated kinship coefficients are negative, which makes them difficult to interpret. In this paper, we show through mathematical derivation that the kinship coefficient estimated by the scGRM method is indeed biased when the observed allele frequencies are treated as the true frequencies. This produces a negative bias in the average kinship estimate across all individuals, which explains the negative estimated kinship coefficients. Based on this observation, we propose an unbiased estimation method, UKin, which can reduce kinship estimation bias. We justify the improved method with a rigorous mathematical proof. We conducted simulations and two real data analyses to compare UKin with scGRM and three other kinship estimation methods: rGRM, tsGRM, and KING. Our results demonstrate that UKin can reduce both the bias and the root mean square error of kinship coefficient estimates. We further investigated the performance of UKin, KING, and the three GRM-based methods in calculating SNP-based heritability, and show that UKin can improve the accuracy of heritability estimation regardless of the size of the SNP panel.
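
The negative average described in the abstract can be seen in a small simulation. The sketch below is illustrative only: the abstract gives no formula, so it assumes the widely used correlation-style GRM, in which each genotype is centered by twice the observed allele frequency and scaled by the corresponding binomial variance. The function name `sc_grm`, the simulation settings, and the random seed are hypothetical choices, and the UKin correction itself is not implemented because the abstract does not specify it.

```python
import numpy as np


def sc_grm(genotypes):
    """Correlation-style GRM from an (individuals x SNPs) matrix of 0/1/2 counts.

    Assumed form (not taken from the abstract): standardize each SNP with its
    OBSERVED allele frequency, then average the cross-products over SNPs.
    """
    x = np.asarray(genotypes, dtype=float)
    p_hat = x.mean(axis=0) / 2.0              # observed allele frequency per SNP
    keep = (p_hat > 0.0) & (p_hat < 1.0)      # drop monomorphic SNPs (zero variance)
    x, p_hat = x[:, keep], p_hat[keep]
    z = (x - 2.0 * p_hat) / np.sqrt(2.0 * p_hat * (1.0 - p_hat))
    return z @ z.T / x.shape[1]


# Simulate unrelated individuals, so every true pairwise kinship is zero.
rng = np.random.default_rng(0)
n, m = 200, 5000                               # individuals, SNPs (arbitrary)
true_p = rng.uniform(0.1, 0.9, size=m)         # hypothetical true frequencies
geno = rng.binomial(2, true_p, size=(n, m))

grm = sc_grm(geno)
off_diag = grm[~np.eye(n, dtype=bool)]         # pairwise entries only
mean_off_diag = off_diag.mean()
frac_negative = (off_diag < 0).mean()

print("sum of all entries:", grm.sum())
print("mean pairwise estimate:", mean_off_diag)
print("fraction of negative pairwise estimates:", frac_negative)
```

Because each SNP is centered on the sample's own allele frequency, the standardized genotypes sum to zero across individuals, so every row of the matrix sums to zero. The diagonal is positive, which forces the pairwise entries to average below zero even though all simulated individuals are unrelated. Entries in this form are on the relatedness scale (twice the kinship coefficient under the usual convention); the sign argument is unaffected by that scaling.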

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/663c/9727941/2620136ac395/12859_2022_5082_Fig1_HTML.jpg