Inference of kinship using spatial distributions of SNPs for genome-wide association studies.

Author information

Lee Hyokyeong, Chen Liang

Affiliation

Department of Biological Sciences, Molecular and Computational Biology, University of Southern California, Los Angeles, California, 90089, USA.

Publication information

BMC Genomics. 2016 May 20;17:372. doi: 10.1186/s12864-016-2696-0.

Abstract

BACKGROUND

Genome-wide association studies (GWASs) are a powerful tool for identifying genetic loci that underlie the complex traits of common diseases. However, it is well known that failing to account properly for pedigree or population structure leads to spurious associations. GWASs have often encountered inflated type I error rates caused by the correlated genotypes of cryptically related individuals or subgroups. Accurate pedigree information is therefore crucial for a successful GWAS.

RESULTS

We propose KIND, a distance-based method for estimating kinship coefficients among individuals. The method uses the spatial distribution of SNPs in the genome, which captures how far each minor-allele variant lies from its neighboring minor-allele variants. Each individual's SNP distribution is represented as a feature vector in Euclidean space, and the kinship coefficient of each pair of individuals is inferred from their two vectors. We demonstrate that this distance information measures the similarity of individuals' genetic variants accurately and efficiently. We applied the method to a synthetic data set and two real data sets (HapMap phase III and the 1000 Genomes data). We evaluated the estimation accuracy of kinship coefficients both within homogeneous populations and for a population with extreme stratification.
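The idea of turning neighbor distances into a feature vector can be illustrated with a minimal Python sketch. This is not the KIND algorithm: the abstract gives neither the exact feature definition nor the mapping from a vector pair to a kinship coefficient, so everything here is an assumption made for illustration. The function names `neighbor_distance_features` and `euclidean_distance`, the choice of "distance in base pairs to the nearest other carried minor-allele SNP", the zero fill for SNPs where no minor allele is carried, and the toy positions and genotypes are all hypothetical. The sketch stops at a raw Euclidean distance between two individuals and does not attempt to convert it into a kinship coefficient.

```python
import numpy as np


def neighbor_distance_features(positions, minor_allele_counts):
    """Build an illustrative per-SNP feature vector for one individual.

    ASSUMPTION (not taken from the paper): for every SNP at which the
    individual carries at least one minor allele, the feature is the
    distance in base pairs to the nearest other SNP at which the same
    individual also carries a minor allele. All other entries are 0.
    `positions` must be sorted in ascending order along one chromosome.
    """
    positions = np.asarray(positions, dtype=float)
    carried = np.asarray(minor_allele_counts) > 0
    features = np.zeros(len(positions))

    idx = np.flatnonzero(carried)
    if len(idx) < 2:
        # With fewer than two carried variants there is no neighbor.
        return features

    gaps = np.diff(positions[idx])
    # Distance to the previous and to the next carried variant.
    left = np.concatenate(([np.inf], gaps))
    right = np.concatenate((gaps, [np.inf]))
    features[idx] = np.minimum(left, right)
    return features


def euclidean_distance(u, v):
    """Plain Euclidean distance between two feature vectors."""
    return float(np.linalg.norm(np.asarray(u) - np.asarray(v)))


# Toy data (hypothetical): five SNP positions, minor-allele counts 0/1/2.
positions = [100, 250, 400, 900, 1000]
individual_a = [1, 0, 2, 1, 0]
individual_b = [1, 0, 2, 1, 0]  # carries the same variants as individual_a
individual_c = [0, 1, 0, 0, 1]  # carries a disjoint set of variants

features_a = neighbor_distance_features(positions, individual_a)
features_b = neighbor_distance_features(positions, individual_b)
features_c = neighbor_distance_features(positions, individual_c)

print(features_a)
print(euclidean_distance(features_a, features_b))
print(euclidean_distance(features_a, features_c))
```

In this toy setup, two individuals who carry minor alleles at the same SNPs get identical vectors and a distance of zero, while individuals with different variant patterns end up farther apart. How such a distance would be calibrated against known kinship values is left open here, since the abstract does not describe it.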

CONCLUSIONS

KIND usually produces more accurate and more robust kinship coefficient estimates than existing methods, especially for populations with extreme stratification. It can serve as an important and very efficient tool for GWASs.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9e4e/4873983/72fc3465b566/12864_2016_2696_Fig1_HTML.jpg
