Eigenvalue significance testing for genetic association.

Authors

Zhou Yi-Hui, Marron J S, Wright Fred A

Affiliations

Bioinformatics Research Center and Department of Biological Sciences, North Carolina State University, North Carolina, U.S.A.

Department of Statistics and Operations Research, University of North Carolina, North Carolina, U.S.A.

Publication information

Biometrics. 2018 Jun;74(2):439-447. doi: 10.1111/biom.12767. Epub 2017 Aug 29.

DOI:10.1111/biom.12767
PMID:28853138
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6069632/
Abstract

Genotype eigenvectors are widely used as covariates for control of spurious stratification in genetic association. Significance testing for the accompanying eigenvalues has typically been based on a standard Tracy-Widom limiting distribution for the largest eigenvalue, derived under white-noise assumptions. It is known that even modest local correlation among markers inflates the largest eigenvalues, even in the absence of true stratification. In addition, a few sample eigenvalues may be extreme, creating further complications in accurate testing. We explore several methods to identify appropriate null eigenvalue thresholds, while remaining sensitive to eigenvalues corresponding to population stratification. We introduce a novel block permutation approach, designed to produce an appropriate null eigenvalue distribution by eliminating long-range genomic correlation while preserving local correlation. We also propose a fast approach based on eigenvalue distribution modeling, using a simple fit criterion and the general Marčenko-Pastur equation under a simple discrete eigenvalue model. Block permutation and the model-based approach work well for pure simulations and for data resampled from the 1000 Genomes project. In contrast, we find that the standard approach of computing an "effective" number of markers does not perform well. The performance of the methods is also demonstrated for a motivating example from the International Cystic Fibrosis Consortium.
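
The block permutation idea described in the abstract can be illustrated with a minimal sketch. This is one reading of the abstract, not the authors' implementation: it assumes markers are ordered along the genome, splits them into contiguous blocks of a fixed size, and permutes the sample labels independently within each block, so that correlation inside a block is kept while correlation between blocks is destroyed. The function names, the block size, the number of permutations, and the use of the largest eigenvalue of the standardized sample matrix as the test statistic are all illustrative assumptions.

```python
import numpy as np


def standardize(G):
    """Center and scale each marker (column); constant markers are dropped."""
    G = np.asarray(G, dtype=float)
    sd = G.std(axis=0)
    keep = sd > 0
    return (G[:, keep] - G[:, keep].mean(axis=0)) / sd[keep]


def top_eigenvalue(G):
    """Largest eigenvalue of Z Z^T / m, where Z is the standardized n x m matrix."""
    Z = standardize(G)
    m = Z.shape[1]
    # The eigenvalues of Z Z^T are the squared singular values of Z.
    s = np.linalg.svd(Z, compute_uv=False)
    return s[0] ** 2 / m


def block_permute(G, block_size, rng):
    """Shuffle sample rows independently within each contiguous marker block.

    All markers in a block share one permutation, so local (within-block)
    correlation is preserved; different blocks get different permutations,
    so long-range correlation, including stratification, is removed.
    """
    n, m = G.shape
    out = np.empty_like(G)
    for start in range(0, m, block_size):
        stop = min(start + block_size, m)
        perm = rng.permutation(n)
        out[:, start:stop] = G[perm, start:stop]
    return out


def block_permutation_null(G, block_size=50, n_perm=200, seed=0):
    """Null sample of the top eigenvalue under block permutation (assumed settings)."""
    rng = np.random.default_rng(seed)
    return np.array(
        [top_eigenvalue(block_permute(G, block_size, rng)) for _ in range(n_perm)]
    )
```

Under these assumptions, an observed `top_eigenvalue(G)` would be compared with an upper quantile of the null sample, for example `np.quantile(block_permutation_null(G), 0.95)`. The block size controls the trade-off: blocks that are too short break up local correlation, while blocks that are too long leave few independent units to permute.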

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4044/6069632/17a1b43859a7/nihms-981683-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4044/6069632/b5a959fea847/nihms-981683-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4044/6069632/2815cafdaca9/nihms-981683-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4044/6069632/086d329b8701/nihms-981683-f0003.jpg
