Detecting multiple associations in genome-wide studies.

Author information

Dudbridge Frank, Gusnanto Arief, Koeleman Bobby P C

Affiliation

MRC Biostatistics Unit, Cambridge, UK.

Publication information

Hum Genomics. 2006 Mar;2(5):310-7. doi: 10.1186/1479-7364-2-5-310.

Abstract

Recent developments in the statistical analysis of genome-wide studies are reviewed. Genome-wide analyses are becoming increasingly common in areas such as scans for disease-associated markers and gene expression profiling. The data generated by these studies present new problems for statistical analysis, owing to the large number of hypothesis tests, comparatively small sample size and modest number of true gene effects. In this review, strategies are described for optimising the genotyping cost by discarding unpromising genes at an earlier stage, saving resources for the genes that show a trend of association. In addition, there is a review of new methods of analysis that combine evidence across genes to increase sensitivity to multiple true associations in the presence of many non-associated genes. Some methods achieve this by including only the most significant results, whereas others model the overall distribution of results as a mixture of distributions from true and null effects. Because genes are correlated even when having no effect, permutation testing is often necessary to estimate the overall significance, but this can be very time consuming. Efficiency can be improved by fitting a parametric distribution to permutation replicates, which can be re-used in subsequent analyses. Methods are also available to generate random draws from the permutation distribution. The review also includes discussion of new error measures that give a more reasonable interpretation of genome-wide studies, together with improved sensitivity. The false discovery rate allows a controlled proportion of positive results to be false, while detecting more true positives; and the local false discovery rate and false-positive report probability give clarity on whether or not a statistically significant test represents a real discovery.
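
The two error measures named at the end of the abstract can be illustrated with a minimal sketch. The abstract does not say which procedures the review covers, so the code below is an assumption on that point: it uses the Benjamini-Hochberg step-up rule as one standard way to control the false discovery rate, and the commonly used expression for the false-positive report probability in terms of the significance level, the power and the prior probability of a true association. The function names and parameters are hypothetical and chosen for illustration only.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Mark which hypotheses are rejected by the Benjamini-Hochberg
    step-up rule at FDR level q (an assumed choice of FDR procedure)."""
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= k * q / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


def fprp(alpha, power, prior):
    """False-positive report probability: the chance that a result
    significant at level alpha is a false positive, given the test's
    power and the prior probability that the association is real."""
    false_pos = alpha * (1 - prior)
    true_pos = power * prior
    return false_pos / (false_pos + true_pos)


if __name__ == "__main__":
    # Illustrative p-values only; they are not taken from any study.
    p = [0.001, 0.008, 0.039, 0.041, 0.042,
         0.060, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(p, q=0.05))
    print(fprp(alpha=0.05, power=0.8, prior=0.01))
```

With these ten illustrative p-values, the step-up rule at q = 0.05 rejects the two smallest, whereas a Bonferroni threshold of 0.05/10 = 0.005 would reject only the smallest. This is the sense in which controlling the false discovery rate can detect more true positives. The second function shows why a nominally significant test may still be unlikely to be a real discovery when the prior probability of association is low.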
