Multiple testing in large-scale genetic studies.

Author information

Bouaziz Matthieu, Jeanmougin Marine, Guedj Mickaël

Affiliation

Department of Biostatistics, Pharnext, Paris, France.

Publication information

Methods Mol Biol. 2012;888:213-33. doi: 10.1007/978-1-61779-870-2_13.

Abstract

Recent advances in molecular biology and improvements in microarray and sequencing technologies have led biologists toward high-throughput genomic studies. These studies aim at finding associations between genetic markers and a phenotype and involve conducting many statistical tests on these markers. Such a wide investigation of the genome not only renders genomic studies quite attractive but also leads to a major shortcoming. That is, among the markers detected as associated with the phenotype, a nonnegligible proportion is not truly associated (false-positives), and true associations can also be missed (false-negatives). A main cause of these spurious associations is the multiple-testing problem, inherent to conducting numerous statistical tests. Several approaches exist to work around this issue. These multiple-testing adjustments aim at defining new statistical confidence measures that are controlled to guarantee that the outcomes of the tests are pertinent. The most natural correction was introduced by Bonferroni and aims at controlling the family-wise error-rate (FWER), that is, the probability of having at least one false-positive. Another approach is based on the false-discovery-rate (FDR) and considers the proportion of significant results that are expected to be false-positives. Finally, the local-FDR focuses on the actual probability for a marker of being associated or not with the phenotype. These strategies are widely used, but one has to be careful about when and how to apply them. We propose in this chapter a discussion of the multiple-testing issue and of the main approaches to take it into account. We aim at providing a theoretical and intuitive definition of these concepts along with practical advice to guide researchers in choosing the most appropriate multiple-testing procedure for the purposes of their studies.
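
To make the difference between FWER and FDR control concrete, the following minimal Python sketch applies both kinds of correction to the same list of p-values. Several points are assumptions rather than content of the abstract: the p-values are invented for illustration, the function names are hypothetical, and the FDR is controlled with the Benjamini-Hochberg step-up procedure, one common choice that the abstract does not name. The local-FDR is left out because it requires estimating the null and alternative distributions of the test statistics.

```python
def bonferroni(p_values, alpha=0.05):
    """Bonferroni correction: reject a test when p <= alpha / m.

    With m tests, this keeps the probability of at least one
    false-positive (the FWER) at or below alpha.
    """
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (an assumed choice of FDR method).

    Sort the p-values, find the largest rank k such that
    p_(k) <= k * alpha / m, and reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Invented p-values for ten hypothetical markers (illustration only).
example_p_values = [0.001, 0.008, 0.039, 0.041, 0.042,
                    0.060, 0.074, 0.205, 0.212, 0.216]

fwer_calls = bonferroni(example_p_values)
fdr_calls = benjamini_hochberg(example_p_values)
print("Bonferroni rejections:", sum(fwer_calls))
print("Benjamini-Hochberg rejections:", sum(fdr_calls))
```

On these invented values, the Bonferroni threshold is 0.05 / 10 = 0.005, so only the smallest p-value is declared significant, whereas the step-up FDR procedure also accepts the second smallest. This is the usual trade-off: FWER control is stricter and misses more true associations, while FDR control tolerates a controlled proportion of false-positives among the discoveries.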