A statistical approach to selecting and confirming validation targets in -omics experiments.

Author information

Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, 615 North Wolfe Street, Baltimore, MD 21205-2179, USA.

Publication information

BMC Bioinformatics. 2012 Jun 27;13:150. doi: 10.1186/1471-2105-13-150.

DOI:10.1186/1471-2105-13-150

PMID:22738145

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3568710/

Abstract

BACKGROUND

Genomic technologies are, by their very nature, designed for hypothesis generation. In some cases, the hypotheses that are generated require that genome scientists confirm findings about specific genes or proteins. But one major advantage of high-throughput technology is that global genetic, genomic, transcriptomic, and proteomic behaviors can be observed. Manual confirmation of every statistically significant genomic result is prohibitively expensive. This has led researchers in genomics to adopt the strategy of confirming only a handful of the most statistically significant results, a small subset chosen for biological interest, or a small random subset. But there is no standard approach for selecting and quantitatively evaluating validation targets.

RESULTS

Here we present a new statistical method and approach for statistically validating lists of significant results based on confirming only a small random sample. We apply our statistical method to show that the usual practice of confirming only the most statistically significant results does not statistically validate result lists. We analyze an extensively validated RNA-sequencing experiment to show that confirming a random subset can statistically validate entire lists of significant results. Finally, we analyze multiple publicly available microarray experiments to show that statistically validating random samples can both (i) provide evidence to confirm long gene lists and (ii) save thousands of dollars and hundreds of hours of labor over manual validation of each significant result.
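
The random-sample idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a uniform Beta(1, 1) prior on the proportion of true positives in the significant list, a choice the abstract does not state, and the function names are hypothetical. Because the validated results are drawn at random, the number confirmed is informative about the whole list, which is not the case when only the top-ranked results are checked.

```python
import random
from math import comb


def sample_for_validation(significant_ids, n, seed=0):
    """Draw n results uniformly at random (without replacement) from the list.

    Hypothetical helper: the seed only makes the draw reproducible.
    """
    rng = random.Random(seed)
    return rng.sample(list(significant_ids), n)


def prob_true_positive_rate_exceeds(confirmed, tested, threshold):
    """Posterior probability that the list's true-positive proportion p
    exceeds `threshold`, given `confirmed` successes out of `tested`
    randomly chosen results.

    Assumption (not from the source): uniform Beta(1, 1) prior, so the
    posterior is Beta(confirmed + 1, tested - confirmed + 1). For integer
    parameters the Beta tail equals a binomial sum:
        P(p > t) = P(Binomial(tested + 1, t) <= confirmed)
    """
    if not 0 <= confirmed <= tested:
        raise ValueError("confirmed must be between 0 and tested")
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be in [0, 1]")
    m = tested + 1
    return sum(
        comb(m, j) * threshold**j * (1.0 - threshold) ** (m - j)
        for j in range(confirmed + 1)
    )


if __name__ == "__main__":
    # Hypothetical list of 500 significant genes; validate 20 at random.
    genes = [f"gene_{i}" for i in range(500)]
    chosen = sample_for_validation(genes, 20)
    confirmed = 19  # hypothetical count of results confirmed in the lab
    # Probability that at least 80% of the full list are true positives.
    print(prob_true_positive_rate_exceeds(confirmed, len(chosen), 0.80))
```

A high posterior probability supports the claimed error rate for the entire list, at the cost of validating only the sampled results.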

CONCLUSIONS

For high-throughput -omics studies, statistical validation is a cost-effective and statistically valid approach to confirming lists of significant results.
