A statistical approach to selecting and confirming validation targets in -omics experiments.

Affiliation

Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, 615 North Wolfe Street, Baltimore, MD 21205-2179, USA.

Publication information

BMC Bioinformatics. 2012 Jun 27;13:150. doi: 10.1186/1471-2105-13-150.

DOI:10.1186/1471-2105-13-150
PMID:22738145
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3568710/
Abstract

BACKGROUND

Genomic technologies are, by their very nature, designed for hypothesis generation. In some cases, the hypotheses that are generated require that genome scientists confirm findings about specific genes or proteins. But one major advantage of high-throughput technology is that global genetic, genomic, transcriptomic, and proteomic behaviors can be observed. Manual confirmation of every statistically significant genomic result is prohibitively expensive. This has led researchers in genomics to adopt the strategy of confirming only a handful of the most statistically significant results, a small subset chosen for biological interest, or a small random subset. But there is no standard approach for selecting and quantitatively evaluating validation targets.

RESULTS

Here we present a new statistical method and approach for statistically validating lists of significant results based on confirming only a small random sample. We apply our statistical method to show that the usual practice of confirming only the most statistically significant results does not statistically validate result lists. We analyze an extensively validated RNA-sequencing experiment to show that confirming a random subset can statistically validate entire lists of significant results. Finally, we analyze multiple publicly available microarray experiments to show that statistically validating random samples can both (i) provide evidence to confirm long gene lists and (ii) save thousands of dollars and hundreds of hours of labor over manual validation of each significant result.

CONCLUSIONS

For high-throughput -omics studies, statistical validation is a cost-effective and statistically valid approach to confirming lists of significant results.
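
The idea of confirming a small random sample and using it to judge the whole list can be illustrated with a minimal sketch. This is not taken from the paper: it assumes a simple Beta-binomial model with a uniform prior, ignores finite-population effects, and uses hypothetical names (`draw_validation_sample`, `prob_fdr_at_most`, `gene_list`). It draws a random subset of significant results and then computes the posterior probability that the list's false-positive proportion is no larger than the claimed false discovery rate.

```python
import math
import random


def draw_validation_sample(significant_ids, n_validate, seed=0):
    """Pick a simple random subset of the significant results to confirm in the lab."""
    rng = random.Random(seed)
    return rng.sample(list(significant_ids), n_validate)


def prob_fdr_at_most(n_validated, n_failed, claimed_fdr):
    """Posterior probability that the list's false-positive proportion is <= claimed_fdr.

    Assumes a uniform Beta(1, 1) prior, so the posterior is
    Beta(n_failed + 1, n_validated - n_failed + 1). For integer parameters the
    Beta CDF equals a binomial tail sum, which keeps this stdlib-only.
    """
    m = n_validated + 1
    return sum(
        math.comb(m, j) * claimed_fdr**j * (1 - claimed_fdr) ** (m - j)
        for j in range(n_failed + 1, m + 1)
    )


if __name__ == "__main__":
    gene_list = [f"gene_{i}" for i in range(500)]  # hypothetical significant list
    targets = draw_validation_sample(gene_list, 20)
    # Suppose 1 of the 20 sampled targets fails independent confirmation.
    print(len(targets), prob_fdr_at_most(20, 1, claimed_fdr=0.10))
```

Because the sample is random, the confirmed fraction speaks for the whole list. Confirming only the top-ranked results would not support the same inference, since those results are not representative of the rest.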

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d8bb/3568710/ec7bde95c1b0/1471-2105-13-150-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d8bb/3568710/e0aab46b73b5/1471-2105-13-150-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d8bb/3568710/cb7dbb3c326a/1471-2105-13-150-3.jpg
