Gene expression analysis with the parametric bootstrap.

Authors

van der Laan M J, Bryan J

Affiliation

Division of Biostatistics, University of California, Earl Warren Hall 7360, Berkeley, CA 94720-7360, USA.

Publication details

Biostatistics. 2001 Dec;2(4):445-61. doi: 10.1093/biostatistics/2.4.445.

DOI: 10.1093/biostatistics/2.4.445
PMID: 12933635
Abstract

Recent developments in microarray technology make it possible to capture the gene expression profiles for thousands of genes at once. With this data researchers are tackling problems ranging from the identification of 'cancer genes' to the formidable task of adding functional annotations to our rapidly growing gene databases. Specific research questions suggest patterns of gene expression that are interesting and informative: for instance, genes with large variance or groups of genes that are highly correlated. Cluster analysis and related techniques are proving to be very useful. However, such exploratory methods alone do not provide the opportunity to engage in statistical inference. Given the high dimensionality (thousands of genes) and small sample sizes (often <30) encountered in these datasets, an honest assessment of sampling variability is crucial and can prevent the over-interpretation of spurious results. We describe a statistical framework that encompasses many of the analytical goals in gene expression analysis; our framework is completely compatible with many of the current approaches and, in fact, can increase their utility. We propose the use of a deterministic rule, applied to the parameters of the gene expression distribution, to select a target subset of genes that are of biological interest. In addition to subset membership, the target subset can include information about relationships between genes, such as clustering. This target subset presents an interesting parameter that we can estimate by applying the rule to the sample statistics of microarray data. The parametric bootstrap, based on a multivariate normal model, is used to estimate the distribution of these estimated subsets and relevant summary measures of this sampling distribution are proposed. We focus on rules that operate on the mean and covariance. 
Using Bernstein's Inequality, we obtain consistency of the subset estimates, under the assumption that the sample size converges faster to infinity than the logarithm of the number of genes. We also provide a conservative sample size formula guaranteeing that the sample mean and sample covariance matrix are uniformly within a distance epsilon > 0 of the population mean and covariance. The practical performance of the method using a cluster-based subset rule is illustrated with a simulation study. The method is illustrated with an analysis of a publicly available leukemia data set.
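
The procedure outlined in the abstract can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the rule `select_subset` (the k genes with the largest variance), the function names, and the choice of per-gene reselection frequency as the summary measure are all hypothetical stand-ins for the deterministic rule and summary measures the abstract mentions without specifying.

```python
import numpy as np

def select_subset(mean, cov, k):
    """Hypothetical deterministic rule: the k genes with the largest variance.

    The abstract only says the rule operates on the mean and covariance;
    this particular rule is an assumption and ignores the mean.
    """
    variances = np.diag(cov)
    return set(np.argsort(variances)[-k:].tolist())

def parametric_bootstrap_subsets(data, k, n_boot=200, seed=0):
    """Estimate a target subset and how often each gene is reselected.

    data: array of shape (n_samples, n_genes).
    Returns the estimated subset and, per gene, the fraction of bootstrap
    resamples in which the rule selected that gene.
    """
    rng = np.random.default_rng(seed)
    n, p = data.shape

    # Plug the sample statistics into the rule to estimate the subset.
    mean_hat = data.mean(axis=0)
    cov_hat = np.cov(data, rowvar=False)
    estimated = select_subset(mean_hat, cov_hat, k)

    counts = np.zeros(p)
    for _ in range(n_boot):
        # Parametric bootstrap: draw n samples from the fitted
        # multivariate normal, then reapply the same rule.
        resample = rng.multivariate_normal(mean_hat, cov_hat, size=n)
        boot_subset = select_subset(
            resample.mean(axis=0), np.cov(resample, rowvar=False), k
        )
        counts[list(boot_subset)] += 1

    return estimated, counts / n_boot
```

Calling `parametric_bootstrap_subsets(data, k=2)` on a samples-by-genes matrix returns the estimated subset together with a reselection frequency for every gene. Genes whose frequency is well below 1 are candidates for the spurious results the abstract warns about.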

Similar articles

1
Gene expression analysis with the parametric bootstrap.
Biostatistics. 2001 Dec;2(4):445-61. doi: 10.1093/biostatistics/2.4.445.
2
Cross-platform comparison and visualisation of gene expression data using co-inertia analysis.
BMC Bioinformatics. 2003 Nov 21;4:59. doi: 10.1186/1471-2105-4-59.
3
Clustering of change patterns using Fourier coefficients.
Bioinformatics. 2008 Jan 15;24(2):184-91. doi: 10.1093/bioinformatics/btm568. Epub 2007 Nov 19.
4
GeneTools--application for functional annotation and statistical hypothesis testing.
BMC Bioinformatics. 2006 Oct 24;7:470. doi: 10.1186/1471-2105-7-470.
5
Multivariate exploratory tools for microarray data analysis.
Biostatistics. 2003 Oct;4(4):555-67. doi: 10.1093/biostatistics/4.4.555.
6
Challenges in projecting clustering results across gene expression-profiling datasets.
J Natl Cancer Inst. 2007 Nov 21;99(22):1715-23. doi: 10.1093/jnci/djm216. Epub 2007 Nov 13.
7
A new efficient statistical test for detecting variability in the gene expression data.
Stat Methods Med Res. 2008 Aug;17(4):405-19. doi: 10.1177/0962280206078643. Epub 2007 Aug 14.
8
Semi-parametric differential expression analysis via partial mixture estimation.
Stat Appl Genet Mol Biol. 2008;7(1):Article15. doi: 10.2202/1544-6115.1333. Epub 2008 Apr 28.
9
Detecting clusters of different geometrical shapes in microarray gene expression data.
Bioinformatics. 2005 May 1;21(9):1927-34. doi: 10.1093/bioinformatics/bti251. Epub 2005 Jan 12.
10
Microarray gene cluster identification and annotation through cluster ensemble and EM-based informative textual summarization.
IEEE Trans Inf Technol Biomed. 2009 Sep;13(5):832-40. doi: 10.1109/TITB.2009.2023984. Epub 2009 Jun 12.

Cited by

1
BALL DIVERGENCE: NONPARAMETRIC TWO SAMPLE TEST.
Ann Stat. 2018 Jun;46(3):1109-1137. doi: 10.1214/17-AOS1579.
2
Inference for multimarker adaptive enrichment trials.
Stat Med. 2017 Nov 20;36(26):4083-4093. doi: 10.1002/sim.7422. Epub 2017 Aug 10.
3
DISCUSSION OF: TREELETS-AN ADAPTIVE MULTI-SCALE BASIS FOR SPARSE UNORDERED DATA.
Ann Appl Stat. 2008 Jun;2(2):489-493. doi: 10.1214/07-AOAS137.
4
Meta-analysis of differentiating mouse embryonic stem cell gene expression kinetics reveals early change of a small gene set.
PLoS Comput Biol. 2006 Nov 24;2(11):e158. doi: 10.1371/journal.pcbi.0020158.
5
Transcriptional co-regulation of secondary metabolism enzymes in Arabidopsis: functional and evolutionary implications.
Plant Mol Biol. 2005 May;58(2):229-45. doi: 10.1007/s11103-005-5346-5.