Power and sample size estimation in microarray studies.

Author information

Division of Personalized Nutrition and Medicine, National Center for Toxicological Research, FDA, Jefferson, AR 72079, USA.

Publication information

BMC Bioinformatics. 2010 Jan 25;11:48. doi: 10.1186/1471-2105-11-48.

DOI:10.1186/1471-2105-11-48

PMID:20100337

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2837028/

Abstract

BACKGROUND

Before conducting a microarray experiment, an important question is how many arrays are required to have adequate power to identify differentially expressed genes. This paper discusses some crucial issues in the problem formulation, the parameter specifications, and the approaches commonly proposed for sample size estimation in microarray experiments. Common methods formulate the problem as the minimum sample size needed to achieve a specified sensitivity (the proportion of truly differentially expressed genes that are detected) on average, at a specified false discovery rate (FDR) level and a specified expected proportion (pi1) of truly differentially expressed genes on the array. Unfortunately, the probability of actually achieving the specified sensitivity under such a formulation can be low. We formulate the sample size problem as the number of arrays needed to achieve a specified sensitivity with 95% probability at the specified significance level. We propose a permutation method that uses a small pilot dataset to estimate the sample size. This method accounts for correlation and effect size heterogeneity among genes.
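The two formulations contrasted above can be written side by side. The notation below is an assumption introduced here for illustration and is not taken from the paper: S_n is the realized sensitivity with n arrays, m_1 is the number of truly differentially expressed genes, and lambda is the target sensitivity.

```latex
\[
\begin{aligned}
S_n &= \frac{\#\{\text{truly differentially expressed genes detected with } n \text{ arrays}\}}{m_1}, \\
n_{\text{avg}} &= \min\{\, n : \mathbb{E}[S_n] \ge \lambda \,\} && \text{(common formulation: target met on average)}, \\
n_{95} &= \min\{\, n : \Pr(S_n \ge \lambda) \ge 0.95 \,\} && \text{(proposed formulation: target met with 95\% probability)}.
\end{aligned}
\]
```

Because the first criterion constrains only the mean of S_n, it says nothing about how often a single experiment falls below lambda; the second criterion constrains that probability directly.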

RESULTS

A sample size estimate based on the common formulation, which achieves the desired sensitivity on average, can be calculated using a univariate method without taking the correlation among genes into consideration. This formulation of the sample size problem is inadequate because the probability of achieving the specified sensitivity can be lower than 50%. In contrast, the sample size calculated by the proposed permutation method ensures that at least the desired sensitivity is achieved with 95% probability. The method is shown to perform well on a real example dataset, using a small pilot dataset with 4-6 samples per group.
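The gap between the two formulations can be illustrated with a small calculation. The sketch below is not the paper's permutation method: it assumes independent genes, a common standardized effect size, a fixed per-gene significance level, and a normal approximation to the two-sample test, which are exactly the simplifications the permutation method is meant to avoid. All function names and default values are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def per_gene_power(n, delta, alpha):
    """Approximate power of a two-sided two-sample z-test with n arrays per group.

    delta is the standardized effect size (mean difference / SD); assumed equal
    for every truly differentially expressed gene in this sketch.
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = delta * math.sqrt(n / 2)
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def prob_sensitivity_at_least(lam, m1, power):
    """P(detected / m1 >= lam) when detected ~ Binomial(m1, power) (independent genes)."""
    k_min = math.ceil(lam * m1 - 1e-9)  # small guard against float round-off
    return sum(
        math.comb(m1, k) * power**k * (1 - power) ** (m1 - k)
        for k in range(k_min, m1 + 1)
    )


def smallest_n(criterion, n_max=1000):
    """Smallest n in [2, n_max] for which criterion(n) is true."""
    for n in range(2, n_max + 1):
        if criterion(n):
            return n
    raise ValueError("no sample size up to n_max satisfies the criterion")


def sample_sizes(lam=0.8, m1=100, delta=1.0, alpha=0.001, prob=0.95):
    """Return (n_avg, n_prob): n meeting lam on average vs. with probability prob."""
    n_avg = smallest_n(lambda n: per_gene_power(n, delta, alpha) >= lam)
    n_prob = smallest_n(
        lambda n: prob_sensitivity_at_least(
            lam, m1, per_gene_power(n, delta, alpha)
        ) >= prob
    )
    return n_avg, n_prob


if __name__ == "__main__":
    n_avg, n_prob = sample_sizes()
    p_at_avg = prob_sensitivity_at_least(0.8, 100, per_gene_power(n_avg, 1.0, 0.001))
    print(f"n for average sensitivity: {n_avg}")
    print(f"P(sensitivity >= 0.8) at that n: {p_at_avg:.3f}")
    print(f"n for 95% probability: {n_prob}")
```

Even in this idealized independent case, the sample size that meets the target on average leaves a substantial chance of falling short in any one experiment. Correlation among genes, which this sketch ignores, would be expected to widen the spread of the realized sensitivity further.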

CONCLUSIONS

We recommend that the sample size problem be formulated as detecting a specified proportion of differentially expressed genes with 95% probability. This formulation ensures that the desired proportion of true positives is found with high probability. The proposed permutation method takes the correlation structure and effect size heterogeneity into consideration and works well using only a small pilot dataset.
