A general method for accurate estimation of false discovery rates in identification of differentially expressed genes.

Affiliations

College of Life Science, Hunan Normal University, Changsha, Hunan 410087, China and Department of Biostatistics and Epidemiology, Georgia Regents University, Augusta, GA 30912-4900, USA.

Publication information

Bioinformatics. 2014 Jul 15;30(14):2018-25. doi: 10.1093/bioinformatics/btu124. Epub 2014 Mar 14.

DOI: 10.1093/bioinformatics/btu124
PMID: 24632499
Abstract

The 'omic' data such as genomic data, transcriptomic data, proteomic data and single nucleotide polymorphism data have been rapidly growing. The omic data are large-scale and high-throughput data. Such data challenge traditional statistical methodologies and require multiple tests. Several multiple-testing procedures such as Bonferroni procedure, Benjamini-Hochberg (BH) procedure and Westfall-Young procedure have been developed, among which some control family-wise error rate and the others control false discovery rate (FDR). These procedures are valid in some cases and cannot be applied to all types of large-scale data. To address this statistically challenging problem in the analysis of the omic data, we propose a general method for generating a set of multiple-testing procedures. This method is based on the BH theorems. By choosing a C-value, one can realize a specific multiple-testing procedure. For example, by setting C = 1.22, our method produces the BH procedure. With C < 1.22, our method generates procedures of weakly controlling FDR, and with C > 1.22, the procedures strongly control FDR. Those with C = G (number of genes or tests) and C = 0 are, respectively, the Bonferroni procedure and the single-testing procedure. These are the two extreme procedures in this family. To let one choose an appropriate multiple-testing procedure in practice, we develop an algorithm by which FDR can be correctly and reliably estimated. Simulated results show that our method works well for an accurate estimation of FDR in various scenarios, and we illustrate the applications of our method with three real datasets.

AVAILABILITY AND IMPLEMENTATION

Our program is implemented in Matlab and is available upon request.
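
For orientation, the three reference points the abstract names (single testing, the BH procedure, and the Bonferroni procedure) can be sketched in a few lines of Python. This is only an illustration of the standard textbook procedures. It does not reproduce the paper's C-value parameterization or its FDR estimation algorithm, neither of which the abstract specifies. The function names and the example p-values are hypothetical.

```python
from typing import List


def single_testing(pvalues: List[float], alpha: float = 0.05) -> List[bool]:
    """No multiplicity correction: reject H_i whenever p_i <= alpha."""
    return [p <= alpha for p in pvalues]


def bonferroni(pvalues: List[float], alpha: float = 0.05) -> List[bool]:
    """Reject H_i when p_i <= alpha / G (controls the family-wise error rate)."""
    g = len(pvalues)
    return [p <= alpha / g for p in pvalues]


def benjamini_hochberg(pvalues: List[float], alpha: float = 0.05) -> List[bool]:
    """Standard BH step-up rule: find the largest rank k with
    p_(k) <= k * alpha / G and reject the k smallest p-values."""
    g = len(pvalues)
    order = sorted(range(g), key=lambda i: pvalues[i])  # indices by ascending p
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / g:
            cutoff = rank  # keep the largest rank that passes
    reject = [False] * g
    for idx in order[:cutoff]:
        reject[idx] = True
    return reject


# Hypothetical p-values for G = 10 tests (illustration only).
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

n_single = sum(single_testing(pvals))
n_bh = sum(benjamini_hochberg(pvals))
n_bonf = sum(bonferroni(pvals))
print(n_single, n_bh, n_bonf)  # 5 2 1
```

On these made-up p-values, single testing rejects the most hypotheses and Bonferroni the fewest, with BH in between. That ordering matches the abstract's description of Bonferroni and single testing as the two extremes of the family.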

Similar articles

1. A general method for accurate estimation of false discovery rates in identification of differentially expressed genes.
Bioinformatics. 2014 Jul 15;30(14):2018-25. doi: 10.1093/bioinformatics/btu124. Epub 2014 Mar 14.
2. Work efficiency: a new criterion for comprehensive comparison and evaluation of statistical methods in large-scale identification of differentially expressed genes.
Genomics. 2011 Nov;98(5):390-9. doi: 10.1016/j.ygeno.2011.05.006. Epub 2011 Jun 30.
3. Identifying differentially expressed genes using false discovery rate controlling procedures.
Bioinformatics. 2003 Feb 12;19(3):368-75. doi: 10.1093/bioinformatics/btf877.
4. An adaptive single-step FDR procedure with applications to DNA microarray analysis.
Biom J. 2007 Feb;49(1):127-35. doi: 10.1002/bimj.200610316.
5. A classification approach for DNA methylation profiling with bisulfite next-generation sequencing data.
Bioinformatics. 2014 Jan 15;30(2):172-9. doi: 10.1093/bioinformatics/btt674. Epub 2013 Nov 21.
6. Quick calculation for sample size while controlling false discovery rate with application to microarray analysis.
Bioinformatics. 2007 Mar 15;23(6):739-46. doi: 10.1093/bioinformatics/btl664. Epub 2007 Jan 19.
7. Estimation of false discovery proportion under general dependence.
Bioinformatics. 2006 Dec 15;22(24):3025-31. doi: 10.1093/bioinformatics/btl527. Epub 2006 Oct 17.
8. Re-sampling strategy to improve the estimation of number of null hypotheses in FDR control under strong correlation structures.
BMC Bioinformatics. 2007 May 18;8:157. doi: 10.1186/1471-2105-8-157.
9. An investigation on performance of Significance Analysis of Microarray (SAM) for the comparisons of several treatments with one control in the presence of small-variance genes.
Biom J. 2008 Oct;50(5):801-23. doi: 10.1002/bimj.200710467.
10. FDR control by the BH procedure for two-sided correlated tests with implications to gene expression data analysis.
Biom J. 2007 Feb;49(1):107-26. doi: 10.1002/bimj.200510313.

Cited by

1. Genomic Signatures of Environmental Adaptation in (Fagaceae).
Plants (Basel). 2025 Apr 5;14(7):1128. doi: 10.3390/plants14071128.
2. The Significant Effects of Threshold Selection for Advancing Nitrogen Use Efficiency in Whole Genome of Bread Wheat.
Plant Direct. 2025 Jan 21;9(1):e70036. doi: 10.1002/pld3.70036. eCollection 2025 Jan.
3. High intraperitoneal interleukin-6 levels predict ultrafiltration (UF) insufficiency in peritoneal dialysis patients: A prospective cohort study.
Front Med (Lausanne). 2022 Aug 10;9:836861. doi: 10.3389/fmed.2022.836861. eCollection 2022.
4. Null-free False Discovery Rate Control Using Decoy Permutations.
Acta Math Appl Sin. 2022;38(2):235-253. doi: 10.1007/s10255-022-1077-5. Epub 2022 Apr 9.
5. COL1A1 Is a Potential Prognostic Biomarker and Correlated with Immune Infiltration in Mesothelioma.
Biomed Res Int. 2021 Jan 4;2021:5320941. doi: 10.1155/2021/5320941. eCollection 2021.
6. Claudin-6 is a single prognostic marker and functions as a tumor-promoting gene in a subgroup of intestinal type gastric cancer.
Gastric Cancer. 2020 May;23(3):403-417. doi: 10.1007/s10120-019-01014-x. Epub 2019 Oct 25.
7. lncRNA-ATB functions as a competing endogenous RNA to promote YAP1 by sponging miR-590-5p in malignant melanoma.
Int J Oncol. 2018 Sep;53(3):1094-1104. doi: 10.3892/ijo.2018.4454. Epub 2018 Jun 25.
8. Retinal metabolic events in preconditioning light stress as revealed by wide-spectrum targeted metabolomics.
Metabolomics. 2017;13(3):22. doi: 10.1007/s11306-016-1156-9. Epub 2017 Jan 20.
9. Expression analysis of apolipoprotein E and its associated genes in gastric cancer.
Oncol Lett. 2015 Sep;10(3):1309-1314. doi: 10.3892/ol.2015.3447. Epub 2015 Jul 1.
10. Mantle Branch-Specific RNA Sequences of Moon Scallop Amusium pleuronectes to Identify Shell Color-Associated Genes.
PLoS One. 2015 Oct 23;10(10):e0141390. doi: 10.1371/journal.pone.0141390. eCollection 2015.