Not proper ROC curves as new tool for the analysis of differentially expressed genes in microarray experiments.

Author information

Parodi Stefano, Pistoia Vito, Muselli Marco

Affiliation

Epidemiology and Biostatistics Section, Scientific Directorate, G. Gaslini Children's Hospital, Genoa, Italy.

Publication information

BMC Bioinformatics. 2008 Oct 3;9:410. doi: 10.1186/1471-2105-9-410.

DOI:10.1186/1471-2105-9-410

PMID:18834513

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2576270/

Abstract

BACKGROUND

Most microarray experiments aim to identify genes whose expression varies in relation to specific conditions or in response to environmental stimuli. In such studies, genes showing similar mean expression values in two or more groups are considered not differentially expressed, even though hidden subclasses with different expression values may exist. In this paper we propose a new method for identifying differentially expressed genes, based on the area between the ROC curve and the rising diagonal (ABCR). ABCR is a more general approach than the standard area under the ROC curve (AUC), because it can identify both proper (i.e., concave) and not proper ROC curves (NPRC). In particular, NPRC may correspond to genes that tend to escape standard selection methods.
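To illustrate why a gene with a hidden subclass can escape AUC-based selection, the following is a minimal sketch, not the authors' implementation. It assumes that ABCR is the integral of |ROC(t) - t| over the false positive rate t, that the empirical ROC curve is piecewise linear, and that higher expression points toward the "case" group. The function names `roc_points` and `auc_and_abcr` and the toy expression values are hypothetical.

```python
def roc_points(controls, cases):
    """Empirical ROC points (FPR, TPR), sweeping the threshold from high to low."""
    thresholds = sorted(set(controls) | set(cases), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        fpr = sum(x >= t for x in controls) / len(controls)
        tpr = sum(x >= t for x in cases) / len(cases)
        points.append((fpr, tpr))
    return points


def auc_and_abcr(controls, cases):
    """Return (AUC, ABCR) for a piecewise-linear empirical ROC curve.

    ABCR is taken here as the integral of |ROC(t) - t| (an assumed reading
    of "area between the ROC curve and the rising diagonal").
    """
    points = roc_points(controls, cases)
    auc = 0.0
    abcr = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        width = x1 - x0
        auc += width * (y0 + y1) / 2.0           # trapezoid under the curve
        d0, d1 = y0 - x0, y1 - x1                # signed distance to the diagonal
        if d0 * d1 >= 0:
            # Segment stays on one side of the diagonal: plain trapezoid.
            abcr += width * (abs(d0) + abs(d1)) / 2.0
        else:
            # Segment crosses the diagonal: sum of the two triangles.
            abcr += width * (d0 * d0 + d1 * d1) / (2.0 * (abs(d0) + abs(d1)))
    return auc, abcr


# Toy gene: controls split into a low and a high subclass, cases in between.
controls = [1, 2, 3, 10, 11, 12]
cases = [5, 6, 7, 8]
auc, abcr = auc_and_abcr(controls, cases)
print(round(auc, 3), round(abcr, 3))  # 0.5 0.25
```

In this toy example the ROC curve crosses the diagonal, so the AUC equals the chance value of 0.5 while the ABCR is clearly above zero. For a curve that stays entirely on one side of the diagonal, the two measures carry the same information, since ABCR then equals |AUC - 0.5|.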

RESULTS

We assessed the performance of our method using a publicly available dataset of 4026 genes measured in 14 normal B cell samples (NBC) and 20 heterogeneous lymphomas (9 follicular lymphomas and 11 chronic lymphocytic leukemias). The NBC group itself comprised two subclasses: 6 heavily stimulated samples and 8 slightly stimulated or unstimulated samples. We identified 1607 differentially expressed genes with an estimated False Discovery Rate of 15%. Among them, 16 corresponded to NPRC, and all 16 escaped standard selection procedures based on AUC and t statistics. Moreover, simple inspection of the shape of these curves made it possible to identify the two subclasses within one or the other class in 13 of the 16 cases (81%).

CONCLUSION

NPRC are a useful new tool for the analysis of microarray data.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3cfb/2576270/f5447a8f9cbf/1471-2105-9-410-1.jpg

Similar articles

Arrow plot: a new graphical tool for selecting up and down regulated genes and genes differentially expressed on sample subgroups.

BMC Bioinformatics. 2012 Jun 26;13:147. doi: 10.1186/1471-2105-13-147.

A spline function approach for detecting differentially expressed genes in microarray data analysis.

Bioinformatics. 2004 Nov 22;20(17):2954-63. doi: 10.1093/bioinformatics/bth339. Epub 2004 Jun 4.

Cross platform microarray analysis for robust identification of differentially expressed genes.

BMC Bioinformatics. 2007 Mar 8;8 Suppl 1(Suppl 1):S5. doi: 10.1186/1471-2105-8-S1-S5.

Selection of differentially expressed genes in microarray data analysis.

Pharmacogenomics J. 2007 Jun;7(3):212-20. doi: 10.1038/sj.tpj.6500412. Epub 2006 Aug 29.

Tumor classification ranking from microarray data.

BMC Genomics. 2008 Sep 16;9 Suppl 2(Suppl 2):S21. doi: 10.1186/1471-2164-9-S2-S21.

Identification of differentially expressed gene categories in microarray studies using nonparametric multivariate analysis.

Bioinformatics. 2008 Jan 15;24(2):192-201. doi: 10.1093/bioinformatics/btm583. Epub 2007 Nov 27.

Ranking analysis for identifying differentially expressed genes.

Genomics. 2011 May;97(5):326-9. doi: 10.1016/j.ygeno.2011.03.002. Epub 2011 Mar 22.

Leveraging two-way probe-level block design for identifying differential gene expression with high-density oligonucleotide arrays.

BMC Bioinformatics. 2004 Apr 20;5:42. doi: 10.1186/1471-2105-5-42.

Evaluation of a statistical equivalence test applied to microarray data.

J Biopharm Stat. 2010 Mar;20(2):240-66. doi: 10.1080/10543400903572738.

Cited by

Area under the ROC Curve has the most consistent evaluation for binary classification.

PLoS One. 2024 Dec 23;19(12):e0316019. doi: 10.1371/journal.pone.0316019. eCollection 2024.

The Matthews correlation coefficient (MCC) should replace the ROC AUC as the standard metric for assessing binary classification.

BioData Min. 2023 Feb 17;16(1):4. doi: 10.1186/s13040-023-00322-4.

The clinical meaning of the area under a receiver operating characteristic curve for the evaluation of the performance of disease markers.

Epidemiol Health. 2022;44:e2022088. doi: 10.4178/epih.e2022088. Epub 2022 Oct 17.

Prognostic signatures associated with high infiltration of Tregs in bone metastatic prostate cancer.

Aging (Albany NY). 2021 Jul 6;13(13):17442-17461. doi: 10.18632/aging.203234.

The length of the receiver operating characteristic curve and the two cutoff Youden index within a robust framework for discovery, evaluation, and cutoff estimation in biomarker studies involving improper receiver operating characteristic curves.

Stat Med. 2021 Mar 30;40(7):1767-1789. doi: 10.1002/sim.8869. Epub 2021 Feb 2.

Comprehensive genomic and immunophenotypic analysis of CD4 T cell infiltrating human triple-negative breast cancer.

Cancer Immunol Immunother. 2021 Jun;70(6):1649-1665. doi: 10.1007/s00262-020-02807-1. Epub 2020 Dec 10.

An Ensemble Feature Selection Method for Biomarker Discovery.

Proc IEEE Int Symp Signal Proc Inf Tech. 2017 Dec;2017:416-421. doi: 10.1109/ISSPIT.2017.8388679. Epub 2018 Jun 21.

Arrow plot: a new graphical tool for selecting up and down regulated genes and genes differentially expressed on sample subgroups.

BMC Bioinformatics. 2012 Jun 26;13:147. doi: 10.1186/1471-2105-13-147.

References

HDAC3: taking the SMRT-N-CoRrect road to repression.

Oncogene. 2007 Aug 13;26(37):5439-49. doi: 10.1038/sj.onc.1210612.

Identification of differentially expressed genes and false discovery rate in microarray studies.

Curr Opin Lipidol. 2007 Apr;18(2):187-93. doi: 10.1097/MOL.0b013e3280895d6f.

Identifying genes that contribute most to good classification in microarrays.

BMC Bioinformatics. 2006 Sep 7;7:407. doi: 10.1186/1471-2105-7-407.

Comparison and evaluation of methods for generating differentially expressed gene lists from microarray data.

BMC Bioinformatics. 2006 Jul 26;7:359. doi: 10.1186/1471-2105-7-359.

Microarray analysis and tumor classification.

N Engl J Med. 2006 Jun 8;354(23):2463-72. doi: 10.1056/NEJMra042342.

Significance analysis of ROC indices for comparing diagnostic markers: applications to gene microarray data.

J Biopharm Stat. 2004 Nov;14(4):985-1003. doi: 10.1081/BIP-200035475.

Estimation of false discovery rates in multiple testing: application to gene microarray data.

Biometrics. 2003 Dec;59(4):1071-81. doi: 10.1111/j.0006-341x.2003.00123.x.

ROC curves are a suitable and flexible tool for the analysis of gene expression profiles.

Cytogenet Genome Res. 2003;101(1):90-1. doi: 10.1159/000074404.

Selecting differentially expressed genes from microarray experiments.

Biometrics. 2003 Mar;59(1):133-42. doi: 10.1111/1541-0420.00016.

The central role of receiver operating characteristic (ROC) curves in evaluating tests for the early detection of cancer.

J Natl Cancer Inst. 2003 Apr 2;95(7):511-5. doi: 10.1093/jnci/95.7.511.
