

Gaining confidence in biological interpretation of the microarray data: the functional consistence of the significant GO categories.

Authors

Yang Da, Li Yanhui, Xiao Hui, Liu Qing, Zhang Min, Zhu Jing, Ma Wencai, Yao Chen, Wang Jing, Wang Dong, Guo Zheng, Yang Baofeng

Affiliation

Department of Bioinformatics, Bio-pharmaceutical Key Laboratory of Heilongjiang Province-Incubator of State Key Laboratory, Harbin Medical University, Harbin 150086, China.

Publication

Bioinformatics. 2008 Jan 15;24(2):265-71. doi: 10.1093/bioinformatics/btm558. Epub 2007 Nov 15.

Abstract

MOTIVATION

In microarray studies, numerous tools are available for functional enrichment analysis based on GO categories. Most of these tools require a prior threshold for designating genes as differentially expressed genes (DEGs) and are therefore classed as threshold-dependent methods, which are often criticized because their results change with the chosen threshold.

RESULTS

In the present article, by considering the inherent correlation structure of the GO categories, a continuous measure based on semantic similarity of GO categories is proposed to investigate the functional consistence (or stability) of threshold-dependent methods. The results from several datasets show that, when overlapping categories between two groups are simply counted, the groups of significant categories selected under different DEG thresholds appear very different. However, based on the semantic similarity measure proposed in this article, the results are rather functionally consistent across a wide range of DEG thresholds. Moreover, we find that the functional consistence of gene lists ranked by the SAM metric is relatively robust to changes in the DEG threshold.
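
The contrast between counting overlapping categories and scoring semantic similarity can be illustrated with a small sketch. The abstract does not give the authors' formula, so everything below is an assumption made for illustration: the toy ontology and its term names are hypothetical (not real GO identifiers), term similarity is taken as the Jaccard index of ancestor sets, and group similarity is a best-match average. It is not the measure proposed in the paper.

```python
# Toy ontology (hypothetical terms, NOT real GO IDs): child -> set of parents.
PARENTS = {
    "root": set(),
    "metabolism": {"root"},
    "lipid_metabolism": {"metabolism"},
    "fatty_acid_metabolism": {"lipid_metabolism"},
    "steroid_metabolism": {"lipid_metabolism"},
    "signaling": {"root"},
}


def ancestors(term):
    """Return the term together with all of its ancestors in the toy DAG."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def term_similarity(a, b):
    """Assumed term similarity: Jaccard index of the two ancestor sets."""
    anc_a, anc_b = ancestors(a), ancestors(b)
    return len(anc_a & anc_b) / len(anc_a | anc_b)


def overlap_score(group_a, group_b):
    """Plain overlap counting: shared categories over all categories."""
    a, b = set(group_a), set(group_b)
    return len(a & b) / len(a | b)


def best_match_average(group_a, group_b):
    """Assumed group similarity: average of each term's best match in the other group."""
    best_a = [max(term_similarity(a, b) for b in group_b) for a in group_a]
    best_b = [max(term_similarity(a, b) for a in group_a) for b in group_b]
    return (sum(best_a) + sum(best_b)) / (len(best_a) + len(best_b))


# Hypothetical significant-category groups obtained under two DEG thresholds.
strict = ["fatty_acid_metabolism", "signaling"]
lenient = ["steroid_metabolism", "signaling"]

overlap = overlap_score(strict, lenient)
semantic = best_match_average(strict, lenient)
print(f"overlap = {overlap:.2f}, semantic similarity = {semantic:.2f}")
```

In this toy case the two groups share only one of three distinct categories, yet the non-shared categories are siblings under the same parent, so the similarity-based score (about 0.8) is much higher than the overlap score (about 0.33). This mirrors the qualitative point of the abstract under the stated assumptions only.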

AVAILABILITY

Source code in R is available on request from the authors.

