Bioinformatics Group, CSIRO Livestock Industries, Queensland Bioscience Precinct, 306 Carmody Road, St. Lucia, Brisbane, Queensland 4067, Australia.
Bioinformatics. 2010 Apr 1;26(7):896-904. doi: 10.1093/bioinformatics/btq051. Epub 2010 Feb 9.
Although transcription factors (TF) play a central regulatory role, their detection from expression data is hampered by their low, and often sparse, expression. To fill this gap, we propose a regulatory impact factor (RIF) metric to identify critical TF from gene expression data.
To substantiate the generality of RIF, we explore a set of experiments spanning a wide range of scenarios, including breast cancer survival, fat, gonads and sex differentiation. We show that the strength of RIF lies in its ability to simultaneously integrate three sources of information into a single measure: (i) the change in correlation existing between the TF and the differentially expressed (DE) genes; (ii) the amount of differential expression of DE genes; and (iii) the abundance of DE genes. As a result, RIF analysis assigns an extreme score to those TF that are consistently most differentially co-expressed with the highly abundant and highly DE genes (RIF1), and to those TF with the most altered ability to predict the abundance of DE genes (RIF2). We show that RIF analysis alone recovers well-known, experimentally validated TF for the processes studied. The TF identified confirm the importance of PPAR signaling in adipose development and the importance of transduction of estrogen signals in breast cancer survival and sexual differentiation. We argue that RIF has universal applicability, and advocate its use as a promising hypothesis-generating tool for the systematic identification of novel TF not yet documented as critical.
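The abstract describes RIF1 and RIF2 only in words, so the following is a minimal sketch of how the three ingredients (change in TF-to-DE-gene correlation, amount of differential expression, and abundance of DE genes) might be combined, not the authors' reference implementation. The specific formulas, the function name `rif_scores`, the array layout, and the toy data are all assumptions made for illustration; any normalization of the final scores is left out.

```python
import numpy as np


def rif_scores(tf1, tf2, de1, de2):
    """Illustrative RIF1/RIF2-style scores (assumed formulas, see lead-in).

    tf1, tf2 : arrays of shape (n_tf, n_samples) with TF expression in
               condition 1 and condition 2 (sample counts may differ).
    de1, de2 : arrays of shape (n_de, n_samples) with DE gene expression
               in condition 1 and condition 2.
    Returns two arrays of length n_tf.
    """
    tf1, tf2 = np.asarray(tf1, float), np.asarray(tf2, float)
    de1, de2 = np.asarray(de1, float), np.asarray(de2, float)
    n_tf = tf1.shape[0]

    # Pearson correlation of every TF with every DE gene, per condition.
    # np.corrcoef treats rows as variables; keep only the TF x DE block.
    r1 = np.corrcoef(np.vstack([tf1, de1]))[:n_tf, n_tf:]
    r2 = np.corrcoef(np.vstack([tf2, de2]))[:n_tf, n_tf:]

    e1 = de1.mean(axis=1)        # mean expression of DE genes, condition 1
    e2 = de2.mean(axis=1)        # mean expression of DE genes, condition 2
    abundance = 0.5 * (e1 + e2)  # (iii) overall abundance of each DE gene
    diff_expr = e1 - e2          # (ii) amount of differential expression
    diff_corr = r1 - r2          # (i) change in TF-to-DE-gene correlation

    # RIF1-style score: differential co-expression weighted by abundance
    # and differential expression, averaged over DE genes.
    rif1 = (abundance * diff_expr * diff_corr ** 2).mean(axis=1)

    # RIF2-style score: change in how well the TF "predicts" DE gene
    # abundance, averaged over DE genes.
    rif2 = ((e1 * r1) ** 2 - (e2 * r2) ** 2).mean(axis=1)
    return rif1, rif2


if __name__ == "__main__":
    # Toy data (hypothetical): one DE gene, two TFs, four samples each.
    de_cond1 = np.array([[1.0, 2.0, 3.0, 4.0]])
    de_cond2 = np.array([[5.0, 6.0, 7.0, 8.0]])
    tf_cond1 = np.array([[1.0, 2.0, 3.0, 4.0],   # TF_A
                         [1.0, 2.0, 3.0, 4.0]])  # TF_B
    tf_cond2 = np.array([[4.0, 3.0, 2.0, 1.0],   # TF_A: correlation flips
                         [1.0, 2.0, 3.0, 4.0]])  # TF_B: correlation unchanged
    rif1, rif2 = rif_scores(tf_cond1, tf_cond2, de_cond1, de_cond2)
    print("RIF1:", rif1)
    print("RIF2:", rif2)
```

In the toy data, TF_A reverses its correlation with the DE gene between conditions while TF_B does not, so TF_A should receive the RIF1 score of larger magnitude. Rows with constant expression would yield undefined correlations and need to be filtered out beforehand.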