Human Genome Center, Institute of Medical Science, The University of Tokyo, Tokyo 108-8639, Japan.
Bioinformatics. 2011 Sep 1;27(17):2399-405. doi: 10.1093/bioinformatics/btr382. Epub 2011 Jun 23.
Recent advances in high-throughput technologies have led to an accumulation of data on various types of genomic annotation. These data will be crucial for elucidating the combinatorial logic of transcription. Although several approaches have been proposed for inferring cooperativity among multiple factors, most are hampered by issues of normalization and threshold selection.
In this article, we propose a rank-based non-parametric statistical test for measuring synergistic effects between two gene sets. Because it operates on ranks, the method is free from the issues of normalization and threshold determination for gene expression values. Furthermore, we propose an efficient Markov chain Monte Carlo method for calculating an approximate significance value of synergy. We applied this approach to detect synergistic combinations of transcription factor binding motifs and histone modifications.
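To make the idea of a rank-based, threshold-free comparison concrete, the following is a minimal sketch and not the authors' method. The statistic (`synergy_score`), the function names, and the use of plain random permutations in place of the Markov chain Monte Carlo scheme are all assumptions made for illustration. It shows only that a statistic computed on ranks is unchanged by any monotone rescaling of the expression values.

```python
import random


def ranks(values):
    """Return 1-based ranks (1 = smallest); ties are broken by input order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def synergy_score(rank, in_a, in_b):
    """Hypothetical statistic: mean rank of genes in both sets minus the
    larger of the mean ranks of genes that belong to exactly one set."""
    both = [r for r, a, b in zip(rank, in_a, in_b) if a and b]
    only_a = [r for r, a, b in zip(rank, in_a, in_b) if a and not b]
    only_b = [r for r, a, b in zip(rank, in_a, in_b) if b and not a]
    if not both or not only_a or not only_b:
        raise ValueError("each of A-only, B-only and A-and-B must be non-empty")

    def mean(xs):
        return sum(xs) / len(xs)

    return mean(both) - max(mean(only_a), mean(only_b))


def permutation_p_value(expression, in_a, in_b, n_perm=2000, seed=0):
    """Approximate a one-sided p-value by shuffling ranks across genes.

    This is ordinary Monte Carlo permutation, swapped in for illustration;
    it is not the MCMC procedure described in the article.
    """
    rng = random.Random(seed)
    rank = ranks(expression)
    observed = synergy_score(rank, in_a, in_b)
    shuffled = rank[:]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if synergy_score(shuffled, in_a, in_b) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (hits + 1) / (n_perm + 1)
```

Because only ranks enter the statistic, calling `permutation_p_value` on raw values and on any strictly increasing transform of them (with the same seed) returns the same score and the same p-value, which is the sense in which no normalization or expression threshold is needed.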
A C implementation of the method is available from http://www.hgc.jp/~yshira/software/rankSynergy.zip.
Supplementary data are available at Bioinformatics online.