Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis.

Affiliation

School of Plant Sciences, University of Arizona, Tucson, Arizona 85721, USA.

Publication information

Plant Physiol. 2012 Sep;160(1):192-203. doi: 10.1104/pp.112.201962. Epub 2012 Jul 13.

Abstract

One of the computational challenges in plant systems biology is to accurately infer transcriptional regulatory relationships from correlation analyses of gene expression patterns. Although several correlation methods are applied in biology to analyze microarray data, concerns have been raised about the compatibility of these methods with gene expression data profiled by high-throughput RNA transcriptome sequencing (RNA-Seq). These concerns arise mainly because the distribution of read counts in RNA-Seq experiments differs from that of fluorescence intensities in microarray experiments. Therefore, a comprehensive evaluation of the existing correlation methods and, if necessary, the introduction of novel methods into biology is appropriate. In this study, we compared four existing correlation methods used in microarray analysis and one novel method, the Gini correlation coefficient, on previously published microarray-based and sequencing-based gene expression data in Arabidopsis (Arabidopsis thaliana) and maize (Zea mays). The comparisons were performed on more than 11,000 regulatory relationships in Arabidopsis, including 8,929 pairs of transcription factors and target genes. Our analyses pinpointed the strengths and weaknesses of each method and indicated that the Gini correlation can compensate for the shortcomings of the Pearson, Spearman, Kendall, and Tukey's biweight correlations. The Gini correlation method, together with the other four methods evaluated in this study, was implemented as an R package named rsgcc, which biologists can use as an alternative option for clustering analyses of gene expression patterns or for transcriptional network analyses.
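The abstract does not spell out how the Gini correlation is computed. As a rough illustration, the sketch below uses the textbook definition, GC(x, y) = cov(x, rank(y)) / cov(x, rank(x)), which mixes the values of one variable with the ranks of the other. It is written in plain Python and is not the rsgcc implementation; the function names (`average_ranks`, `covariance`, `gini_correlation`, `gini_correlation_pair`) and the sample expression values are hypothetical, and the average-rank handling of ties is an assumption.

```python
from typing import List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def covariance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (n - 1)


def gini_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Gini correlation of x with respect to y: cov(x, rank(y)) / cov(x, rank(x))."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    denominator = covariance(x, average_ranks(x))
    if denominator == 0:
        raise ValueError("x is constant, so the Gini correlation is undefined")
    return covariance(x, average_ranks(y)) / denominator


def gini_correlation_pair(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return both directions, since the measure is not symmetric in general."""
    return gini_correlation(x, y), gini_correlation(y, x)


if __name__ == "__main__":
    # Hypothetical expression values for a transcription factor and a target gene.
    tf_expression = [1.0, 2.0, 3.0, 4.0, 100.0]
    target_expression = [2.0, 1.0, 3.0, 4.0, 5.0]
    print(gini_correlation_pair(tf_expression, target_expression))
```

Because the numerator uses the values of the first variable and only the ranks of the second, swapping the arguments generally gives a different number. How the two directions are combined into a single score is a design choice that the abstract does not specify.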


