A biologically inspired measure for coexpression analysis.

Affiliation

Machine Intelligence Unit, Indian Statistical Institute, 203 B.T. Road, Kolkata 700108, India.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2011 Jul-Aug;8(4):929-42. doi: 10.1109/TCBB.2010.106.

Abstract

Two genes are said to be coexpressed if their expression levels have a similar spatial or temporal pattern. Ever since gene expression profiling with microarrays began, computational modeling of coexpression has been a major focus. As a result, several similarity/distance measures have evolved over time to quantify coexpression similarity/dissimilarity between gene pairs. Of these, the correlation coefficient has been established as a suitable quantifier of pairwise coexpression. In general, the correlation coefficient is good at capturing linear dependence, but not nonlinear dependence. In spite of this drawback, it outperforms many other existing measures in modeling the dependency in biological data. In this paper, for the first time, we point out a significant weakness of the existing similarity/distance measures, including the standard correlation coefficient, in modeling pairwise coexpression of genes. A novel measure, called BioSim, is introduced; it takes values between -1 and +1, corresponding to negative and positive dependency, and 0 for independence. The computation of BioSim is based on the aggregation of the stepwise relative angular deviation of the expression vectors considered. The proposed measure is analytically suitable for modeling coexpression, as it accounts for expression similarity, expression deviation, and relative dependence. It is demonstrated that the proposed measure captures the degree of coexpression between a pair of genes better than several other existing ones. The efficacy of the measure is statistically analyzed by integrating it with several module-finding algorithms based on coexpression values and then applying it to synthetic and biological data. The annotation results of the coexpressed genes, as obtained from gene ontology, establish the significance of the introduced measure. By further extending the BioSim measure, it has been shown that one can effectively identify the variability in expression patterns over multiple phenotypes. We have also extended BioSim to identify pairwise differential expression patterns and coexpression dynamics. The significance of these studies is shown through analysis of several real-life data sets. Because the measure is computed over stepwise time points, it is also effective at identifying partially coexpressed genes. On the whole, we put forward a complete framework for coexpression analysis based on the BioSim measure.
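The abstract's remark that the correlation coefficient captures linear but not nonlinear dependence can be illustrated with a small sketch. The code below computes the ordinary Pearson correlation coefficient, not BioSim (whose formula is not given in the abstract), and the expression profiles are hypothetical values chosen only for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    sxy = sum(a * b for a, b in zip(dx, dy))
    return sxy / math.sqrt(sxx * syy)


# Hypothetical expression profiles over seven time points (made-up values).
time_points = [-3, -2, -1, 0, 1, 2, 3]
gene_a = [float(t) for t in time_points]
gene_linear = [2.0 * t + 1.0 for t in time_points]      # rises with gene_a
gene_inverse = [-0.5 * t + 4.0 for t in time_points]    # falls as gene_a rises
gene_quadratic = [float(t * t) for t in time_points]    # fully determined by gene_a, but nonlinearly

r_linear = pearson(gene_a, gene_linear)
r_inverse = pearson(gene_a, gene_inverse)
r_quadratic = pearson(gene_a, gene_quadratic)

print(f"linear:    {r_linear:+.3f}")
print(f"inverse:   {r_inverse:+.3f}")
print(f"quadratic: {r_quadratic:+.3f}")
```

In this toy setting the linear and inverse profiles score +1 and -1, while the quadratic profile scores 0 even though it is completely determined by `gene_a`. That is the kind of nonlinear dependence a correlation coefficient can miss.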


Similar articles

A biologically inspired measure for coexpression analysis.

IEEE/ACM Trans Comput Biol Bioinform. 2011 Jul-Aug;8(4):929-42. doi: 10.1109/TCBB.2010.106.

CODC: a Copula-based model to identify differential coexpression.

NPJ Syst Biol Appl. 2020 Jun 19;6(1):20. doi: 10.1038/s41540-020-0137-9.

Subspace differential coexpression analysis: problem definition and a general approach.

Pac Symp Biocomput. 2010:145-56.

CCor: A whole genome network-based similarity measure between two genes.

Biometrics. 2016 Dec;72(4):1216-1225. doi: 10.1111/biom.12508. Epub 2016 Mar 8.

Subdimension-based similarity measure for DNA microarray data clustering.

Phys Rev E Stat Nonlin Soft Matter Phys. 2006 Oct;74(4 Pt 1):041906. doi: 10.1103/PhysRevE.74.041906. Epub 2006 Oct 9.

An information theoretic exploratory method for learning patterns of conditional gene coexpression from microarray data.

IEEE/ACM Trans Comput Biol Bioinform. 2008 Jan-Mar;5(1):15-24. doi: 10.1109/TCBB.2007.1056.

A general framework for analyzing data from two short time-series microarray experiments.

IEEE/ACM Trans Comput Biol Bioinform. 2011 Jan-Mar;8(1):14-26. doi: 10.1109/TCBB.2009.51.

Correlation between gene expression and GO semantic similarity.

IEEE/ACM Trans Comput Biol Bioinform. 2005 Oct-Dec;2(4):330-8. doi: 10.1109/TCBB.2005.50.

Fuzzy measures on the Gene Ontology for gene product similarity.

IEEE/ACM Trans Comput Biol Bioinform. 2006 Jul-Sep;3(3):263-74. doi: 10.1109/TCBB.2006.37.

A methodology for the analysis of differential coexpression across the human lifespan.

BMC Bioinformatics. 2009 Sep 22;10:306. doi: 10.1186/1471-2105-10-306.

Cited by

Optimal ranking and directional signature classification using the integral strategy of multi-objective optimization-based association rule mining of multi-omics data.

Front Bioinform. 2023 Jul 27;3:1182176. doi: 10.3389/fbinf.2023.1182176. eCollection 2023.

Comprehensive Analysis of MicroRNA-Messenger RNA from White Yak Testis Reveals the Differentially Expressed Molecules Involved in Development and Reproduction.

Int J Mol Sci. 2018 Oct 9;19(10):3083. doi: 10.3390/ijms19103083.

Natural Cubic Spline Regression Modeling Followed by Dynamic Network Reconstruction for the Identification of Radiation-Sensitivity Gene Association Networks from Time-Course Transcriptome Data.

PLoS One. 2016 Aug 9;11(8):e0160791. doi: 10.1371/journal.pone.0160791. eCollection 2016.

Exploiting identifiability and intergene correlation for improved detection of differential expression.

ISRN Bioinform. 2013 Jun 3;2013:404717. doi: 10.1155/2013/404717. eCollection 2013.

Nonlinear dependence in the discovery of differentially expressed genes.

ISRN Bioinform. 2012 Apr 12;2012:564715. doi: 10.5402/2012/564715. eCollection 2012.

Construction and comparison of gene co-expression networks shows complex plant immune responses.

PeerJ. 2014 Oct 9;2:e610. doi: 10.7717/peerj.610. eCollection 2014.

Functional clustering of time series gene expression data by Granger causality.

BMC Syst Biol. 2012 Oct 30;6:137. doi: 10.1186/1752-0509-6-137.
