CODC: a Copula-based model to identify differential coexpression.

Affiliations

Centrum Wiskunde & Informatica, Life Sciences & Health, 1098 XG, Amsterdam, The Netherlands.

Machine Intelligence Unit, Indian Statistical Institute, Kolkata, India.

Publication information

NPJ Syst Biol Appl. 2020 Jun 19;6(1):20. doi: 10.1038/s41540-020-0137-9.

Abstract

Differential coexpression has recently emerged as a way to establish a fundamental difference in the expression pattern of a group of genes between two populations. Earlier methods used scoring techniques to detect changes in the correlation pattern of a gene pair across two conditions. However, differential coexpression has not previously been modeled by finding differences in the dependence structure of the gene pair. We exploit a copula-based framework to model differential coexpression between gene pairs in two different conditions. The copula is used to model the dependency between the expression profiles of a gene pair. For a gene pair, the distance between the two joint distributions produced by the copula, one per condition, serves as the measure of differential coexpression. We evaluated the model on five pan-cancer TCGA RNA-Seq datasets, where it outperforms the existing state of the art. Moreover, the proposed model can detect a mild change in the coexpression pattern across two conditions. On noisy expression data, the proposed method performs well because of the well-known scale-invariance property of copulas. In addition, we identified differentially coexpressed modules by applying hierarchical clustering to the distance matrix. The identified modules were analyzed through Gene Ontology term and KEGG pathway enrichment analysis.
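To make the idea of "distance between two copula-based joint distributions" concrete, the following is a minimal sketch and not the authors' implementation. It assumes an empirical copula built from rank-transformed expression values and a sup-norm (Kolmogorov-Smirnov-style) distance evaluated on a grid; the abstract does not specify either choice. The function names `pseudo_obs`, `empirical_copula`, and `copula_distance`, the grid size, and the simulated data are all hypothetical.

```python
import numpy as np


def pseudo_obs(x):
    """Rank-transform a sample into (0, 1); ties are broken by order of appearance."""
    x = np.asarray(x, dtype=float)
    ranks = np.argsort(np.argsort(x, kind="stable"), kind="stable") + 1
    return ranks / (len(x) + 1)


def empirical_copula(u, v, grid):
    """Empirical copula C(s, t) = mean(1[u <= s and v <= t]) for all s, t in grid."""
    below_u = (u[:, None] <= grid[None, :]).astype(float)  # shape (n, g)
    below_v = (v[:, None] <= grid[None, :]).astype(float)  # shape (n, g)
    return below_u.T @ below_v / len(u)                    # shape (g, g)


def copula_distance(x1, y1, x2, y2, grid_size=20):
    """Largest absolute gap between the empirical copulas of two conditions.

    (x1, y1) are the expression profiles of a gene pair in condition 1 and
    (x2, y2) the same pair in condition 2. The sup-norm on a grid is an
    assumption made for this sketch.
    """
    grid = np.linspace(0.05, 0.95, grid_size)
    c1 = empirical_copula(pseudo_obs(x1), pseudo_obs(y1), grid)
    c2 = empirical_copula(pseudo_obs(x2), pseudo_obs(y2), grid)
    return float(np.max(np.abs(c1 - c2)))


# Simulated data (hypothetical): the pair is positively coupled in
# condition 1 and negatively coupled in condition 2.
rng = np.random.default_rng(0)
n = 200
a1 = rng.normal(size=n)
b1 = a1 + 0.3 * rng.normal(size=n)
a2 = rng.normal(size=n)
b2 = -a2 + 0.3 * rng.normal(size=n)

# A control pair whose dependence is the same in both conditions.
a3 = rng.normal(size=n)
b3 = a3 + 0.3 * rng.normal(size=n)

d_changed = copula_distance(a1, b1, a2, b2)
d_same = copula_distance(a1, b1, a3, b3)

# Strictly increasing rescalings leave the ranks, and hence the distance, unchanged.
d_rescaled = copula_distance(np.exp(a1), 5.0 * b1 + 2.0, a2, b2)

print(d_changed, d_same, d_rescaled)
```

Because only ranks enter the computation, any strictly increasing transformation of an expression profile leaves the distance unchanged, which illustrates the scale-invariance property the abstract relies on. Collecting such distances over all gene pairs would give a distance matrix of the kind that the abstract describes clustering hierarchically.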

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/58db/7305108/882f1f6cf895/41540_2020_137_Fig1_HTML.jpg
