Multiple-cumulative probabilities used to cluster and visualize transcriptomes.

Authors

Jia Xingang, Liu Yisu, Han Qiuhong, Lu Zuhong

Affiliations

School of Mathematics, Southeast University, Nanjing, China.

State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China.

Publication information

FEBS Open Bio. 2017 Nov 13;7(12):2008-2020. doi: 10.1002/2211-5463.12327. eCollection 2017 Dec.

Abstract

Clustering and visualizing gene expression data plays a central role in obtaining biological knowledge. Here, we used Pearson's correlation coefficient of multiple-cumulative probabilities (PCC-MCP) of genes to define the similarity of gene expression behaviors. To address the challenge of high-dimensional MCPs, we used icc-cluster, a clustering algorithm that obtains solutions by iterating clustering centers, with PCC-MCP to group genes. We then used t-statistic stochastic neighbor embedding (t-SNE) of KC-data to generate optimal maps for clusters of MCP (t-SNE-MCP-O maps). Analysis of several transcriptome data sets demonstrated clear advantages of icc-cluster with PCC-MCP over commonly used clustering methods. t-SNE-MCP-O was also shown to give clear projected boundaries for clusters of PCC-MCP, which made the relationships between clusters easy to visualize and understand.
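The general idea of grouping genes by correlation to iteratively updated cluster centers can be illustrated with a minimal sketch. The abstract does not specify how MCPs are computed or how icc-cluster chooses and updates its centers, so the per-gene `profiles` below are hypothetical stand-in vectors, and the function names `pearson` and `cluster_by_correlation` are illustrative assumptions. This is a generic center-iterating scheme with Pearson's correlation as the similarity, not the authors' implementation.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # constant vector: correlation is undefined, treat as 0
    return cov / sqrt(vx * vy)


def cluster_by_correlation(profiles, init_centers, max_iter=100):
    """Assign each profile to its most correlated center, then recompute
    each center as the mean of its members, until assignments stop changing.
    (Assumed generic scheme; the paper's icc-cluster may differ.)"""
    centers = [list(c) for c in init_centers]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            max(range(len(centers)), key=lambda k: pearson(p, centers[k]))
            for p in profiles
        ]
        if new_labels == labels:
            break  # converged: no profile changed cluster
        labels = new_labels
        for k in range(len(centers)):
            members = [p for p, lab in zip(profiles, labels) if lab == k]
            if members:  # keep the old center if a cluster is empty
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical stand-in vectors (not real MCPs): two rising, two falling.
profiles = [
    [0.1, 0.3, 0.6, 1.0],
    [0.2, 0.4, 0.7, 0.9],
    [1.0, 0.6, 0.3, 0.1],
    [0.9, 0.7, 0.4, 0.2],
]
labels, centers = cluster_by_correlation(
    profiles, init_centers=[profiles[0], profiles[2]]
)
print(labels)  # [0, 0, 1, 1]
```

Because Pearson's correlation ignores offset and scale, the two rising vectors fall into one cluster and the two falling vectors into the other, even though their absolute values differ.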

