LGL: creating a map of protein function with an algorithm for visualizing very large biological networks.

Authors

Adai Alex T, Date Shailesh V, Wieland Shannon, Marcotte Edward M

Affiliation

Center for Systems and Synthetic Biology, and Institute for Cellular and Molecular Biology, 1 University Avenue, University of Texas, Austin, TX 78712-1095, USA.

Publication information

J Mol Biol. 2004 Jun 25;340(1):179-90. doi: 10.1016/j.jmb.2004.04.047.

Abstract

Networks are proving to be central to the study of gene function, protein-protein interaction, and biochemical pathway data. Visualization of networks is important for their study, but visualization tools are often inadequate for working with very large biological networks. Here, we present an algorithm, called large graph layout (LGL), which can be used to dynamically visualize large networks on the order of hundreds of thousands of vertices and millions of edges. LGL applies a force-directed iterative layout guided by a minimal spanning tree of the network in order to generate coordinates for the vertices in two or three dimensions, which are subsequently visualized and interactively navigated with companion programs. We demonstrate the use of LGL in visualizing an extensive protein map summarizing the results of approximately 21 billion sequence comparisons between 145579 proteins from 50 genomes. Proteins are positioned in the map according to sequence homology and gene fusions, with the map ultimately serving as a theoretical framework that integrates inferences about gene function derived from sequence homology, remote homology, gene fusions, and higher-order fusions. We confirm that protein neighbors in the resulting map are functionally related, and that distinct map regions correspond to distinct cellular systems, enabling a computational strategy for discovering proteins' functions on the basis of the proteins' map positions. Using the map produced by LGL, we infer general functions for 23 uncharacterized protein families.
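
The layout idea described above (a force-directed iterative layout guided by a minimal spanning tree) can be illustrated with a small sketch. This is not the authors' implementation. It is one plausible reading under stated assumptions: the spanning tree is built with Prim's algorithm, vertices enter the layout one tree level at a time next to their tree parent, and each round is followed by a plain spring-and-repulsion relaxation in two dimensions. The function names, the edge weights, and all parameters (`step`, `rest`, `max_move`) are hypothetical, and the all-pairs repulsion used here would not scale to the network sizes reported in the abstract.

```python
import heapq
import math
import random
from collections import defaultdict


def minimum_spanning_tree(n, edges):
    """Prim's algorithm from vertex 0; returns parent[v] (-1 for the root).
    Assumes a connected, undirected graph given as (u, v, weight) triples."""
    adj = defaultdict(list)
    for u, v, w in edges:
        adj[u].append((w, v))
        adj[v].append((w, u))
    parent = [-1] * n
    visited = [False] * n
    heap = [(0.0, 0, -1)]
    while heap:
        _, u, p = heapq.heappop(heap)
        if visited[u]:
            continue
        visited[u] = True
        parent[u] = p
        for w, v in adj[u]:
            if not visited[v]:
                heapq.heappush(heap, (w, v, u))
    return parent


def tree_levels(parent, root=0):
    """Group vertices by depth in the spanning tree (breadth-first)."""
    children = defaultdict(list)
    for v, p in enumerate(parent):
        if p != -1:
            children[p].append(v)
    levels = [[root]]
    while True:
        nxt = [c for v in levels[-1] for c in children[v]]
        if not nxt:
            return levels
        levels.append(nxt)


def relax(pos, placed, edges, iterations=50, step=0.1, rest=1.0, max_move=0.5):
    """Force-directed relaxation over the vertices placed so far."""
    nodes = sorted(placed)
    springs = [(u, v) for u, v, _ in edges if u in placed and v in placed]
    for _ in range(iterations):
        force = {v: [0.0, 0.0] for v in nodes}
        # Repulsion between every pair of placed vertices (quadratic cost).
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx, dy = pos[a][0] - pos[b][0], pos[a][1] - pos[b][1]
                d = math.hypot(dx, dy) + 1e-9
                f = rest * rest / (d * d)
                force[a][0] += f * dx / d
                force[a][1] += f * dy / d
                force[b][0] -= f * dx / d
                force[b][1] -= f * dy / d
        # Spring attraction along edges whose endpoints are both placed.
        for u, v in springs:
            dx, dy = pos[v][0] - pos[u][0], pos[v][1] - pos[u][1]
            d = math.hypot(dx, dy) + 1e-9
            f = d - rest
            force[u][0] += f * dx / d
            force[u][1] += f * dy / d
            force[v][0] -= f * dx / d
            force[v][1] -= f * dy / d
        # Move each vertex along its net force, capping the displacement.
        for v in nodes:
            fx, fy = force[v]
            mag = math.hypot(fx, fy)
            scale = min(step, max_move / mag) if mag > 0 else 0.0
            pos[v][0] += scale * fx
            pos[v][1] += scale * fy


def mst_guided_layout(n, edges, seed=0):
    """Place vertices level by level along the spanning tree, relaxing each time."""
    rng = random.Random(seed)
    parent = minimum_spanning_tree(n, edges)
    pos, placed = {}, set()
    for level in tree_levels(parent):
        for v in level:
            if parent[v] == -1:
                pos[v] = [0.0, 0.0]
            else:
                # New vertices start at unit distance from their tree parent.
                px, py = pos[parent[v]]
                angle = rng.uniform(0.0, 2.0 * math.pi)
                pos[v] = [px + math.cos(angle), py + math.sin(angle)]
            placed.add(v)
        relax(pos, placed, edges)
    return pos


# Toy graph: four vertices, hypothetical weights.
toy_edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 5.0), (2, 3, 1.0)]
for vertex, (x, y) in sorted(mst_guided_layout(4, toy_edges).items()):
    print(vertex, round(x, 3), round(y, 3))
```

Introducing vertices gradually along the tree means each relaxation starts from a configuration that is already roughly untangled, which is the intuition behind letting a spanning tree guide the layout. A version meant for large graphs would need a cheaper repulsion step than the all-pairs loop shown here.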
