New Markov-Shannon Entropy models to assess connectivity quality in complex networks: from molecular to cellular pathway, Parasite-Host, Neural, Industry, and Legal-Social networks.

Author affiliation

Department of Microbiology & Parasitology, Faculty of Pharmacy, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain.

Publication information

J Theor Biol. 2012 Jan 21;293:174-88. doi: 10.1016/j.jtbi.2011.10.016. Epub 2011 Oct 25.

Abstract

Graph and complex network theory is being applied at a growing range of levels of organization of matter, including molecular, biological, technological, and social networks. A network is a set of items, usually called nodes, with connections between them, which are called links or edges. There are many different experimental and/or theoretical methods to assign node-node links, depending on the type of network we want to construct. Unfortunately, experimentally re-evaluating an entire network is very expensive in time and resources, so the development of cheaper theoretical methods is of major importance. In addition, different methods for linking nodes in the same type of network are not fully accurate, so their results do not always coincide. In this sense, the development of computational methods to evaluate connectivity quality in complex networks (a posteriori, after network assembly) is a goal of major interest. In this work, we report for the first time a new method to calculate numerical quality scores S(L(ij)) for network links L(ij) (connectivity) based on the k-th order Markov-Shannon entropy indices (θ(k)) of the network nodes. The algorithm may be summarized as follows: (i) first, the θ(k)(j) values are calculated for all j-th nodes in a complex network that has already been constructed; (ii) a Linear Discriminant Analysis (LDA) is used to seek a linear equation that discriminates experimentally confirmed linked pairs of nodes (L(ij)=1) from non-linked ones (L(ij)=0); (iii) the new model is validated with an external series of pairs of nodes; (iv) the equation obtained is used to re-evaluate the connectivity quality of the network, connecting or disconnecting nodes based on the quality scores calculated with the new connectivity function. This method was used to study different types of large networks.
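Step (i) can be illustrated with a minimal sketch. The abstract does not give the definition of θ(k), so the construction here is an assumption: the adjacency matrix is row-normalized into a Markov matrix, a uniform starting distribution is propagated k steps, and each node's index is taken as its Shannon term -p log p. The function names `transition_matrix` and `node_entropies`, the uniform start, and the self-loop given to isolated nodes are all hypothetical choices for illustration, not the authors' implementation.

```python
import math


def transition_matrix(adj):
    """Row-normalize a 0/1 adjacency matrix into a Markov (stochastic) matrix.

    Assumption: an isolated node keeps its probability (self-loop).
    """
    pi = []
    for i, row in enumerate(adj):
        degree = sum(row)
        if degree == 0:
            pi.append([1.0 if j == i else 0.0 for j in range(len(row))])
        else:
            pi.append([a / degree for a in row])
    return pi


def node_entropies(adj, k_max):
    """Return thetas[k][j] = -p_k(j) * ln p_k(j) for k = 0..k_max.

    p_0 is uniform over nodes (assumption); p_{k+1} = p_k * Pi.
    """
    n = len(adj)
    pi = transition_matrix(adj)
    p = [1.0 / n] * n
    thetas = []
    for _ in range(k_max + 1):
        # Shannon term of each node's current absolute probability.
        thetas.append([-x * math.log(x) if x > 0 else 0.0 for x in p])
        # Advance the Markov chain by one step.
        p = [sum(p[i] * pi[i][j] for i in range(n)) for j in range(n)]
    return thetas


# Toy 3-node path graph 0 - 1 - 2 (illustrative data only).
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
for k, row in enumerate(node_entropies(path, 2)):
    print(k, [round(v, 4) for v in row])
```

Under these assumptions, the per-node values for several orders k would serve as the input variables of the discriminant analysis in step (ii).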
The linear models obtained gave the following overall accuracies for network reconstruction: metabolic networks (72.3%), parasite-host networks (93.3%), the CoCoMac brain cortex co-activation network (89.6%), the NW Spain fasciolosis spreading network (97.2%), the Spanish financial law network (89.9%), and the world trade network for Intelligent & Active Food Packaging (92.8%). To fit these models, we studied an average of 55,388 pairs of nodes per model and a total of 332,326 pairs of nodes across all models. Finally, the method was applied to a more complicated problem: a model was developed to score connectivity quality in the drug-target network of US FDA-approved drugs. In this last model, the θ(k) values were calculated for three types of molecular networks representing different levels of organization: drug molecular graphs (atom-atom bonds), protein residue networks (amino acid interactions), and the drug-target network (compound-protein binding). The overall accuracy of this model was 76.3%. This work opens the way to computational re-evaluation of network connectivity quality (collation) for complex systems in molecular, biomedical, technological, and legal-social sciences, as well as in world trade and industry.
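The re-evaluation step and the overall accuracy figures can be read as follows; this is only a sketch under stated assumptions. It assumes the linear equation takes the entropy indices of both nodes of a pair as inputs and that a pair is predicted as linked when its score exceeds a threshold of 0. The coefficients, intercept, threshold, toy pairs, and the names `link_score` and `reevaluate` are hypothetical; in the paper the equation comes from the LDA fit, which is not reproduced here.

```python
def link_score(theta_i, theta_j, coeffs, intercept):
    """Linear quality score S(L_ij) from the entropy indices of both nodes.

    Assumption: the features are the two nodes' theta values concatenated.
    """
    features = list(theta_i) + list(theta_j)
    return intercept + sum(c * x for c, x in zip(coeffs, features))


def reevaluate(pairs, coeffs, intercept, threshold=0.0):
    """Compare predicted links with recorded ones.

    pairs: iterable of (i, j, theta_i, theta_j, observed), observed in {0, 1}.
    Returns (overall_accuracy, flagged), where flagged lists the pairs whose
    recorded link state disagrees with the score-based prediction.
    """
    correct = 0
    total = 0
    flagged = []
    for i, j, theta_i, theta_j, observed in pairs:
        score = link_score(theta_i, theta_j, coeffs, intercept)
        predicted = 1 if score > threshold else 0
        total += 1
        if predicted == observed:
            correct += 1
        else:
            flagged.append((i, j, score))
    return correct / total, flagged


# Hypothetical coefficients and toy pairs, for illustration only.
COEFFS = [1.0, 1.0]
INTERCEPT = -0.45
toy_pairs = [
    (0, 1, [0.2], [0.3], 1),
    (0, 2, [0.2], [0.1], 0),
    (1, 2, [0.3], [0.1], 1),
    (1, 3, [0.3], [0.4], 1),
]
accuracy, flagged = reevaluate(toy_pairs, COEFFS, INTERCEPT)
print(accuracy, flagged)
```

Pairs returned in `flagged` are the candidates for connecting or disconnecting during re-evaluation. As a consistency check on the reported counts, 332,326 pairs over the six models gives about 55,387.7 pairs per model, which rounds to the stated average of 55,388.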

