Kastrin Andrej, Rindflesch Thomas C, Hristovski Dimitar
Faculty of Information Studies, Novo mesto, Slovenia.
Lister Hill National Center for Biomedical Communications, National Library of Medicine, Bethesda, Maryland, United States of America.
PLoS One. 2014 Jul 9;9(7):e102188. doi: 10.1371/journal.pone.0102188. eCollection 2014.
Concept associations can be represented by a network that consists of a set of nodes representing concepts and a set of edges representing their relationships. Complex networks exhibit some common topological features, including small diameter, a high degree of clustering, power-law degree distribution, and modularity. We investigated the topological properties of a network constructed from co-occurrences between MeSH descriptors in the MEDLINE database. We conducted the analysis on two networks, one constructed from all MeSH descriptors and another using only major descriptors. Network reduction was performed using Pearson's chi-square test for independence. To characterize the topological properties of the networks, we used diameter, average path length, clustering coefficient, and degree distribution. For the full MeSH network, the average path length was 1.95 edges, with a diameter of three edges and a clustering coefficient of 0.26. The Kolmogorov-Smirnov test rejected the power law as a plausible model for the degree distribution. For the major MeSH network, the average path length was 2.63 edges, with a diameter of seven edges and a clustering coefficient of 0.15. The Kolmogorov-Smirnov test failed to reject the power law as a plausible model; the power-law exponent was 5.07. In both networks, nodes with a lower degree exhibited higher clustering than those with a higher degree. After a simulated attack in which we removed the 10% of nodes with the highest degrees, the giant component of each of the two networks still contained about 90% of all nodes. Because of its small average path length and high degree of clustering, the MeSH network is a small-world network. A power-law distribution is not a plausible model for the degree distribution. The network is highly modular, is highly resistant to both targeted and random attack, and shows minimal disassortativity.
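The measures named in the abstract (average path length, diameter, clustering coefficient, and the size of the giant component after a targeted attack) can be illustrated with a minimal sketch on a toy undirected graph. This is not the authors' implementation: the function names, the four-node toy graph, and the choice to report the giant component as a fraction of the nodes remaining after removal are all assumptions made for illustration, and the code uses only the Python standard library.

```python
from collections import deque
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from edge pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_distances(adj, source):
    """Shortest-path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def path_stats(adj):
    """Return (average shortest-path length, diameter) over connected node pairs."""
    total, pairs, diameter = 0, 0, 0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                total += d
                pairs += 1
                diameter = max(diameter, d)
    return (total / pairs if pairs else 0.0), diameter


def local_clustering(adj, u):
    """Fraction of pairs of u's neighbors that are themselves connected."""
    nbrs = adj[u]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, u) for u in adj) / len(adj)


def remove_top_degree(adj, fraction):
    """Targeted attack: drop the given fraction of nodes with the highest degree."""
    n_remove = int(len(adj) * fraction)
    ranked = sorted(adj, key=lambda u: len(adj[u]), reverse=True)
    targets = set(ranked[:n_remove])
    return {u: {v for v in nbrs if v not in targets}
            for u, nbrs in adj.items() if u not in targets}


def giant_component_fraction(adj):
    """Size of the largest connected component divided by the number of nodes."""
    seen, best = set(), 0
    for s in adj:
        if s not in seen:
            component = bfs_distances(adj, s)
            seen.update(component)
            best = max(best, len(component))
    return best / len(adj) if adj else 0.0


if __name__ == "__main__":
    # Hypothetical toy graph: a triangle A-B-C with a pendant node D attached to C.
    toy = build_graph([("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")])
    print("average path length, diameter:", path_stats(toy))
    print("average clustering:", average_clustering(toy))
    attacked = remove_top_degree(toy, 0.25)
    print("giant component fraction after attack:", giant_component_fraction(attacked))
```

The all-pairs breadth-first search is quadratic in the number of nodes, so it only suits small graphs; for a network at the scale of MeSH co-occurrences, a dedicated graph library would be the more practical choice.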