Zwierzyna Magdalena, Vogt Martin, Maggiora Gerald M, Bajorath Jürgen
B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Department of Life Science Informatics, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstr. 2, 53113, Bonn, Germany.
J Comput Aided Mol Des. 2015 Feb;29(2):113-25. doi: 10.1007/s10822-014-9821-4. Epub 2014 Dec 3.
Chemical Space Networks (CSNs) are generated for different compound data sets on the basis of pairwise similarity relationships. Such networks are thought to complement and further extend traditional coordinate-based views of chemical space. Our proof-of-concept study focuses on CSNs based upon fingerprint similarity relationships calculated using the conventional Tanimoto similarity metric. The resulting CSNs are characterized with statistical measures from network science and compared in different ways. We show that the homophily principle, which is widely considered in the context of social networks, is a major determinant of the topology of CSNs of bioactive compounds, designed as threshold networks, typically giving rise to community structures. Many properties of CSNs are influenced by numerical features of the conventional Tanimoto similarity metric and largely dominated by the edge density of the networks, which depends on chosen similarity threshold values. However, properties of different CSNs with constant edge density can be directly compared, revealing systematic differences between CSNs generated from randomly collected or bioactive compounds.
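The threshold-network construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy fingerprints, the function names, the representation of fingerprints as sets of on-bit indices, and the choice of an inclusive (>=) threshold are all assumptions made for illustration.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def threshold_network(fingerprints, threshold):
    """Return edges (i, j) for all compound pairs with similarity >= threshold.

    The inclusive comparison is an assumption; the abstract does not specify it.
    """
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(fingerprints), 2)
        if tanimoto(a, b) >= threshold
    ]


def edge_density(n_nodes, edges):
    """Fraction of all possible undirected edges that are present."""
    possible = n_nodes * (n_nodes - 1) // 2
    return len(edges) / possible if possible else 0.0


# Hypothetical toy fingerprints (sets of on-bit indices), not real compounds.
fps = [{1, 2, 3, 4}, {1, 2, 3, 5}, {1, 2, 6, 7}, {8, 9, 10}]

# Lowering the threshold adds edges, so edge density rises with it.
for t in (0.5, 0.3):
    edges = threshold_network(fps, t)
    print(t, edges, edge_density(len(fps), edges))
```

Because edge density changes with the threshold, comparing two networks at the same threshold can confound topology with density. One way to compare at constant edge density, under the same assumptions, is to choose a separate threshold per data set so that the resulting densities match.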