Velkoborsky Jakub, Hoksza David
Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic.
J Cheminform. 2016 Dec 29;8:74. doi: 10.1186/s13321-016-0186-7. eCollection 2016.
Visualization of large molecular data sets is a challenging yet important task in diverse fields of chemistry, ranging from material engineering to drug design. In drug design especially, modern high-throughput screening methods generate large amounts of molecular data that call for methods enabling their analysis. One such method is the classification of compounds by their molecular scaffolds, a concept widely used by medicinal chemists to group molecules with similar properties. This classification can then be used for intuitive visualization of compounds.
In this paper, we propose a scaffold hierarchy derived from a large-scale analysis of the PubChem Compound database. The analysis not only provides insights into the scaffold diversity of the PubChem Compound database, but also enables scaffold-based hierarchical visualization of user compound data sets against the background of empirical chemical space, as defined by the PubChem data, or against the background of any other user-defined data set. The visualization is performed by a web-based client-server application called Scaffvis. It provides an interactive, zoomable tree map visualization of data sets of up to hundreds of thousands of molecules. Scaffvis is free to use and its source code has been published under an open source license.
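The idea of a scaffold-based hierarchical tree map can be illustrated with a minimal Python sketch. It is not Scaffvis code: the names `ScaffoldNode`, `build_tree`, and `tile_fractions` are hypothetical, and the scaffold labels at each level are invented placeholders. The sketch assumes that every molecule has already been assigned a path of scaffolds from the most general level to the most specific one; it then counts molecules per node and computes the relative area each child tile would occupy when the user zooms into a node.

```python
class ScaffoldNode:
    """One node of a (hypothetical) scaffold hierarchy."""

    def __init__(self, label):
        self.label = label      # scaffold label at this level
        self.count = 0          # number of molecules under this node
        self.children = {}      # child label -> ScaffoldNode


def build_tree(paths):
    """Build a counted tree from per-molecule scaffold paths.

    Each path lists scaffold labels from the most general level
    to the most specific one (assumed to be precomputed).
    """
    root = ScaffoldNode("root")
    for path in paths:
        node = root
        node.count += 1
        for label in path:
            if label not in node.children:
                node.children[label] = ScaffoldNode(label)
            node = node.children[label]
            node.count += 1
    return root


def tile_fractions(node):
    """Relative tree map area of each child, proportional to molecule count."""
    if node.count == 0:
        return {}
    return {label: child.count / node.count
            for label, child in node.children.items()}


if __name__ == "__main__":
    # Placeholder labels; a real hierarchy would define its own levels.
    example_paths = [
        ("1 ring", "benzene"),
        ("1 ring", "pyridine"),
        ("1 ring", "benzene"),
        ("2 rings", "naphthalene"),
    ]
    tree = build_tree(example_paths)
    print(tile_fractions(tree))                      # top-level tiles
    print(tile_fractions(tree.children["1 ring"]))   # after zooming in
```

In a setup like the one described, the background data set (for example PubChem) and the user data set could each be counted this way, so that tile area reflects one set and tile color the other; that pairing is an assumption of this sketch rather than a description of the published application.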