Fast 3D shape screening of large chemical databases through alignment-recycling.

Author information

Fontaine Fabien, Bolton Evan, Borodina Yulia, Bryant Stephen H

Affiliations

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, Bethesda, MD 20894, USA.

Publication information

Chem Cent J. 2007 Jun 6;1:12. doi: 10.1186/1752-153X-1-12.

DOI: 10.1186/1752-153X-1-12

PMID: 17880744

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC1994057/

Abstract

BACKGROUND

Large chemical databases require fast, efficient, and simple ways of looking for similar structures. Although such tasks are now fairly well resolved for graph-based similarity queries, they remain an issue for 3D approaches, particularly for those based on 3D shape overlays. Inspired by a recent technique developed to compare molecular shapes, we designed a hybrid methodology, alignment-recycling, that enables efficient retrieval and alignment of structures with similar 3D shapes.

RESULTS

Using a dataset of more than one million PubChem compounds of limited size (< 28 heavy atoms) and flexibility (< 6 rotatable bonds), we obtained a set of a few thousand diverse structures that entirely covers the 3D shape space of the conformers in the dataset. Transformation matrices gathered from the overlays between these diverse structures and the 3D conformer dataset allowed us to reduce the CPU time required for shape overlay drastically (100-fold). The alignment-recycling heuristic produces results consistent with de novo alignment calculation, with better than 80% hit list overlap on average.
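The idea of a small set of diverse structures that covers the shape space can be illustrated with a greedy "leader" selection. This is only a minimal sketch under stated assumptions, not the authors' procedure: the function names, the threshold value, and the toy similarity (a Tanimoto over occupied grid cells, standing in for an overlay-based shape similarity) are all hypothetical.

```python
def tanimoto(a, b):
    """Toy shape similarity: Tanimoto over occupied grid cells.

    Assumption: a stand-in for a real overlay-based shape score.
    """
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def select_reference_shapes(conformers, threshold):
    """Greedy leader selection (hypothetical helper).

    A conformer becomes a new reference only if no reference chosen so far
    is at least `threshold` similar to it, so every conformer ends up
    within `threshold` of some reference.
    """
    references = []
    for shape in conformers:
        if all(tanimoto(shape, ref) < threshold for ref in references):
            references.append(shape)
    return references


# Three toy "conformers" described by occupied grid cells.
conformers = [
    frozenset({(0, 0, 0), (1, 0, 0), (2, 0, 0)}),
    frozenset({(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)}),
    frozenset({(0, 0, 0), (0, 1, 0), (0, 2, 0)}),
]
refs = select_reference_shapes(conformers, threshold=0.7)
covered = all(
    max(tanimoto(shape, ref) for ref in refs) >= 0.7 for shape in conformers
)
```

Here the second conformer is similar enough to the first to be represented by it, while the third becomes a new reference; the coverage check mirrors the requirement that the reference set spans the whole dataset.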

CONCLUSION

Overlay-based 3D methods are computationally demanding when searching large databases. Alignment-recycling reduces the CPU time to perform shape similarity searches by breaking the alignment problem into three steps: selection of diverse shapes to describe the database shape-space; overlay of the database conformers to the diverse shapes; and non-optimized overlay of query and database conformers using common reference shapes. The precomputation, required by the first two steps, is a significant cost of the method; however, once performed, querying is two orders of magnitude faster. Extensions and variations of this methodology, for example to handle larger and more flexible small molecules, are discussed.
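The third step, reusing stored transformations through a common reference shape, can be sketched as a composition of rigid transforms. This is a hedged illustration rather than the published implementation: it assumes each stored overlay is a rotation plus a translation that maps a conformer onto the reference, and all function names are hypothetical.

```python
import math


def rigid_transform(angle_deg_z, translation):
    """Return (R, t): rotation about the z axis followed by a translation."""
    a = math.radians(angle_deg_z)
    c, s = math.cos(a), math.sin(a)
    rotation = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    return rotation, list(translation)


def apply(transform, points):
    """Apply x -> R x + t to each 3D point."""
    rotation, t = transform
    return [
        tuple(sum(rotation[i][k] * p[k] for k in range(3)) + t[i] for i in range(3))
        for p in points
    ]


def invert(transform):
    """Inverse of a rigid transform: x -> R^T (x - t)."""
    rotation, t = transform
    r_inv = [[rotation[j][i] for j in range(3)] for i in range(3)]
    t_inv = [-sum(r_inv[i][k] * t[k] for k in range(3)) for i in range(3)]
    return r_inv, t_inv


def compose(outer, inner):
    """Transform equivalent to applying `inner` first, then `outer`."""
    r_o, t_o = outer
    r_i, t_i = inner
    rotation = [
        [sum(r_o[i][k] * r_i[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]
    t = [sum(r_o[i][k] * t_i[k] for k in range(3)) + t_o[i] for i in range(3)]
    return rotation, t


def recycled_overlay(query_to_ref, db_to_ref):
    """Place a database conformer in the query's frame via the shared reference,
    without running a new overlay optimization."""
    return compose(invert(query_to_ref), db_to_ref)


# Idealized demo: query and database conformer are displaced copies of the reference.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.0)]
query_to_ref = rigid_transform(30.0, (1.0, -2.0, 0.5))
db_to_ref = rigid_transform(-75.0, (0.3, 4.0, -1.0))
query = apply(invert(query_to_ref), reference)
db_conformer = apply(invert(db_to_ref), reference)

overlaid = apply(recycled_overlay(query_to_ref, db_to_ref), db_conformer)
max_error = max(abs(a - b) for p, q in zip(overlaid, query) for a, b in zip(p, q))
```

In this idealized case the recycled transform reproduces the query coordinates to within floating-point error. With real molecules the query and the database conformer are only similar to the reference, so the recycled pose is approximate, which is consistent with the abstract describing this overlay as non-optimized.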
