High-Quality Dataset of Protein-Bound Ligand Conformations and Its Application to Benchmarking Conformer Ensemble Generators.

Affiliation

University of Hamburg, ZBH - Center for Bioinformatics, Bundesstraße 43, Hamburg 20146, Germany.

Publication information

J Chem Inf Model. 2017 Mar 27;57(3):529-539. doi: 10.1021/acs.jcim.6b00613. Epub 2017 Feb 16.

Abstract

We developed a cheminformatics pipeline for the fully automated selection and extraction of high-quality protein-bound ligand conformations from X-ray structural data. The pipeline evaluates the validity and accuracy of the 3D structures of small molecules according to multiple criteria, including their fit to the electron density and their physicochemical and structural properties. Using this approach, we compiled two high-quality datasets from the Protein Data Bank (PDB): a comprehensive dataset and a diversified subset of 4626 and 2912 structures, respectively. The datasets were applied to benchmarking seven freely available conformer ensemble generators: Balloon (two different algorithms), the RDKit standard conformer ensemble generator, the Experimental-Torsion basic Knowledge Distance Geometry (ETKDG) algorithm, Confab, Frog2 and Multiconf-DOCK. Substantial differences in the performance of the individual algorithms were observed, with RDKit and ETKDG generally achieving a favorable balance of accuracy, ensemble size and runtime. The Platinum datasets are available for download from http://www.zbh.uni-hamburg.de/platinum_dataset .
