Dataset for quantum-mechanical exploration of conformers and solvent effects in large drug-like molecules.

Affiliations

Department of Physics and Materials Science, University of Luxembourg, L-1511, Luxembourg City, Luxembourg.

Institute for Materials Science and Max Bergmann Center of Biomaterials, TU Dresden, 01062, Dresden, Germany.

Publication information

Sci Data. 2024 Jul 7;11(1):742. doi: 10.1038/s41597-024-03521-8.

Abstract

We introduce the Aquamarine (AQM) dataset, an extensive quantum-mechanical (QM) dataset that contains structural and electronic information for 59,783 low- and high-energy conformers of 1,653 molecules, with total atom counts ranging from 2 to 92 (mean: 50.9) and up to 54 (mean: 28.2) non-hydrogen atoms. To gain insight into solvent effects and collective dispersion interactions in drug-like molecules, we performed QM calculations of structures and properties in the gas phase and in implicit water, supplemented with a treatment of many-body dispersion (MBD) interactions. AQM thus contains over 40 global and local physicochemical properties (including ground-state and response properties) per conformer, computed at the tightly converged PBE0+MBD level of theory for gas-phase molecules and at PBE0+MBD with the modified Poisson-Boltzmann (MPB) model of water for solvated molecules. By addressing both molecule-solvent and dispersion interactions, the AQM dataset can serve as a challenging benchmark for state-of-the-art machine learning methods for property modeling and de novo generation of large (solvated) molecules with pharmaceutical and biological relevance.
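As a quick sanity check on the scale quoted in the abstract, the short Python sketch below derives two secondary figures from the stated numbers: the average number of conformers per molecule and the implied mean number of hydrogen atoms. The function and constant names are illustrative only and are not part of any AQM tooling. The hydrogen estimate assumes that both reported means were taken over the same set of structures, which the abstract does not state explicitly.

```python
# Figures quoted in the abstract (the only inputs used here).
N_CONFORMERS = 59_783      # low- and high-energy conformers
N_MOLECULES = 1_653        # unique molecules
MEAN_ATOMS = 50.9          # mean total atom count
MEAN_HEAVY_ATOMS = 28.2    # mean non-hydrogen atom count


def conformers_per_molecule(n_conformers: int, n_molecules: int) -> float:
    """Average number of conformers sampled per molecule."""
    return n_conformers / n_molecules


def mean_hydrogen_atoms(mean_atoms: float, mean_heavy_atoms: float) -> float:
    """Implied mean hydrogen count.

    Assumption: both means are computed over the same population,
    so the difference of the means equals the mean of the differences.
    """
    return mean_atoms - mean_heavy_atoms


if __name__ == "__main__":
    ratio = conformers_per_molecule(N_CONFORMERS, N_MOLECULES)
    hydrogens = mean_hydrogen_atoms(MEAN_ATOMS, MEAN_HEAVY_ATOMS)
    print(f"Conformers per molecule (average): {ratio:.1f}")
    print(f"Implied mean hydrogen atoms: {hydrogens:.1f}")
```

Under these assumptions, the dataset averages roughly 36 conformers per molecule, and hydrogens make up a little under half of the atoms in a typical structure.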

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7adb/11228031/650b3943f6e1/41597_2024_3521_Fig1_HTML.jpg
