Quantum chemical properties of chlorinated polycyclic aromatic hydrocarbons for delta machine learning.

Author information

Frolov Dmitry, Ibraev Ilya, Sedov Igor

Affiliations

Sirius University of Science and Technology, 1 Olympic Ave, 354340, Sirius, Russia.

Chemical Institute, Kazan Federal University, Kremlevskaya 18, 420008, Kazan, Russia.

Publication information

Sci Data. 2025 Jun 21;12(1):1059. doi: 10.1038/s41597-025-05383-0.

Abstract

Promising Δ-machine learning approaches aim to correct molecular property values calculated with computationally inexpensive methods to the accuracy of more precise methods. Training such models requires data obtained at different levels of quantum chemical theory. While several large and chemically diverse datasets have been published, studies in many areas require specialized datasets of structurally related molecules. Chlorinated polycyclic aromatic hydrocarbons (Cl-PAHs), the products of incomplete combustion of organic substances and materials, are hazardous pollutants with carcinogenic and mutagenic activity. Quantum chemical methods are important for understanding their formation mechanisms and properties. We describe a dataset, PACHQA, containing geometries, physical properties, wavefunctions, and electron densities for 3551 molecules: 3417 Cl-PAHs with up to 6 rings and varying numbers of chlorine atoms, and 134 parent hydrocarbons. Most of these molecules were not included in any previously published quantum chemical dataset. The calculations were performed at three different levels of theory. The dataset may be useful for developing and validating computational, machine learning, or experimental approaches, and for studying the structure-property relationships of Cl-PAHs.
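The Δ-learning idea described in the abstract can be illustrated with a minimal sketch: a model is trained on the difference between a high-level and a low-level value of a property, and its prediction is added back to the low-level value. The example below is an assumption-laden illustration, not the authors' method and not based on PACHQA data. The three-column descriptor matrix, the synthetic "low-level" and "high-level" values, and the function names `fit_delta_model` and `predict_high` are all hypothetical, and a plain ridge-regularized linear fit stands in for whatever model a real study would use.

```python
import numpy as np


def add_bias(X):
    """Append a column of ones so the linear model has an intercept."""
    return np.hstack([X, np.ones((X.shape[0], 1))])


def fit_delta_model(X, y_low, y_high, ridge=1e-6):
    """Fit a linear model to the correction (y_high - y_low).

    Solves the ridge-regularized normal equations
    (A^T A + ridge * I) w = A^T delta.
    """
    A = add_bias(X)
    delta = y_high - y_low
    gram = A.T @ A + ridge * np.eye(A.shape[1])
    return np.linalg.solve(gram, A.T @ delta)


def predict_high(X, y_low, w):
    """Estimate the high-level value as low-level value plus learned correction."""
    return y_low + add_bias(X) @ w


# Synthetic stand-in data (hypothetical, NOT taken from PACHQA).
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 3))                      # hypothetical molecular descriptors
y_low = rng.normal(scale=10.0, size=n)           # stand-in for a cheap-method property
true_correction = X @ np.array([0.5, -0.2, 0.1]) + 0.3
y_high = y_low + true_correction + rng.normal(scale=0.01, size=n)

# Train on the first 150 samples, evaluate on the remaining 50.
train, test = slice(0, 150), slice(150, n)
w = fit_delta_model(X[train], y_low[train], y_high[train])
y_pred = predict_high(X[test], y_low[test], w)

mae_low = np.mean(np.abs(y_high[test] - y_low[test]))   # error of the uncorrected values
mae_delta = np.mean(np.abs(y_high[test] - y_pred))      # error after the Δ-correction
print(f"MAE uncorrected: {mae_low:.4f}, MAE with delta model: {mae_delta:.4f}")
```

Because the model only has to learn the correction rather than the full property, the target is typically smaller in magnitude and smoother than the property itself. This is why such training requires the same molecules to be computed at more than one level of theory.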

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4813/12182570/b23c808944a2/41597_2025_5383_Fig1_HTML.jpg
