PubChemQC B3LYP/6-31G*//PM6 Data Set: The Electronic Structures of 86 Million Molecules Using B3LYP/6-31G* Calculations.

Affiliations

RIKEN Cluster for Pioneering Research, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan.

Software Technology and Artificial Intelligence Research Laboratory, Chiba Institute of Technology, 2-17-1 Tsudanuma, Narashino, Chiba 275-0016, Japan.

Publication information

J Chem Inf Model. 2023 Sep 25;63(18):5734-5754. doi: 10.1021/acs.jcim.3c00899. Epub 2023 Sep 7.

Abstract

The presented "PubChemQC B3LYP/6-31G*//PM6" data set is composed of the electronic properties of 85,938,443 molecules, encompassing a broad spectrum of molecules from essential compounds to biomolecules with a molecular weight up to 1000. These molecules account for 94.0% of the original PubChem Compound catalog as of August 29, 2016. The electronic properties, including orbitals, orbital energies, total energies, dipole moments, and other pertinent properties, were computed by using the B3LYP/6-31G* and PM6 methods. The data set, available in three formats, namely, GAMESS quantum chemistry program files, selected JSON output files, and a PostgreSQL database, provides researchers with the ability to query molecular properties. It is further subdivided into five subdata sets for more specific data. The first two subsets encompass molecules with carbon, hydrogen, oxygen, and nitrogen with molecular weights under 300 and 500, respectively. The third and fourth subsets incorporate molecules with carbon, hydrogen, nitrogen, oxygen, phosphorus, sulfur, fluorine, and chlorine, with molecular weights under 300 and 500, respectively. The fifth subset comprises molecules with carbon, hydrogen, nitrogen, oxygen, phosphorus, sulfur, fluorine, chlorine, sodium, potassium, magnesium, and calcium, with a molecular weight of under 500. The coefficients of determination for the highest occupied molecular orbital-lowest unoccupied molecular orbital energy gap range from 0.892 (for CHON500) to 0.803 (for the whole data set). These comprehensive results pave the way for applications in drug discovery and materials science, among others. The data sets can be accessed under the Creative Commons Attribution 4.0 International license at the following web address: https://nakatamaho.riken.jp/pubchemqc.riken.jp/b3lyp_pm6_datasets.html.
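
The subset criteria described in the abstract can be sketched as a simple membership check. This is a minimal illustration, not the authors' selection code. It assumes that a molecule qualifies when every element it contains is in the subset's element list and its molecular weight is strictly below the limit. Only the label CHON500 appears in the abstract; the other labels, the `SUBSETS` table, and the `matching_subsets` function are hypothetical names chosen for this example.

```python
# Illustrative sketch of the five subsets described in the abstract.
# Assumption: membership means "all elements are in the allowed set"
# and "molecular weight is strictly below the limit".
# Only the label "CHON500" appears in the abstract; the rest are placeholders.

CHON = frozenset({"C", "H", "O", "N"})
CHNOPSFCL = CHON | {"P", "S", "F", "Cl"}
WITH_METALS = CHNOPSFCL | {"Na", "K", "Mg", "Ca"}

# Label -> (allowed elements, exclusive molecular-weight limit)
SUBSETS = {
    "CHON300": (CHON, 300.0),
    "CHON500": (CHON, 500.0),
    "subset3_mw300": (CHNOPSFCL, 300.0),
    "subset4_mw500": (CHNOPSFCL, 500.0),
    "subset5_mw500": (WITH_METALS, 500.0),
}


def matching_subsets(elements, molecular_weight):
    """Return the labels of every subset whose criteria the molecule meets."""
    present = set(elements)
    return [
        label
        for label, (allowed, limit) in SUBSETS.items()
        if present <= allowed and molecular_weight < limit
    ]


if __name__ == "__main__":
    # A C/H/O-only molecule with a molecular weight of about 180
    print(matching_subsets({"C", "H", "O"}, 180.16))
    # A chlorine-containing molecule with a molecular weight of 350
    print(matching_subsets({"C", "H", "Cl"}, 350.0))
```

Because the element lists are nested, a small C/H/O/N molecule falls into all five subsets under these assumptions, while a molecule containing any element outside the fifth list falls into none of them.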
