
A unified approach to inferring chemical compounds with the desired aqueous solubility.

Authors

Batool Muniba, Azam Naveed Ahmed, Zhu Jianshen, Haraguchi Kazuya, Zhao Liang, Akutsu Tatsuya

Affiliations

Discrete Mathematics and Computational Intelligence Laboratory, Department of Mathematics, Quaid-i-Azam University, Islamabad, Pakistan.

Discrete Mathematics Laboratory, Department of Applied Mathematics and Physics, Graduate School of Informatics, Kyoto University, 606-8501, Kyoto, Japan.

Publication information

J Cheminform. 2025 Mar 26;17(1):37. doi: 10.1186/s13321-025-00966-w.

DOI: 10.1186/s13321-025-00966-w
PMID: 40140978
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11938699/
Abstract

Aqueous solubility (AS) is a key physicochemical property that plays a crucial role in drug discovery and material design. We report a novel unified approach to predict and infer chemical compounds with the desired AS based on simple deterministic graph-theoretic descriptors, multiple linear regression (MLR), and mixed integer linear programming (MILP). Descriptors selected by a forward stepwise procedure enabled the simplest regression model, MLR, to achieve good prediction accuracy relative to existing approaches, with accuracy in the range [0.7191, 0.9377] across 29 diverse datasets. By simulating these descriptors and learned models as MILPs, we inferred mathematically exact and optimal compounds with the desired AS, prescribed structures, and up to 50 non-hydrogen atoms in a reasonable time, ranging from 6 to 1166 seconds. These findings indicate a strong correlation between the simple graph-theoretic descriptors and the AS of compounds, potentially leading to a deeper understanding of AS without relying on the widely used complicated chemical descriptors and complex machine learning models, which are computationally expensive and therefore difficult to use for inference. An implementation of the proposed approach is available at https://github.com/ku-dml/mol-infer/tree/master/AqSol.
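The descriptor-selection step named in the abstract, a forward stepwise procedure feeding an MLR model, can be illustrated with a minimal sketch. This is not the authors' implementation (that lives in the repository linked above). The toy descriptor table, the function names, and the use of the coefficient of determination as the selection score are all assumptions made for illustration; the abstract only says "accuracy".

```python
# Illustrative sketch only: forward stepwise descriptor selection for MLR.
# All names and data here are hypothetical; R^2 as the score is an assumption.

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_mlr(rows, y, cols):
    """Least-squares fit of y on the chosen columns plus an intercept.

    Returns [intercept, w_1, ..., w_k] from the normal equations.
    Assumes the chosen columns are linearly independent.
    """
    design = [[1.0] + [row[c] for c in cols] for row in rows]
    k = len(cols) + 1
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    return solve_linear(xtx, xty)

def r_squared(rows, y, cols, coef):
    """Coefficient of determination of the fitted model on (rows, y)."""
    pred = [coef[0] + sum(w * row[c] for w, c in zip(coef[1:], cols)) for row in rows]
    mean = sum(y) / len(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, pred))
    ss_tot = sum((t - mean) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot

def forward_stepwise(rows, y, max_features):
    """Greedily add the descriptor that most improves R^2; stop when none helps."""
    selected, best = [], float("-inf")
    remaining = list(range(len(rows[0])))
    while remaining and len(selected) < max_features:
        scored = []
        for c in remaining:
            trial = selected + [c]
            scored.append((r_squared(rows, y, trial, fit_mlr(rows, y, trial)), c))
        score, col = max(scored)
        if score <= best + 1e-9:  # no meaningful improvement
            break
        selected.append(col)
        remaining.remove(col)
        best = score
    return selected, best

# Made-up descriptor table: 6 compounds x 3 integer descriptors.
rows = [[1, 4, 0], [2, 1, 1], [3, 3, 0], [4, 0, 2], [5, 2, 1], [6, 5, 3]]
# Made-up target generated as 2*x0 - x2 + 0.5, so descriptor 1 is irrelevant.
y = [2.5, 3.5, 6.5, 6.5, 9.5, 9.5]

selected, score = forward_stepwise(rows, y, max_features=3)
print(selected, round(score, 4))
```

A practical version would score each candidate on held-out data (for example by cross-validation) rather than on the training set, since training R^2 never decreases when a descriptor is added. Keeping the final model linear is what makes it straightforward to encode as MILP constraints for the inference stage.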


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9bda/11938699/44affa9d154f/13321_2025_966_Figa_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9bda/11938699/aaccf85601ca/13321_2025_966_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9bda/11938699/0b3fbbb289ee/13321_2025_966_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9bda/11938699/0a19f267adc8/13321_2025_966_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9bda/11938699/674b753e05f0/13321_2025_966_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9bda/11938699/0d1d7ddb65ed/13321_2025_966_Fig5_HTML.jpg

Similar articles

1
Quadratic descriptors and reduction methods in a two-layered model for compound inference.
Front Genet. 2025 Jan 29;15:1483490. doi: 10.3389/fgene.2024.1483490. eCollection 2024.
2
An Inverse QSAR Method Based on Linear Regression and Integer Programming.
Front Biosci (Landmark Ed). 2022 Jun 10;27(6):188. doi: 10.31083/j.fbl2706188.
3
A novel method for inference of acyclic chemical compounds with bounded branch-height based on artificial neural networks and integer programming.
Algorithms Mol Biol. 2021 Aug 14;16(1):18. doi: 10.1186/s13015-021-00197-2.
4
Prediction and application in QSPR of aqueous solubility of sulfur-containing aromatic esters using GA-based MLR with quantum descriptors.
Water Res. 2002 Jul;36(12):2975-82. doi: 10.1016/s0043-1354(01)00532-2.
5
QSPR prediction of aqueous solubility of drug-like organic compounds.
Chem Pharm Bull (Tokyo). 2007 Apr;55(4):669-74. doi: 10.1248/cpb.55.669.
6
A Novel Method for Inferring Chemical Compounds With Prescribed Topological Substructures Based on Integer Programming.
IEEE/ACM Trans Comput Biol Bioinform. 2022 Nov-Dec;19(6):3233-3245. doi: 10.1109/TCBB.2021.3112598. Epub 2022 Dec 8.
7
Genetic Algorithm and Self-Organizing Maps for QSPR Study of Some N-aryl Derivatives as Butyrylcholinesterase Inhibitors.
Curr Drug Discov Technol. 2016;13(4):232-253. doi: 10.2174/1570163813666160725114241.
8
A Method for Inferring Polymers Based on Linear Regression and Integer Programming.
IEEE/ACM Trans Comput Biol Bioinform. 2024 Nov-Dec;21(6):1623-1632. doi: 10.1109/TCBB.2024.3447780. Epub 2024 Dec 10.
9
Comparative Analysis of Chemical Descriptors by Machine Learning Reveals Atomistic Insights into Solute-Lipid Interactions.
Mol Pharm. 2024 Jul 1;21(7):3343-3355. doi: 10.1021/acs.molpharmaceut.4c00080. Epub 2024 May 23.
