Filling the Gap in LogP and pKa Evaluation for Saturated Fluorine-Containing Derivatives With Machine Learning.

Author information

Gurbych Oleksandr, Pavliuk Petro, Krasnienkov Dmytro, Liashuk Oleksandr, Melnykov Kostiantyn, Grygorenko Oleksandr O

Affiliations

Blackthorn AI Ltd., London, UK.

Department of Artificial Intelligence, Lviv Polytechnic National University, Lviv, Ukraine.

Publication information

J Comput Chem. 2025 Jan 15;46(2):e70002. doi: 10.1002/jcc.70002.

Abstract

Lipophilicity and acidity/basicity are fundamental physical properties that profoundly affect a compound's pharmacological activity, bioavailability, metabolism, and toxicity. Predicting lipophilicity, measured by LogP (the logarithm of the 1-octanol-water distribution coefficient), and acidity/basicity, measured by pKa (the negative logarithm of the acid ionization constant), is essential for early drug discovery success. However, the limited availability of experimental data and the poor accuracy of standard LogP and pKa assessment methods for saturated fluorine-containing derivatives pose a significant challenge to achieving satisfactory results for this compound class. To overcome this challenge, we compiled a unique dataset of saturated fluorinated and corresponding non-fluorinated derivatives with experimental LogP and pKa values. Aiming to create an optimal approach to acidity/basicity and lipophilicity prediction, we evaluated, trained from scratch, or fine-tuned more than 40 machine learning models, including linear, tree-based, and neural network models. The study was supplemented with a substructure mask explanation (SME), which confirmed the critical role of the fluorinated substituents in both physicochemical properties studied and supported the consistency of the developed models. The results were open-sourced as a GitHub repository, pip and conda packages, and a KNIME node, allowing the public to perform targeted molecular design of the proposed class of compounds.
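
The two quantities defined in the abstract can be written out as a short Python sketch. This only restates the definitions (a negative base-10 logarithm of the ionization constant, and a base-10 logarithm of the 1-octanol/water concentration ratio). The function names and sample inputs are illustrative assumptions and are not taken from the authors' released packages or dataset.

```python
import math


def pka_from_ka(ka: float) -> float:
    """Return pKa, the negative base-10 logarithm of the acid ionization constant Ka."""
    if ka <= 0:
        raise ValueError("Ka must be positive")
    return -math.log10(ka)


def logp_from_concentrations(c_octanol: float, c_water: float) -> float:
    """Return the base-10 logarithm of the 1-octanol/water concentration ratio.

    Both concentrations must be expressed in the same units.
    """
    if c_octanol <= 0 or c_water <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(c_octanol / c_water)


# Hypothetical inputs for illustration only, not measurements from the paper.
example_pka = pka_from_ka(1.0e-5)
example_logp = logp_from_concentrations(100.0, 1.0)
print(f"pKa = {example_pka:.2f}, LogP = {example_logp:.2f}")
```

A larger pKa corresponds to a weaker acid, and a larger LogP to a more lipophilic compound; the models described in the abstract predict these values from structure rather than computing them from measured concentrations.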
