Filling the Gap in LogP and pKa Evaluation for Saturated Fluorine-Containing Derivatives With Machine Learning.

Author information

Gurbych Oleksandr, Pavliuk Petro, Krasnienkov Dmytro, Liashuk Oleksandr, Melnykov Kostiantyn, Grygorenko Oleksandr O

Affiliations

Blackthorn AI Ltd., London, UK.

Department of Artificial Intelligence, Lviv Polytechnic National University, Lviv, Ukraine.

Publication information

J Comput Chem. 2025 Jan 15;46(2):e70002. doi: 10.1002/jcc.70002.

DOI:10.1002/jcc.70002

PMID:39803824

Abstract

Lipophilicity and acidity/basicity are fundamental physical properties that profoundly affect a compound's pharmacological activity, bioavailability, metabolism, and toxicity. Predicting lipophilicity, measured by LogP (1-octanol-water distribution coefficient logarithm), and acidity/basicity, measured by pKa (negative of acid ionization constant logarithm), is essential for early drug discovery success. However, the limited availability of experimental data and the poor accuracy of standard LogP and pKa assessment methods for saturated fluorine-containing derivatives pose a significant challenge to achieving satisfactory results for this compound class. To overcome this challenge, we compiled a unique dataset of saturated fluorinated and corresponding non-fluorinated derivatives with experimental LogP and pKa values. Aiming to create an optimal approach to acidity/basicity and lipophilicity prediction, we evaluated, trained from scratch, or fine-tuned more than 40 machine learning models, including linear, tree-based, and neural networks. The study was supplemented with a substructure mask explanation (SME), which confirmed the critical role of the fluorinated substituents in both physicochemical properties studied and attested to the consistency of the developed models. The results were open-sourced as a GitHub repository, pip and conda packages, and a KNIME node, allowing the public to perform targeted molecular design for the proposed class of compounds.
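
The two quantities named in the abstract can be illustrated with a minimal sketch. It assumes the usual textbook definitions: LogP as the base-10 logarithm of the ratio of a solute's equilibrium concentrations in 1-octanol and in water, and pKa as the negative base-10 logarithm of the acid ionization constant. The function names `log_p` and `pka` are hypothetical and are not taken from the authors' released packages.

```python
import math


def log_p(conc_octanol: float, conc_water: float) -> float:
    """Base-10 log of the octanol/water concentration ratio (assumed definition)."""
    if conc_octanol <= 0 or conc_water <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(conc_octanol / conc_water)


def pka(ka: float) -> float:
    """Negative base-10 log of the acid ionization constant Ka."""
    if ka <= 0:
        raise ValueError("Ka must be positive")
    return -math.log10(ka)


if __name__ == "__main__":
    # Illustrative inputs only, not values from the paper's dataset.
    print(log_p(100.0, 1.0))
    print(pka(1e-5))
```

Both concentrations must be expressed in the same units, since only their ratio enters the logarithm.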


