LogD7.4 prediction enhanced by transferring knowledge from chromatographic retention time, microscopic pKa and logP.

Authors

Wang Yitian, Xiong Jiacheng, Xiao Fu, Zhang Wei, Cheng Kaiyang, Rao Jingxin, Niu Buying, Tong Xiaochu, Qu Ning, Zhang Runze, Wang Dingyan, Chen Kaixian, Li Xutong, Zheng Mingyue

Affiliations

Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China.

University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China.

Publication information

J Cheminform. 2023 Sep 5;15(1):76. doi: 10.1186/s13321-023-00754-4.

DOI:10.1186/s13321-023-00754-4
PMID:37670374
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10478446/
Abstract

Lipophilicity is a fundamental physical property that significantly affects various aspects of drug behavior, including solubility, permeability, metabolism, distribution, protein binding, and toxicity. Accurate prediction of lipophilicity, measured by the logD7.4 value (the distribution coefficient between n-octanol and buffer at physiological pH 7.4), is crucial for successful drug discovery and design. However, the limited availability of data for logD modeling poses a significant challenge to achieving satisfactory generalization capability. To address this challenge, we have developed a novel logD7.4 prediction model called RTlogD, which leverages knowledge from multiple sources. RTlogD combines pre-training on a chromatographic retention time (RT) dataset since the RT is influenced by lipophilicity. Additionally, microscopic pKa values are incorporated as atomic features, providing valuable insights into ionizable sites and ionization capacity. Furthermore, logP is integrated as an auxiliary task within a multitask learning framework. We conducted ablation studies and presented a detailed analysis, showcasing the effectiveness and interpretability of RT, pKa, and logP in the RTlogD model. Notably, our RTlogD model demonstrated superior performance compared to commonly used algorithms and prediction tools. These results underscore the potential of the RTlogD model to improve the accuracy and generalization of logD prediction in drug discovery and design. In summary, the RTlogD model addresses the challenge of limited data availability in logD modeling by leveraging knowledge from RT, microscopic pKa, and logP. Incorporating these factors enhances the predictive capabilities of our model, and it holds promise for real-world applications in drug discovery and design scenarios.
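
The reason pKa and logP carry information about logD7.4 can be illustrated with the standard textbook relation for a monoprotic compound. The sketch below is not the RTlogD model. It assumes that only the neutral species partitions into n-octanol and that a single macroscopic pKa describes the ionizable site. The function name `logd_from_logp` and its parameters are hypothetical, chosen only for illustration.

```python
import math


def logd_from_logp(logp, pka, ph=7.4, kind="acid"):
    """Estimate logD at a given pH from logP and one pKa.

    Assumption (not from the paper): only the neutral form of a
    monoprotic compound enters the octanol phase, so
        acid: logD = logP - log10(1 + 10**(pH - pKa))
        base: logD = logP - log10(1 + 10**(pKa - pH))
    """
    if kind == "acid":
        exponent = ph - pka
    elif kind == "base":
        exponent = pka - ph
    else:
        raise ValueError("kind must be 'acid' or 'base'")
    # The correction term is the log of the ratio of total to neutral species.
    return logp - math.log10(1.0 + 10.0 ** exponent)


# An acid whose pKa equals the pH is half ionized, so logD = logP - log10(2).
half_ionized_acid = logd_from_logp(3.0, pka=7.4, kind="acid")

# A base with pKa two units above the pH is mostly protonated at pH 7.4.
mostly_ionized_base = logd_from_logp(3.0, pka=9.4, kind="base")
```

Under these assumptions logD never exceeds logP, and the gap grows by roughly one log unit for each pH unit of additional ionization. Real compounds with several ionizable sites or partitioning ion pairs deviate from this relation, which is one motivation for learned models.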

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aaa1/10478446/49ee2b111a3c/13321_2023_754_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aaa1/10478446/9ca367f7e74e/13321_2023_754_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aaa1/10478446/ce739b7ccfb8/13321_2023_754_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aaa1/10478446/a2dbf41de2b3/13321_2023_754_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aaa1/10478446/18a28affc8e1/13321_2023_754_Fig5_HTML.jpg

Similar articles

1
LogD7.4 prediction enhanced by transferring knowledge from chromatographic retention time, microscopic pKa and logP.
J Cheminform. 2023 Sep 5;15(1):76. doi: 10.1186/s13321-023-00754-4.
2
Improved GNNs for Log D7.4 Prediction by Transferring Knowledge from Low-Fidelity Data.
J Chem Inf Model. 2023 Apr 24;63(8):2345-2359. doi: 10.1021/acs.jcim.2c01564. Epub 2023 Mar 31.
3
Comparison of logP and logD correction models trained with public and proprietary data sets.
J Comput Aided Mol Des. 2022 Mar;36(3):253-262. doi: 10.1007/s10822-022-00450-9. Epub 2022 Apr 1.
4
Systematic Modeling of log D7.4 Based on Ensemble Machine Learning, Group Contribution, and Matched Molecular Pair Analysis.
J Chem Inf Model. 2020 Jan 27;60(1):63-76. doi: 10.1021/acs.jcim.9b00718. Epub 2020 Jan 10.
5
Application of ALOGPS to predict 1-octanol/water distribution coefficients, logP, and logD, of AstraZeneca in-house database.
J Pharm Sci. 2004 Dec;93(12):3103-10. doi: 10.1002/jps.20217.
6
Determination of reversed-phase high performance liquid chromatography based octanol-water partition coefficients for neutral and ionizable compounds: Methodology evaluation.
J Chromatogr A. 2017 Dec 15;1528:25-34. doi: 10.1016/j.chroma.2017.10.064. Epub 2017 Oct 27.
7
A deep learning approach for the blind logP prediction in SAMPL6 challenge.
J Comput Aided Mol Des. 2020 May;34(5):535-542. doi: 10.1007/s10822-020-00292-3. Epub 2020 Jan 30.
8
Multitask machine learning models for predicting lipophilicity (logP) in the SAMPL7 challenge.
J Comput Aided Mol Des. 2021 Aug;35(8):901-909. doi: 10.1007/s10822-021-00405-6. Epub 2021 Jul 17.
9
ALipSol: An Attention-Driven Mixture-of-Experts Model for Lipophilicity and Solubility Prediction.
J Chem Inf Model. 2022 Dec 12;62(23):5975-5987. doi: 10.1021/acs.jcim.2c01290. Epub 2022 Nov 23.
10
Development of a simple proton nuclear magnetic resonance-based procedure to estimate the approximate distribution coefficient at physiological pH (logD): Evaluation and comparison to existing practices.
Bioorg Med Chem Lett. 2017 Jan 15;27(2):319-322. doi: 10.1016/j.bmcl.2016.11.048. Epub 2016 Nov 18.

Cited by

1
Optimizing blood-brain barrier permeability in KRAS inhibitors: A structure-constrained molecular generation approach.
J Pharm Anal. 2025 Aug;15(8):101337. doi: 10.1016/j.jpha.2025.101337. Epub 2025 May 9.
2
Machine Learning for Toxicity Prediction Using Chemical Structures: Pillars for Success in the Real World.
Chem Res Toxicol. 2025 May 19;38(5):759-807. doi: 10.1021/acs.chemrestox.5c00033. Epub 2025 May 2.
3
Predicting Distribution Coefficients (LogD) of Cyclic Peptides Using Molecular Dynamics Simulations.
Pharm Res. 2025 Apr;42(4):613-622. doi: 10.1007/s11095-025-03850-2. Epub 2025 Mar 26.
