Predictions of Colloidal Molecular Aggregation Using AI/ML Models.

Author information

Kombo David C, Stepp J David, Lim Sungtaek, Elshorst Bettina, Li Yi, Cato Laura, Shomali Maysoun, Fink David, LaMarche Matthew J

Affiliations

Integrated Drug Discovery, Sanofi, 350 Water St., Cambridge, Massachusetts 02141, United States.

CMC Synthetics Early Development Analytics, Sanofi, Industriepark Hochst, Frankfurt 65926, Germany.

Publication information

ACS Omega. 2024 Jun 18;9(26):28691-28706. doi: 10.1021/acsomega.4c02886. eCollection 2024 Jul 2.

DOI:10.1021/acsomega.4c02886

PMID:38973835

Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC11223200/

Abstract

To facilitate the triage of hits from small molecule screens, we have used various AI/ML techniques and experimentally observed data sets to build models aimed at predicting colloidal aggregation of small organic molecules in aqueous solution. We have found that Naïve Bayesian and deep neural networks outperform logistic regression, recursive partitioning tree, support vector machine, and random forest techniques by having the lowest balanced error rate (BER) for the test set. Derived predictive classification models consistently and successfully discriminated aggregator molecules from nonaggregator hits. An analysis of molecular descriptors in favor of colloidal aggregation confirms previous observations (hydrophobicity, molecular weight, and solubility) in addition to undescribed molecular descriptors such as the fraction of sp3 carbon atoms (Fsp3), and electrotopological state of hydroxyl groups (ES_Sum_sOH). Naïve Bayesian modeling and scaffold tree analysis have revealed chemical features/scaffolds contributing the most to colloidal aggregation and nonaggregation, respectively. These results highlight the importance of scaffolds with high Fsp3 values in promoting nonaggregation. Matched molecular pair analysis (MMPA) has also deciphered context-dependent substitutions, which can be used to design nonaggregator molecules. We found that most matched molecular pairs have a neutral effect on aggregation propensity. We have prospectively applied our predictive models to assist in chemical library triage for optimal plate selection diversity and purchase for high throughput screening (HTS) in drug discovery projects.
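The abstract ranks the classifiers by balanced error rate (BER) on the test set but does not state the formula. A minimal sketch follows, assuming the commonly used definition: the mean of the false negative rate and the false positive rate. The function name `balanced_error_rate` and the label encoding (1 = aggregator, 0 = nonaggregator) are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence


def balanced_error_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Return the mean of the per-class error rates for binary labels.

    Assumed encoding (hypothetical, for illustration only):
    1 = aggregator, 0 = nonaggregator.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present in y_true")

    # Aggregators predicted as nonaggregators.
    false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Nonaggregators predicted as aggregators.
    false_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)

    return 0.5 * (false_neg / n_pos + false_pos / n_neg)


# Toy labels (made up for this example): 4 aggregators, 8 nonaggregators.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1]
ber = balanced_error_rate(y_true, y_pred)
```

Because each class contributes equally regardless of its size, a model that labels every compound a nonaggregator scores a BER of 0.5 under this definition, even when nonaggregators dominate the data set. That property makes the metric a reasonable choice for imbalanced screening data.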

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c483/11223200/b118690758bf/ao4c02886_0001.jpg
