Dataset for discovering new hypertension small molecules using machine learning-aided computational fragment-based design.

Author information

Lehasa Odifentse Mapula-E, Chude-Okonkwo Uche A K

Affiliation

Institute for Intelligent Systems, University of Johannesburg, 69 Kingsway Avenue, Auckland Park, Johannesburg 2092, Gauteng Province, South Africa.

Publication information

Data Brief. 2024 Jun 26;55:110677. doi: 10.1016/j.dib.2024.110677. eCollection 2024 Aug.

Abstract

This dataset demonstrates the use of computational fragment-based, machine learning-aided drug discovery to generate new lead molecules for the treatment of hypertension. Specifically, the focus is on agents targeting the renin-angiotensin-aldosterone system (RAAS), commonly classified as angiotensin-converting enzyme inhibitors (ACEIs) and angiotensin II receptor blockers (ARBs). The preliminary dataset was a target-specific, user-generated library of 63 molecular fragments derived from the 26 approved ACEI and ARB molecules obtained from the ChEMBL and DrugBank molecular databases. This fragment library served as the primary input for generating the new lead molecules presented in the dataset. The newly generated molecules were screened to check whether they met the criteria for oral drugs and contained an ACEI or ARB core functional group. Using unsupervised machine learning, the molecules that met these criteria were divided into clusters of drug classes based on their functional group allocation. This process produced three final output datasets: one containing the new ACEI molecules, another containing the new ARB molecules, and the last containing the new molecules with no assigned class. These data can aid in the timely and efficient design of novel antihypertensive drugs. They can also be used in precision hypertension medicine for patients with treatment resistance, non-response or co-morbidities. Although this dataset is specific to antihypertensive agents, the model can be reused with minimal changes to produce new lead molecules for other health conditions.
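The screening and class-assignment steps described in the abstract can be illustrated with a minimal sketch. Several things here are assumptions rather than details from the source: the abstract does not say which oral-drug criteria were applied, so Lipinski's rule of five (allowing at most one violation) is used as a stand-in; the core functional group sets `ACEI_CORE` and `ARB_CORE` are hypothetical placeholders; a simple rule-based assignment replaces the unsupervised clustering the authors used; and all descriptor values are made up for illustration, not taken from the dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    mol_weight: float   # molecular weight in Da
    logp: float         # octanol-water partition coefficient
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors
    groups: frozenset   # functional-group labels (hypothetical annotation)


# Assumed stand-ins for the core functional group criterion;
# the source does not list the groups it actually checks.
ACEI_CORE = frozenset({"sulfhydryl", "dicarboxylate", "phosphinate"})
ARB_CORE = frozenset({"tetrazole"})


def rule_of_five_violations(c):
    """Count how many of Lipinski's four thresholds the candidate exceeds."""
    return sum([
        c.mol_weight > 500,
        c.logp > 5,
        c.h_donors > 5,
        c.h_acceptors > 10,
    ])


def is_orally_druglike(c, max_violations=1):
    """Assumed oral-drug screen: at most one rule-of-five violation."""
    return rule_of_five_violations(c) <= max_violations


def assign_class(c):
    """Rule-based stand-in for the clustering step (not the authors' model)."""
    has_acei = bool(c.groups & ACEI_CORE)
    has_arb = bool(c.groups & ARB_CORE)
    if has_acei and not has_arb:
        return "ACEI"
    if has_arb and not has_acei:
        return "ARB"
    # Neither core group, or both: left without a class in this sketch.
    return "unassigned"


def screen(candidates):
    """Drop candidates that fail the oral-drug screen, then group the rest."""
    out = {"ACEI": [], "ARB": [], "unassigned": []}
    for c in candidates:
        if is_orally_druglike(c):
            out[assign_class(c)].append(c.name)
    return out


# Made-up descriptor values for illustration only.
EXAMPLES = [
    Candidate("mol_a", 376.4, 1.2, 2, 6, frozenset({"dicarboxylate"})),
    Candidate("mol_b", 435.5, 4.1, 2, 7, frozenset({"tetrazole", "biphenyl"})),
    Candidate("mol_c", 712.9, 6.3, 4, 9, frozenset({"tetrazole"})),
    Candidate("mol_d", 298.3, 2.0, 1, 4, frozenset({"amide"})),
]

RESULT = screen(EXAMPLES)
```

In this sketch `mol_c` is rejected because it exceeds both the molecular weight and logP thresholds, while the remaining three candidates land in the three output groups that mirror the ACEI, ARB and unassigned datasets. In practice the descriptors and functional group matches would be computed from molecular structures with a cheminformatics toolkit rather than entered by hand.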

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/78c8/11282967/dec0dbc9e103/gr1.jpg
