IDRdecoder: a machine learning approach for rational drug discovery toward intrinsically disordered regions.

Author information

Shionyu-Mitusyama Clara, Ohmori Satoshi, Hirata Subaru, Ishida Hirokazu, Shirai Tsuyoshi

Affiliations

Department of Bioscience, Nagahama Institute of Bio-Science and Technology, Nagahama, Shiga, Japan.

Faculty of Data Science, Shiga University, 1-1-1 Banba, Hikone, Shiga, Japan.

Publication information

Front Bioinform. 2025 Jul 18;5:1627836. doi: 10.3389/fbinf.2025.1627836. eCollection 2025.

Abstract

INTRODUCTION

Intrinsically disordered regions (IDRs) of proteins have traditionally been overlooked as drug targets. However, with growing recognition of their crucial role in biological activity and their involvement in various diseases, IDRs have emerged as promising targets for drug discovery. Despite this potential, rational methodologies for IDR-targeted drug discovery remain underdeveloped, primarily due to a lack of reference experimental data.

METHODS

This study explores a machine learning approach to predict IDR functions, drug interaction sites, and interacting molecular substructures within IDR sequences. To address the data gap, stepwise transfer learning was employed. IDRdecoder sequentially generates predictions for IDR classification, interaction sites, and interacting ligand substructures. In the first step, the neural network was trained as an autoencoder on 26,480,862 predicted IDR sequences. It was then trained, via transfer learning, on 57,692 ligand-binding PDB sequences with higher IDR tendency to predict ligand-interacting sites and ligand types.
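The two-step scheme described above (unsupervised autoencoder pretraining, then supervised training that reuses the learned encoder) can be illustrated with a minimal NumPy sketch. This is not the IDRdecoder architecture: the window-based one-hot input, the single tanh hidden layer, the frozen encoder, the logistic site head, all hyperparameters, and the toy sequences and labels are assumptions invented for illustration.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_windows(seq, window=5):
    """One flattened one-hot window per residue, zero-padded at the ends."""
    half = window // 2
    n = len(AMINO_ACIDS)
    out = np.zeros((len(seq), window * n))
    for pos in range(len(seq)):
        for offset in range(-half, half + 1):
            j = pos + offset
            if 0 <= j < len(seq) and seq[j] in AA_INDEX:
                out[pos, (offset + half) * n + AA_INDEX[seq[j]]] = 1.0
    return out


def train_autoencoder(x, hidden=8, epochs=200, lr=0.1, seed=0):
    """Step 1: learn an encoder by reconstructing unlabeled inputs."""
    rng = np.random.default_rng(seed)
    w_enc = rng.normal(0.0, 0.1, (x.shape[1], hidden))
    w_dec = rng.normal(0.0, 0.1, (hidden, x.shape[1]))
    losses = []
    for _ in range(epochs):
        h = np.tanh(x @ w_enc)               # encoding
        err = h @ w_dec - x                  # reconstruction error
        losses.append(float((err ** 2).mean()))
        grad_dec = h.T @ err / len(x)
        grad_h = err @ w_dec.T * (1.0 - h ** 2)
        grad_enc = x.T @ grad_h / len(x)
        w_enc -= lr * grad_enc
        w_dec -= lr * grad_dec
    return w_enc, losses


def train_site_head(h, y, epochs=300, lr=0.5):
    """Step 2: fit a logistic head on encodings; the encoder stays frozen."""
    w = np.zeros(h.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(h @ w + b)))
        w -= lr * h.T @ (p - y) / len(y)
        b -= lr * float((p - y).mean())
    return w, b


# Toy data, invented for illustration only (not taken from the study).
unlabeled = ["MSEEKPKEGSSPRRGS", "PQRSSTEEKKDDGGSP", "GSYAPKKSYAEEDDKR"]
labeled_seq = "GSYAPKKSYAEE"
labels = np.array([0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0], dtype=float)

x_unlabeled = np.vstack([one_hot_windows(s) for s in unlabeled])
w_enc, losses = train_autoencoder(x_unlabeled)

h_labeled = np.tanh(one_hot_windows(labeled_seq) @ w_enc)
w_head, b_head = train_site_head(h_labeled, labels)
scores = 1.0 / (1.0 + np.exp(-(h_labeled @ w_head + b_head)))
```

Here `scores` holds one per-residue probability for `labeled_seq`. Freezing the encoder is only one way to transfer the pretrained weights; fine-tuning them together with the head is an equally plausible design, and the abstract does not say which was used.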

RESULTS

IDRdecoder was evaluated against nine IDR sequences that have been characterized in detail experimentally as drug targets. In the encoding space, specific GO terms related to the hypothesized functions of the evaluation IDR sequences were highly enriched. The model predicted drug-interacting sites and ligand types with areas under the curve (AUC) of 0.616 and 0.702, respectively. Compared with existing methods, including ProteinBERT, IDRdecoder showed moderately improved performance.
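For readers unfamiliar with the metric, the AUC of a binary predictor such as a per-residue site classifier can be computed as the probability that a randomly chosen positive is scored above a randomly chosen negative. The sketch below is a generic pairwise implementation, assuming the AUC meant here is the area under the ROC curve. The function name `roc_auc` and the example values are illustrative and are not taken from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up labels and scores: 3 of the 4 positive/negative pairs are ordered correctly.
example_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which puts the reported 0.616 and 0.702 in context. The quadratic pairwise loop is adequate for small evaluation sets; a rank-based formulation scales better.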

DISCUSSION

IDRdecoder is the first application for predicting drug interaction sites and ligands in IDR sequences. Analysis of the prediction results revealed characteristics beneficial for IDR-drug design; for instance, Tyr and Ala are preferred target sites, while flexible substructures, such as alkyl groups, are favored in ligand molecules.
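One simple way to quantify a residue preference such as the one reported for Tyr and Ala is to compare each amino acid's frequency among predicted sites with its background frequency in the same sequences. The sketch below is an assumed analysis, not the procedure used in the paper. The function name `residue_enrichment` and the toy input are hypothetical.

```python
from collections import Counter


def residue_enrichment(sequences, site_masks):
    """Ratio of site frequency to background frequency for each residue type."""
    background = Counter()
    at_sites = Counter()
    for seq, mask in zip(sequences, site_masks):
        if len(seq) != len(mask):
            raise ValueError("each mask must match its sequence length")
        background.update(seq)
        at_sites.update(aa for aa, flag in zip(seq, mask) if flag)
    n_background = sum(background.values())
    n_sites = sum(at_sites.values())
    if n_sites == 0:
        return {}
    return {
        aa: (at_sites[aa] / n_sites) / (background[aa] / n_background)
        for aa in at_sites
    }


# Toy input: A and Y are flagged as sites, G is not.
toy_enrichment = residue_enrichment(["AYGG"], [[1, 1, 0, 0]])
```

Ratios above 1 indicate residues over-represented at predicted sites. With real predictions, a significance test would be needed before drawing conclusions from such ratios.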

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/49ec/12313641/edad682686ab/fbinf-05-1627836-g001.jpg
