A general prediction model for compound-protein interactions based on deep learning.

Authors

Ji Wei, She Shengnan, Qiao Chunxue, Feng Qiuqi, Rui Mengjie, Xu Ximing, Feng Chunlai

Affiliations

School of Pharmacy, Jiangsu University, Zhenjiang, China.

School of Medicine, Jiangsu University, Zhenjiang, China.

Publication information

Front Pharmacol. 2024 Sep 4;15:1465890. doi: 10.3389/fphar.2024.1465890. eCollection 2024.

DOI:10.3389/fphar.2024.1465890

PMID:39295942

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11408283/

Abstract

BACKGROUND

The identification of compound-protein interactions (CPIs) is crucial for drug discovery and understanding mechanisms of action. Accurate CPI prediction can elucidate drug-target-disease interactions, aiding in the discovery of candidate compounds and effective synergistic drugs, particularly from traditional Chinese medicine (TCM). Existing methods face challenges in prediction accuracy and generalization due to compound and target diversity and the lack of large-scale interaction datasets and negative datasets for model learning.

METHODS

To address these issues, we developed a computational model for CPI prediction by integrating the constructed large-scale bioactivity benchmark dataset with a deep learning (DL) algorithm. To verify the accuracy of our CPI model, we applied it to predict the targets of compounds in TCM. The herb pair Astragalus membranaceus and Hedyotis diffusa was used as a model, and the active compounds in this herb pair were collected from various public databases and the literature. The complete targets of these active compounds were predicted by the CPI model, resulting in an expanded target dataset. This dataset was then used to predict synergistic antitumor compound combinations. The predicted multi-compound combinations were subsequently examined through cellular experiments.

RESULTS

Our CPI model outperformed other machine learning models, achieving an area under the receiver operating characteristic curve (AUROC) of 0.98, an area under the precision-recall curve (AUPR) of 0.98, and an accuracy (ACC) of 93.31% on the test set. The model's generalization capability and applicability were further confirmed using external databases. Using this model, we predicted the targets of compounds in the herb pair Astragalus membranaceus and Hedyotis diffusa, yielding an expanded target dataset. We then integrated this expanded target dataset to predict effective drug combinations using our drug synergy prediction model, DeepMDS. Experimental assays on the breast cancer cell line MDA-MB-231 confirmed the efficacy of the best predicted multi-compound combinations: Combination I (Epicatechin, Ursolic acid, Quercetin, Aesculetin and Astragaloside IV) exhibited a half-maximal inhibitory concentration (IC50) value of 19.41 μM and a combination index (CI) value of 0.682, and Combination II (Epicatechin, Ursolic acid, Quercetin, Vanillic acid and Astragaloside IV) displayed an IC50 value of 23.83 μM and a CI value of 0.805. These results validated the ability of our model to make accurate predictions for novel CPI data outside the training dataset and evaluated the reliability of the predictions, showing good potential for application in drug discovery and in the elucidation of the bioactive compounds in TCM.
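The abstract reports AUROC, AUPR and ACC but does not describe how they were computed. As a minimal sketch of the standard definitions, assuming binary interaction labels (1 = interacting, 0 = non-interacting) and a predicted score per compound-protein pair, the three metrics can be computed as follows. The function names and the toy data are hypothetical and are not taken from the paper; AUPR is approximated here by average precision, and the 0.5 decision threshold is an assumption.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive pair outscores a random negative pair.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve (average precision).

    Assumes scores have no ties; tied scores would make the ranking ambiguous.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive example")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, ap = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            ap += hits / rank  # precision at each recovered positive
    return ap / total_pos


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of pairs classified correctly at the given score threshold."""
    correct = sum(int(s >= threshold) == y for y, s in zip(labels, scores))
    return correct / len(labels)


# Toy example (hypothetical data, not from the paper).
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.95, 0.80, 0.70, 0.60, 0.30, 0.10]
print(auroc(toy_labels, toy_scores))
print(average_precision(toy_labels, toy_scores))
print(accuracy(toy_labels, toy_scores))
```

AUROC and AUPR are threshold-free ranking metrics, whereas accuracy depends on the chosen cut-off, which is one reason the three are usually reported together.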

CONCLUSION

Our CPI prediction model can serve as a useful tool for accurately identifying potential CPIs for a wide range of proteins, and is expected to facilitate drug research and repurposing and to support the understanding of TCM.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3e65/11408283/c262f4ddb3bc/fphar-15-1465890-g001.jpg
