College of Physics and Information Engineering, Fuzhou University, Fuzhou 350116, China.
VeriMake Innovation Lab, Nanjing Renmian Integrated Circuit Co., Ltd., Nanjing 210088, China.
Math Biosci Eng. 2023 Jan;20(2):2588-2608. doi: 10.3934/mbe.2023121. Epub 2022 Nov 25.
G protein-coupled receptors (GPCRs) are the targets of more than 40% of currently approved drugs. Although neural networks can effectively improve the accuracy of biological activity prediction, their results are unsatisfactory on the limited datasets available for orphan GPCRs (oGPCRs). To bridge this gap, we propose Multi-source Transfer Learning with Graph Neural Network (MSTL-GNN). First, we identify three ideal sources of data for transfer learning: oGPCRs, experimentally validated GPCRs, and invalidated GPCRs similar to the former. Second, the GPCR data in SMILES format are converted to graphs, which serve as input to a Graph Neural Network (GNN) combined with ensemble learning to improve prediction accuracy. Finally, our experiments show that MSTL-GNN remarkably improves the prediction of GPCR ligand activity values compared with previous studies. On the two evaluation metrics we adopted, R2 and root-mean-square deviation (RMSE), MSTL-GNN improved on the state-of-the-art work by up to 67.13% and 17.22% on average, respectively. The effectiveness of MSTL-GNN in GPCR drug discovery with limited data also paves the way for other similar application scenarios.
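For readers unfamiliar with the two evaluation metrics named above, the following is a minimal sketch of how R2 and RMSE are commonly computed for a regression task such as activity prediction. It assumes the standard textbook definitions (the abstract does not state the exact formulas used), and the function names and the activity values are hypothetical, chosen only for illustration.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square deviation between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical ligand activity values (not taken from the paper).
observed = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 6.0, 6.5, 8.0]

print("RMSE:", rmse(observed, predicted))
print("R2:  ", r2_score(observed, predicted))
```

A lower RMSE and a higher R2 both indicate a better fit, so an "improvement" refers to a decrease in the former and an increase in the latter.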