HeadTailTransfer: An efficient sampling method to improve the performance of graph neural network method in predicting sparse ncRNA-protein interactions.

Affiliations

Wenzhou University of Technology, Wenzhou, 325000, China.

Neher's Biophysics Laboratory for Innovative Drug Discovery, State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, 999078, Macao Special Administrative Region of China; Wenzhou University of Technology, Wenzhou, 325000, China.

Publication information

Comput Biol Med. 2023 May;157:106783. doi: 10.1016/j.compbiomed.2023.106783. Epub 2023 Mar 15.

Abstract

Noncoding RNA (ncRNA) is functional RNA produced by DNA transcription, and most transcribed genes are transcribed into ncRNA. ncRNA is not directly involved in the translation of proteins, but it participates in gene expression in cells and affects protein synthesis, and thus plays an important role in biological processes such as growth, proliferation, metabolism, and information transmission. Understanding the interaction between ncRNA and protein is therefore the basis for studying how ncRNA regulates protein-related biological activities. However, verifying ncRNA-protein interactions through biological experiments is very expensive and time-consuming, so prediction methods based on machine learning have developed rapidly. Recently, graph neural network (GNN) models have stood out for their excellent performance, but a general GNN framework for predicting ncRNA-protein interactions has been lacking. We propose a GNN-based framework to predict ncRNA-protein interactions, which can use topological structure information to complete prediction tasks faster and more accurately. On some smaller datasets, many ncRNA nodes lack neighbor information, which lowers prediction accuracy. On some larger datasets, the long-tail distribution degrades predictions for the tail nodes (sparse nodes linked to few neighbors). We therefore propose a new sampling method, HeadTailTransfer, to mitigate these effects. Experimental results illustrate the effectiveness of this method. In particular, for task-specific prediction on the RPI369 dataset in the GraphSAGE-based neural network framework, the AUC and ACC values increased from 56.8% and 52.2% to 80.2% and 71.8%, respectively. Our data and code are available at https://github.com/kkkayle/HeadTailTransfer.
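
The abstract does not describe how HeadTailTransfer works internally, so the following is only a minimal sketch of the head/tail distinction it relies on: ncRNA nodes with few known protein neighbors are treated as tail nodes. The function name `split_head_tail`, the `max_tail_degree` threshold, and the toy interaction list are all hypothetical and are not taken from the paper or its repository.

```python
from collections import Counter


def split_head_tail(edges, max_tail_degree=2):
    """Split ncRNA nodes into head and tail sets by interaction count.

    edges: iterable of (ncRNA_id, protein_id) pairs.
    max_tail_degree: hypothetical cutoff; nodes with at most this many
    protein neighbors are labeled as tail (sparse) nodes.
    """
    # Degree of each ncRNA node = number of proteins it interacts with.
    degree = Counter(rna for rna, _protein in edges)
    tail = {node for node, d in degree.items() if d <= max_tail_degree}
    head = set(degree) - tail
    return head, tail, degree


# Toy interaction list (invented for illustration only).
edges = [
    ("rna_A", "prot_1"), ("rna_A", "prot_2"),
    ("rna_A", "prot_3"), ("rna_A", "prot_4"),
    ("rna_B", "prot_1"),
    ("rna_C", "prot_2"), ("rna_C", "prot_3"),
]

head, tail, degree = split_head_tail(edges, max_tail_degree=2)
print(sorted(head))  # ['rna_A']
print(sorted(tail))  # ['rna_B', 'rna_C']
```

A split like this only identifies which nodes are sparse; how neighbor information is then sampled or transferred for those nodes is specific to the method and is documented in the linked repository.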

