Sun Wei, Guo Chang, Wan Jing, Ren Han
School of Information Science and Technology, Qiongtai Normal University, Haikou, China.
School of Modern Information Industry, Guangzhou College of Commerce, Guangzhou, China.
PeerJ Comput Sci. 2024 Jul 23;10:e2216. doi: 10.7717/peerj-cs.2216. eCollection 2024.
Piwi-interacting RNA (piRNA) is a class of small non-coding RNA that is highly expressed in mammalian testis. piRNAs have been implicated in various human diseases, but experimental validation of piRNA-disease associations is costly and time-consuming. This article proposes a computational method for predicting piRNA-disease associations using a multi-channel graph variational autoencoder (MC-GVAE). The method integrates four similarity networks for piRNAs and diseases, derived from piRNA sequences, disease semantics, the piRNA Gaussian Interaction Profile (GIP) kernel, and the disease GIP kernel, respectively. Each network is modeled with a graph VAE framework, which learns low-dimensional, informative feature representations for piRNAs and diseases. A multi-channel method then fuses the feature representations from the different networks. Finally, a three-layer neural network classifier predicts potential associations between piRNAs and diseases. The method was evaluated on a benchmark dataset constructed from the piRDisease v1.0 database, containing 5,002 experimentally validated associations between 4,350 piRNAs and 21 diseases. It achieved state-of-the-art performance, with an average AUC of 0.9310 and an AUPR of 0.9247 under five-fold cross-validation. This demonstrates the method's effectiveness and superiority in piRNA-disease association prediction.
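The GIP kernel similarities mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly used GIP definition, in which the similarity of two entities is exp(-gamma * ||a_i - a_j||^2) over their binary association profiles, with the bandwidth gamma normalized by the mean squared profile norm. The abstract does not give the formula or its parameters, so the function name `gip_kernel`, the parameter `gamma_prime` with its default of 1.0, and the toy matrix are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian Interaction Profile kernel over the rows of `profiles`.

    Assumption: the common GIP form exp(-gamma * ||a_i - a_j||^2) with
    gamma = gamma_prime / mean(||a_i||^2). gamma_prime = 1.0 is a
    conventional default, not a value taken from the abstract.
    """
    profiles = np.asarray(profiles, dtype=float)
    sq_norms = np.sum(profiles ** 2, axis=1)
    mean_sq = sq_norms.mean()
    if mean_sq == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq
    # Pairwise squared Euclidean distances via ||a||^2 + ||b||^2 - 2 a.b
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (profiles @ profiles.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # clip tiny negatives from rounding
    return np.exp(-gamma * sq_dists)


# Toy association matrix (hypothetical): rows are piRNAs, columns are diseases.
A = np.array([
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 1],
])

pirna_sim = gip_kernel(A)      # piRNA-by-piRNA similarity, shape (4, 4)
disease_sim = gip_kernel(A.T)  # disease-by-disease similarity, shape (3, 3)
```

Applying the same function to the association matrix and to its transpose yields the piRNA and disease GIP networks, respectively. In a cross-validation setting, the kernel would normally be computed from training associations only, so that held-out pairs do not leak into the similarity networks.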