TEC-miTarget: enhancing microRNA target prediction based on deep learning of ribonucleic acid sequences.

Affiliations

Peng Cheng Laboratory, Shenzhen, 518055, China.

Tsinghua Shenzhen International Graduate School, Shenzhen, 518055, China.

Publication information

BMC Bioinformatics. 2024 Apr 20;25(1):159. doi: 10.1186/s12859-024-05780-z.

DOI:10.1186/s12859-024-05780-z

PMID:38643080

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11032603/

Abstract

BACKGROUND

MicroRNAs play a critical role in regulating gene expression by binding to specific target sites within gene transcripts, making the identification of microRNA targets a prominent focus of research. Conventional experimental methods for identifying microRNA targets are both time-consuming and expensive, prompting the development of computational tools for target prediction. However, the existing computational tools exhibit limited performance in meeting the demands of practical applications, highlighting the need to improve the performance of microRNA target prediction models.

RESULTS

In this paper, we draw on widely used techniques from natural language processing and computer vision to propose TEC-miTarget, a novel approach to microRNA target prediction based on a transformer encoder and convolutional neural networks. TEC-miTarget treats RNA sequences as a natural language and encodes them with a transformer encoder, an encoder widely used in natural language processing. It then combines the representations of a microRNA and its candidate target site sequence into a contact map, a three-dimensional array similar to a multi-channel image. Because the contact map has this image-like form, its features are extracted with a four-layer convolutional neural network, which enables prediction of interactions between the microRNA and its candidate target sites. We conducted a series of comparative experiments to demonstrate that TEC-miTarget significantly improves microRNA target prediction compared with existing state-of-the-art models. It is the first approach to be compared with others at both the sequence and transcript levels, and the first to be compared with both deep learning-based and seed-match-based methods. We first compared TEC-miTarget with other approaches at the sequence level, where it delivers substantial improvements on the same datasets and evaluation metrics. We then used TEC-miTarget to predict microRNA targets in long mRNA sequences, which involves two steps: selecting candidate target site sequences and applying sequence-level predictions. Finally, we showed that TEC-miTarget outperforms other approaches at the transcript level, including the seed-match methods that have been widely used in previous years.
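The contact-map step can be illustrated with a minimal sketch. The abstract does not say how the two sequence representations are combined, so the pairing used here (element-wise product followed by absolute difference of the per-nucleotide vectors) is an assumption. The one-hot `embed` function is a hypothetical stand-in for the transformer encoder output, and both sequences are illustrative only.

```python
# Illustrative sketch only; this is not the TEC-miTarget implementation.
NUCLEOTIDES = "ACGU"


def embed(seq):
    """Hypothetical stand-in for the transformer encoder.

    Returns one 4-dimensional one-hot vector per nucleotide
    (unknown symbols map to an all-zero vector).
    """
    rna = seq.upper().replace("T", "U")
    return [[1.0 if base == n else 0.0 for n in NUCLEOTIDES] for base in rna]


def contact_map(mirna_repr, site_repr):
    """Combine two per-position representations into a 3-D array.

    Shape: (len(mirna), len(site), 2 * dim). For every position pair the
    channels hold the element-wise product and then the absolute
    difference of the two vectors (an assumed pairing rule).
    """
    cmap = []
    for u in mirna_repr:
        row = []
        for v in site_repr:
            product = [a * b for a, b in zip(u, v)]
            abs_diff = [abs(a - b) for a, b in zip(u, v)]
            row.append(product + abs_diff)
        cmap.append(row)
    return cmap


mirna = "UGAGGUAGUAGGUUGUAUAGUU"  # illustrative microRNA sequence
site = "ACUAUACAACCUACUACCUCA"    # illustrative candidate target site
cmap = contact_map(embed(mirna), embed(site))

# Dimensions: miRNA length, site length, number of channels.
print(len(cmap), len(cmap[0]), len(cmap[0][0]))
```

The last axis plays the role of image channels, which is what makes a convolutional network a natural choice for the next stage.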

CONCLUSIONS

We propose a novel approach for predicting microRNA targets at both the sequence and transcript levels, and demonstrate that it outperforms other methods based on deep learning or seed matching. We also provide the approach as easy-to-use software, TEC-miTarget (https://github.com/tingpeng17/TEC-miTarget). Our results provide new perspectives for microRNA target prediction.
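The two-step transcript-level procedure mentioned in the Results (selecting candidate target site sequences, then applying sequence-level predictions) might look roughly like the sketch below. The abstract gives no details, so the sliding window, its size and stride, the max-score aggregation, and the `toy_score` function are all assumptions. `toy_score` is a hypothetical seed-match stand-in for the trained sequence-level model, not the model itself.

```python
# Illustrative sketch only; window, stride, scorer, and aggregation are assumptions.
# Sequences are expected in the uppercase RNA alphabet (A, C, G, U).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def candidate_sites(mrna, window=40, stride=10):
    """Step 1: slide a fixed-length window over the mRNA.

    Returns (start, subsequence) pairs; a final window is added so the
    3' end of the transcript is always covered.
    """
    last_start = max(len(mrna) - window, 0)
    starts = list(range(0, last_start + 1, stride))
    if starts[-1] != last_start:
        starts.append(last_start)
    return [(s, mrna[s:s + window]) for s in starts]


def toy_score(mirna, site):
    """Hypothetical stand-in for the sequence-level predictor.

    Returns 1.0 if the site contains the reverse complement of miRNA
    nucleotides 2-7 (a 6-mer seed match), otherwise 0.0.
    """
    seed = mirna[1:7]
    seed_target = "".join(COMPLEMENT[b] for b in reversed(seed))
    return 1.0 if seed_target in site else 0.0


def predict_transcript(mirna, mrna, score=toy_score, window=40, stride=10):
    """Step 2: score every candidate site and keep the best (score, start)."""
    scored = [(score(mirna, site), start)
              for start, site in candidate_sites(mrna, window, stride)]
    return max(scored)


mirna = "UGAGGUAGUAGGUUGUAUAGUU"              # illustrative microRNA sequence
mrna = "A" * 50 + "CUACCUCA" + "A" * 50       # synthetic transcript with one seed match
best_score, best_start = predict_transcript(mirna, mrna)
print(best_score, best_start)
```

Passing a different `score` callable is the point of the design: the windowing logic stays the same whether the scorer is a seed-match rule or a learned model.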
