GNNs and ensemble models enhance the prediction of new sRNA-mRNA interactions in unseen conditions.

Authors

Cohen Shani, Rokach Lior, Veksler-Lublinsky Isana

Affiliation

Department of Software & Information Systems Engineering, Faculty of Engineering, Ben-Gurion University of the Negev, 8410501, Beer-Sheva, Israel.

Publication information

BMC Bioinformatics. 2025 May 21;26(1):131. doi: 10.1186/s12859-025-06153-w.

DOI:10.1186/s12859-025-06153-w

PMID:40399818

Abstract

Bacterial small RNAs (sRNAs) are pivotal in post-transcriptional regulation, affecting functions like virulence, metabolism, and gene expression by binding specific mRNA targets. Identifying these targets is crucial to understanding sRNA regulation across species. Despite advancements in high-throughput (HT) experimental methods, they remain technically challenging and are limited to detecting sRNA-target interactions under specific environmental conditions. Therefore, computational approaches, especially machine learning (ML), are essential for identifying strong candidates for biological validation. In this paper, we hypothesize that ML models trained on large-scale interaction data from specific conditions can accurately predict new interactions in unseen conditions within the same bacterial strain. To test this, we developed models from two families: (1) graph neural networks (GNNs), including GraphRNA and kGraphRNA, that learn transformed representations of interacting sRNA-mRNA pairs via graph relationships, and (2) decision forests, sInterRF (Random Forest) and sInterXGB (XGBoost), which use various interaction features for prediction. We also proposed Summation Ensemble Models (SEM) that combine scores from multiple models. Across three seen-to-unseen condition evaluations, our models, particularly kGraphRNA, significantly improved the area under the ROC curve (AUC) and the area under the Precision-Recall curve (PR-AUC) compared to sRNARFTarget, CopraRNA, and RNAup. The SEM model combining GraphRNA and CopraRNA outperformed CopraRNA alone on a low-throughput (LT) interactions test set (HT-to-LT evaluation). Beyond enhanced performance, our models enable target prediction for species-specific sRNAs, a capability lacking in some existing tools. Furthermore, GNN models remove the dependency on external tools like RNAplex or RNAup to compute hybridization duplex or energy features, enhancing scalability and runtime efficiency. While this study focuses on E. coli K12 MG1655 interactions, our methods are fully adaptable to predict interactions in other bacterial strains, given sufficient data for training. Our comprehensive feature importance analysis revealed the complexity of sRNA-mRNA interactions across environmental conditions, underscoring the significance of RNA sequence composition and duplex structure characteristics, like base pairing and energy factors, findings that align with biological evidence from previous studies. As HT experiments expand sRNA-target interaction data across conditions in various bacteria, our ML methods with feature analysis offer promising advances in sRNA-target prediction and deeper insights into sRNA regulatory mechanisms across diverse species.
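
The abstract describes the Summation Ensemble Models only as combining scores from multiple models and evaluates them by AUC. The sketch below illustrates that general idea under stated assumptions; it is not the authors' implementation. The min-max rescaling step, the function names (`min_max`, `summation_ensemble`, `roc_auc`), the convention that a higher score means a more likely interaction, and the toy scores are all hypothetical and chosen for illustration.

```python
from typing import Sequence


def min_max(scores: Sequence[float]) -> list[float]:
    """Rescale scores to [0, 1] so models with different ranges are comparable.

    Assumption: the abstract does not say how scores are normalized.
    """
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def summation_ensemble(*model_scores: Sequence[float]) -> list[float]:
    """Sum the rescaled scores of several models, one total per candidate pair."""
    normalized = [min_max(scores) for scores in model_scores]
    return [sum(column) for column in zip(*normalized)]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative count as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up toy data: 1 = interacting sRNA-mRNA pair, 0 = non-interacting pair.
labels = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.4, 0.7, 0.5, 0.2, 0.1]  # hypothetical model, scores in [0, 1]
model_b = [2.0, 8.0, 3.0, 1.0, 6.0, 0.0]  # hypothetical model, different scale

ensemble = summation_ensemble(model_a, model_b)

print("AUC model A :", round(roc_auc(labels, model_a), 3))
print("AUC model B :", round(roc_auc(labels, model_b), 3))
print("AUC ensemble:", round(roc_auc(labels, ensemble), 3))
```

On this toy data each model misranks a different pair, so the summed score separates the two classes better than either model alone. This only illustrates the mechanism and says nothing about the results reported in the paper. Tools that output p-values or hybridization energies, where lower is better, would need their scores inverted before summation.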
