Prioritizing bona fide bacterial small RNAs with machine learning classifiers.

Authors

Eppenhof Erik J J, Peña-Castillo Lourdes

Affiliations

Department of Artificial Intelligence, Radboud University Nijmegen, Nijmegen, Netherlands.

Department of Biology, Memorial University of Newfoundland, St. John's, Canada.

Publication information

PeerJ. 2019 Jan 24;7:e6304. doi: 10.7717/peerj.6304. eCollection 2019.

DOI:10.7717/peerj.6304

PMID:30697489

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6348098/

Abstract

Bacterial small RNAs (sRNAs) are involved in the control of several cellular processes. Hundreds of putative sRNAs have been identified in many bacterial species through RNA sequencing. The existence of putative sRNAs is usually validated by Northern blot analysis. However, the large number of novel putative sRNAs reported in the literature makes it impractical to validate each of them in the wet lab. In this work, we applied five machine learning approaches to construct twenty models to discriminate bona fide sRNAs from random genomic sequences in five bacterial species. Sequences were represented using seven features, including the free energy of their predicted secondary structure, their distances to the closest predicted promoter site and Rho-independent terminator, and their distance to the closest open reading frames (ORFs). To automatically calculate these features, we developed an sRNA Characterization Pipeline (sRNACharP). All seven features used in the classification task contributed positively to the performance of the predictive models. The best performing model obtained a median precision of 100% at 10% recall and of 64% at 40% recall across all five bacterial species, and it outperformed previously published approaches on two benchmark datasets in terms of precision and recall. Our results indicate that even though there is limited sRNA sequence conservation across different bacterial species, there are intrinsic features in the genomic context of sRNAs that are conserved across taxa. We show that these features are utilized by machine learning approaches to learn a species-independent model to prioritize bona fide bacterial sRNAs.
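The abstract reports performance as precision at fixed recall levels (for example, precision at 10% and at 40% recall). As a minimal sketch of how such a figure can be read off a ranked list of candidates, the Python snippet below walks down the candidates in order of classifier score until the target fraction of true sRNAs has been recovered, then reports the precision at that point. The function name `precision_at_recall`, the scores, and the labels are all hypothetical and made up for illustration; this is not the authors' evaluation code, and their exact procedure may differ.

```python
import math
from typing import Sequence


def precision_at_recall(scores: Sequence[float],
                        labels: Sequence[int],
                        recall: float) -> float:
    """Precision of the shortest top-ranked prefix that reaches `recall`.

    scores: classifier scores (higher means more likely a bona fide sRNA).
    labels: 1 for a true sRNA, 0 for a random genomic sequence.
    recall: target recall in the interval (0, 1].
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not 0.0 < recall <= 1.0:
        raise ValueError("recall must be in (0, 1]")
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("at least one positive example is required")

    # Number of true sRNAs that must be recovered to reach the target recall.
    # The small epsilon guards against floating-point noise in the product.
    needed = max(1, math.ceil(recall * total_positives - 1e-9))

    # Rank candidates from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    true_positives = 0
    for rank, (_, label) in enumerate(ranked, start=1):
        true_positives += label
        if true_positives >= needed:
            return true_positives / rank

    # Unreachable: the loop always recovers `needed` positives.
    raise RuntimeError("target recall was not reached")


# Hypothetical toy data: ten candidates, five of which are true sRNAs.
toy_scores = [0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10, 0.05]
toy_labels = [1, 1, 0, 1, 0, 0, 1, 0, 1, 0]

for target in (0.2, 0.6, 1.0):
    value = precision_at_recall(toy_scores, toy_labels, target)
    print(f"precision at {target:.0%} recall: {value:.2f}")
```

Reading precision at a low recall level reflects the prioritization use case described in the abstract: when only a few candidates can be validated by Northern blot, what matters is how many of the top-ranked candidates are real.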
