Predicting functional long non-coding RNAs validated by low throughput experiments.

Affiliations

Shandong Provincial Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou, China.

College of Physics and Electronic Information, Dezhou University, Dezhou, China.

Publication information

RNA Biol. 2019 Nov;16(11):1555-1564. doi: 10.1080/15476286.2019.1644590. Epub 2019 Jul 26.

DOI:10.1080/15476286.2019.1644590
PMID:31345106
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6779387/
Abstract

High-throughput techniques have uncovered hundreds and thousands of long non-coding RNAs (lncRNAs). Only a tiny fraction of them have functions that were experimentally validated by low-throughput methods (EVlncRNAs). What fraction of lncRNAs from high-throughput experiments (HTlncRNAs) is truly functional is an active subject of debate. Here, we developed the first method to distinguish EVlncRNAs from HTlncRNAs and mRNAs by using Support Vector Machines and found that EVlncRNAs can be well separated from HTlncRNAs and mRNAs, with a Matthews correlation coefficient of 0.6, a sensitivity of 64%, and a precision of 81% on the independent human test set. The most useful features for classification are related to sequence conservation at the RNA level (for separating from HTlncRNAs) and the protein level (for separating from mRNAs). The method is found to be robust, as the human-RNA-trained model is applicable to independent mouse RNAs with similar accuracy and, to a lesser extent, to plant RNAs. The method can recover newly discovered EVlncRNAs with high sensitivity. Its application to 2000 randomly selected human HTlncRNAs indicates that the majority of HTlncRNAs are probably non-functional but a large portion (nearly 30%) is likely functional. In other words, there is an ample number of lncRNAs whose specific biological roles are yet to be discovered. The method developed here is expected to speed up discovery and reduce its cost by prioritizing potentially functional lncRNAs prior to experimental validation. EVlncRNA-pred is available as a web server at http://biophy.dzu.edu.cn/lncrnapred/index.html. All datasets used in this study can be obtained from the same website.
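
The abstract reports three evaluation metrics: Matthews correlation coefficient (MCC), sensitivity, and precision. As a minimal sketch of how these are conventionally computed from a binary confusion matrix, the Python below uses the standard textbook definitions. The function names and the toy labels are hypothetical and chosen only for illustration; they are not the authors' code or data, and the numbers do not reproduce the values reported in the paper.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (TP, FP, TN, FN) for binary labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

def sensitivity(tp, fn):
    """Fraction of true positives that were recovered: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def precision(tp, fp):
    """Fraction of positive predictions that are correct: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical toy labels (1 = positive class, 0 = negative class); not from the paper.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(f"sensitivity={sensitivity(tp, fn):.2f}")
print(f"precision={precision(tp, fp):.2f}")
print(f"MCC={mcc(tp, fp, tn, fn):.2f}")
```

MCC uses all four cells of the confusion matrix, so it stays informative when the classes are imbalanced, which is one common reason to report it alongside sensitivity and precision.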


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8103/6779387/925c0a62f070/krnb-16-11-1644590-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8103/6779387/728f7812ba7a/krnb-16-11-1644590-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8103/6779387/c97621eb8c8a/krnb-16-11-1644590-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8103/6779387/e22fe28b89cf/krnb-16-11-1644590-g003.jpg
