
Computational discovery of human coding and non-coding transcripts with conserved splice sites.

Affiliation

Bioinformatics Group, Department of Computer Science, University of Leipzig, Leipzig, Germany.

Publication information

Bioinformatics. 2011 Jul 15;27(14):1894-900. doi: 10.1093/bioinformatics/btr314. Epub 2011 May 26.

DOI:10.1093/bioinformatics/btr314
PMID:21622663
Abstract

MOTIVATION

Long non-coding RNAs (lncRNAs) resemble protein-coding mRNAs but do not encode proteins. Most lncRNAs are under lower sequence constraints than protein-coding genes and lack conserved secondary structures, making it hard to predict them computationally.

RESULTS

We introduce an approach to predict spliced lncRNAs in vertebrate genomes combining comparative genomics and machine learning. It is based on detecting signatures of characteristic splice site evolution in vertebrate whole genome alignments. First, we predict individual splice sites, then assemble compatible sites into exon candidates, and finally predict multi-exon transcripts. Using a novel method to evaluate typical splice site substitution patterns that explicitly takes the species phylogeny into account, we show that individual splice sites can be accurately predicted. Since our approach relies only on predicted splice sites, it can uncover both coding and non-coding exons. We show that our predicted exons and partial transcripts are mostly non-coding and lack conserved secondary structures. These exons are of particular interest, since existing computational approaches cannot detect them. Transcriptome sequencing data indicate tissue-specific expression patterns of predicted exons and there is evidence that increasing sequencing depth and breadth will validate additional predictions. We also found a significant enrichment of predicted exons that form multi-exon transcript parts, and we experimentally validate such a novel multi-exon gene. Overall, we obtain 336 novel multi-exon transcript predictions from human intergenic regions. Our results indicate the existence of novel human transcripts that are conserved in evolution and our approach contributes to the completion of the human transcript catalog.
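The three-stage pipeline described above (splice sites, then exon candidates, then multi-exon transcripts) can be illustrated with a minimal sketch. The abstract does not specify the compatibility rules, so everything below is an assumption made for illustration: the `SpliceSite` and `Exon` types, the length limits `min_len` and `max_len`, the `max_intron` cutoff, and the greedy chaining are hypothetical and are not the authors' implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SpliceSite:
    chrom: str
    pos: int       # genomic coordinate of the exon boundary
    strand: str    # "+" or "-"
    kind: str      # "acceptor" (exon start) or "donor" (exon end), in transcript orientation
    score: float   # hypothetical per-site prediction score


@dataclass(frozen=True)
class Exon:
    chrom: str
    start: int
    end: int
    strand: str
    score: float


def assemble_exons(sites: List[SpliceSite], min_len: int = 30, max_len: int = 500) -> List[Exon]:
    """Pair each acceptor with every donor on the same chromosome and strand
    that lies downstream (in transcript orientation) within the length limits."""
    acceptors = [s for s in sites if s.kind == "acceptor"]
    donors = [s for s in sites if s.kind == "donor"]
    exons = []
    for a in acceptors:
        for d in donors:
            if (a.chrom, a.strand) != (d.chrom, d.strand):
                continue
            # On the minus strand the acceptor has the larger genomic coordinate.
            length = d.pos - a.pos if a.strand == "+" else a.pos - d.pos
            if min_len <= length <= max_len:
                start, end = min(a.pos, d.pos), max(a.pos, d.pos)
                exons.append(Exon(a.chrom, start, end, a.strand, a.score + d.score))
    return sorted(exons, key=lambda e: (e.chrom, e.strand, e.start))


def chain_exons(exons: List[Exon], max_intron: int = 10_000) -> List[List[Exon]]:
    """Greedily group consecutive, non-overlapping exons on the same chromosome
    and strand into multi-exon candidates. An overlapping or distant exon starts
    a new chain; chains with fewer than two exons are discarded."""
    chains, current = [], []
    for e in sorted(exons, key=lambda e: (e.chrom, e.strand, e.start)):
        same_locus = bool(current) and (e.chrom, e.strand) == (current[-1].chrom, current[-1].strand)
        if same_locus and 0 < e.start - current[-1].end <= max_intron:
            current.append(e)
        else:
            if len(current) >= 2:
                chains.append(current)
            current = [e]
    if len(current) >= 2:
        chains.append(current)
    return chains
```

A real pipeline would likely need to consider alternative chains through overlapping exon candidates rather than this single greedy pass; the sketch only shows how predicted sites alone, without any coding signal, are enough to propose exon and transcript structures.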

AVAILABILITY AND IMPLEMENTATION

Predicted human splice sites, exons and gene structures together with a Perl implementation of the tree-based log-odds scoring and a supplementary PDF file containing additional figures and tables are available at: http://www.bioinf.uni-leipzig.de/publications/supplements/10-010. The five experimentally confirmed partial transcript isoforms have been deposited in GenBank under accession numbers HM587422-HM587426.
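The abstract does not describe how the tree-based log-odds scoring works internally, and the authors' implementation is in Perl. As a rough illustration of the general idea of a phylogeny-aware log-odds score, the Python sketch below compares the likelihood of alignment columns under a slowly evolving "conserved" model against a faster "background" model, using Felsenstein's pruning algorithm. The Jukes-Cantor substitution model, the `slow` and `fast` rate values, the tree encoding, and all function names are assumptions chosen for illustration, not the published method.

```python
import math

BASES = ("A", "C", "G", "T")


def jc_prob(a, b, t):
    """Jukes-Cantor probability of base b at the end of a branch of length t,
    given base a at its start."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if a == b else 0.25 - 0.25 * e


def partial(node, column, rate):
    """Felsenstein pruning. Returns, for each base, the likelihood of the data
    below `node` given that `node` carries that base.
    A leaf is a species name (str); an internal node is a sequence of
    (child, branch_length) pairs."""
    if isinstance(node, str):
        obs = column.get(node, "N")
        # Gaps, Ns, and missing species are compatible with every base.
        return {b: 1.0 if obs not in BASES or obs == b else 0.0 for b in BASES}
    result = {b: 1.0 for b in BASES}
    for child, length in node:
        child_like = partial(child, column, rate)
        for b in BASES:
            result[b] *= sum(jc_prob(b, c, rate * length) * child_like[c] for c in BASES)
    return result


def column_likelihood(tree, column, rate):
    """Likelihood of one alignment column with a uniform base distribution at the root."""
    root = partial(tree, column, rate)
    return sum(0.25 * root[b] for b in BASES)


def log_odds(tree, columns, slow=0.2, fast=1.0):
    """Sum over columns of log L(conserved model) - log L(background model)."""
    return sum(
        math.log(column_likelihood(tree, col, slow))
        - math.log(column_likelihood(tree, col, fast))
        for col in columns
    )


# Hypothetical three-species tree with made-up branch lengths.
TREE = [([("human", 0.05), ("chimp", 0.05)], 0.1), ("mouse", 0.4)]
```

With this toy tree, a column that is identical in all three species, such as `{"human": "G", "chimp": "G", "mouse": "G"}`, receives a positive score, while a column that differs in every species receives a negative one. Differences between closely related species are penalized more heavily than differences across long branches, which is the sense in which such a score takes the phylogeny into account.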

