WISCOD: a statistical web-enabled tool for the identification of significant protein coding regions.

Authors

Vilardell Mireia, Parra Genis, Civit Sergi

Affiliations

Department of Vertebrate Genomics, Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, 14195 Berlin, Germany.

Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, 04103 Leipzig, Germany.

Publication information

Biomed Res Int. 2014;2014:282343. doi: 10.1155/2014/282343. Epub 2014 Sep 15.

DOI: 10.1155/2014/282343
PMID: 25313355
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4181902/
Abstract

Classically, gene prediction programs detect signals in the DNA sequence, such as boundary sites (splice sites, starts, and stops) and coding regions, in order to build potential exons and join them into a gene structure. Although their performance can now be improved with additional information from related species and/or cDNA databases, further improvement at any step could help to obtain better predictions. Here, we present WISCOD, a web-enabled tool for the identification of significant protein coding regions, which tackles the exon prediction problem in eukaryotic genomes. WISCOD can detect real exons in large lists of potential exons, and it provides an easy-to-use global P value, called the expected probability of being a false exon (EPFE), that is useful for ranking potential exons in a probabilistic framework without additional computational cost. The advantage of our approach is that it significantly increases specificity and sensitivity (both between 80% and 90%) in comparison to other ab initio methods (which are in the range of 70-75%). WISCOD is written in Java and R and can be downloaded and run locally on Linux and Windows platforms.
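
To make the ranking and evaluation ideas in the abstract concrete, here is a minimal Python sketch. It does not reproduce how WISCOD computes EPFE, which the abstract does not describe. The `Candidate` class, the score values, the labels, and the 0.05 cutoff are all invented for illustration. The sketch only shows how candidates could be ranked by a P-value-like score (lower means less likely to be a false exon) and how sensitivity and specificity could be measured on a labeled benchmark. It uses the standard statistical definition of specificity, TN / (TN + FP); gene-prediction papers often report TP / (TP + FP) under the same name, and the abstract does not say which one is meant.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A potential exon with a hypothetical EPFE-like score (not WISCOD output)."""
    name: str
    epfe: float    # assumed score in [0, 1]; lower = less likely to be a false exon
    is_real: bool  # ground-truth label, available only in a benchmark set


def rank_by_epfe(candidates):
    """Return candidates ordered from most to least likely to be a real exon."""
    return sorted(candidates, key=lambda c: c.epfe)


def sensitivity_specificity(candidates, threshold):
    """Call a candidate 'real' when epfe <= threshold, then score the calls."""
    tp = fn = tn = fp = 0
    for c in candidates:
        predicted_real = c.epfe <= threshold
        if c.is_real and predicted_real:
            tp += 1
        elif c.is_real:
            fn += 1
        elif predicted_real:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Toy benchmark; every value here is made up for the example.
candidates = [
    Candidate("exon_a", 0.01, True),
    Candidate("exon_b", 0.40, False),
    Candidate("exon_c", 0.03, True),
    Candidate("exon_d", 0.20, True),
    Candidate("exon_e", 0.70, False),
]

ranked = rank_by_epfe(candidates)
sens, spec = sensitivity_specificity(candidates, threshold=0.05)
print([c.name for c in ranked])
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Raising the cutoff in this toy set recovers `exon_d` and so increases sensitivity, at the risk of admitting false exons; a ranked score lets that trade-off be tuned without rescoring the candidates.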


Images from the PMC full text:
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/78d1/4181902/0981afa7b0e9/BMRI2014-282343.001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/78d1/4181902/02c740469d9f/BMRI2014-282343.002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/78d1/4181902/337a43239aca/BMRI2014-282343.003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/78d1/4181902/577503afbfdc/BMRI2014-282343.004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/78d1/4181902/5d96662be3eb/BMRI2014-282343.005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/78d1/4181902/b35b840b73cf/BMRI2014-282343.006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/78d1/4181902/cc50a5dc2faa/BMRI2014-282343.007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/78d1/4181902/f316d3130d08/BMRI2014-282343.008.jpg

Similar articles

1. WISCOD: a statistical web-enabled tool for the identification of significant protein coding regions.
Biomed Res Int. 2014;2014:282343. doi: 10.1155/2014/282343. Epub 2014 Sep 15.
2. Positional characterisation of false positives from computational prediction of human splice sites.
Nucleic Acids Res. 2000 Feb 1;28(3):744-54. doi: 10.1093/nar/28.3.744.
3. EGPred: prediction of eukaryotic genes using ab initio methods after combining with sequence similarity approaches.
Genome Res. 2004 Sep;14(9):1756-66. doi: 10.1101/gr.2524704.
4. Ab initio gene finding in Drosophila genomic DNA.
Genome Res. 2000 Apr;10(4):516-22. doi: 10.1101/gr.10.4.516.
5. The prediction of exons through an analysis of spliceable open reading frames.
Nucleic Acids Res. 1992 Jul 11;20(13):3453-62. doi: 10.1093/nar/20.13.3453.
6. Predicting internal exons by oligonucleotide composition and discriminant analysis of spliceable open reading frames.
Nucleic Acids Res. 1994 Dec 11;22(24):5156-63. doi: 10.1093/nar/22.24.5156.
7. Ancient evolutionary signals of protein-coding sequences allow the discovery of new genes in the Drosophila melanogaster genome.
BMC Genomics. 2020 Mar 5;21(1):210. doi: 10.1186/s12864-020-6632-y.
8. NemaFootPrinter: a web based software for the identification of conserved non-coding genome sequence regions between C. elegans and C. briggsae.
BMC Bioinformatics. 2005 Dec 1;6 Suppl 4(Suppl 4):S22. doi: 10.1186/1471-2105-6-S4-S22.
9. uPEPperoni: an online tool for upstream open reading frame location and analysis of transcript conservation.
BMC Bioinformatics. 2014 Feb 1;15:36. doi: 10.1186/1471-2105-15-36.
10. Predicting mutually exclusive spliced exons based on exon length, splice site and reading frame conservation, and exon sequence homology.
BMC Bioinformatics. 2011 Jun 30;12:270. doi: 10.1186/1471-2105-12-270.
