
The prediction of human exons by oligonucleotide composition and discriminant analysis of spliceable open reading frames.

Authors

Solovyev V V, Salamov A A, Lawrence C B

Affiliation

Department of Cell Biology, Baylor College of Medicine, Houston, TX 77030, USA.

Publication

Proc Int Conf Intell Syst Mol Biol. 1994;2:354-62.

PMID:7584412
Abstract

Discriminant analysis is applied to the problem of recognizing 5'-, internal and 3'-exons in human DNA sequences. Specific recognition functions were developed for revealing exons of particular types. The method is based on a splice site prediction algorithm that uses the linear Fisher discriminant to combine information about significant triplet frequencies in various functional parts of splice site regions with oligonucleotide preferences in protein coding and intron regions (Solovyev, Lawrence, 1994). The accuracy of our splice site recognition function is about 97%. A discriminant function for 5'-exon prediction includes hexanucleotide composition of the upstream region, triplet composition around the ATG codon, ORF coding potential, donor splice site potential and composition of the downstream intron region. For internal exon prediction, we combine in a discriminant function the characteristics describing the 5'-intron region, donor splice site, coding region, acceptor splice site and 3'-intron region for each open reading frame flanked by GT and AG base pairs. The accuracy of precise internal exon recognition on a test set of 451 exon and 246693 pseudoexon sequences is 77% with a specificity of 79% and a level of pseudoexon ORF prediction of 99.96%. The recognition quality computed at the level of individual nucleotides is 89% for exon sequences and 98% for intron sequences. A discriminant function for 3'-exon prediction includes octanucleotide composition of the upstream intron region, triplet composition around the stop codon, ORF coding potential, acceptor splice site potential and hexanucleotide composition of the downstream region. (ABSTRACT TRUNCATED AT 250 WORDS)
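The abstract scores "each open reading frame flanked by GT and AG", which implies a candidate-generation step before any discriminant function is applied. The following is a minimal sketch of such a step, not the authors' implementation. It assumes that a candidate internal exon starts right after an AG dinucleotide (acceptor), ends right before a GT dinucleotide (donor), and is open in at least one reading frame. The function names `open_frames` and `candidate_internal_exons` and the `min_len` threshold are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: enumerate candidate internal exons as
# AG ... GT flanked segments that are open in at least one frame.
# Names and the min_len default are assumptions, not from the paper.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def open_frames(segment):
    """Return the frame offsets (0, 1, 2) with no in-frame stop codon."""
    frames = []
    for offset in range(3):
        # Walk complete codons only, starting at this frame offset.
        codons = (segment[k:k + 3] for k in range(offset, len(segment) - 2, 3))
        if not any(codon in STOP_CODONS for codon in codons):
            frames.append(offset)
    return frames


def candidate_internal_exons(seq, min_len=6):
    """Return (start, end, frames) tuples; seq[start:end] is the candidate."""
    seq = seq.upper()
    # Exon starts immediately after an AG acceptor dinucleotide.
    starts = [i + 2 for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    # Exon ends immediately before a GT donor dinucleotide.
    ends = [j for j in range(len(seq) - 1) if seq[j:j + 2] == "GT"]
    candidates = []
    for start in starts:
        for end in ends:
            if end - start < min_len:
                continue  # too short (or donor lies upstream of acceptor)
            frames = open_frames(seq[start:end])
            if frames:
                candidates.append((start, end, frames))
    return candidates


if __name__ == "__main__":
    print(candidate_internal_exons("CCAGATGAAACCCGTAA"))
```

Because every AG and GT pair is considered, the number of candidates grows quickly with sequence length, which is consistent with the test set in the abstract containing far more pseudoexons (246693) than real exons (451). In the method described, each candidate would then be scored by a discriminant function; that scoring is not shown here.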


Similar articles

1
The prediction of human exons by oligonucleotide composition and discriminant analysis of spliceable open reading frames.
Proc Int Conf Intell Syst Mol Biol. 1994;2:354-62.
2
Predicting internal exons by oligonucleotide composition and discriminant analysis of spliceable open reading frames.
Nucleic Acids Res. 1994 Dec 11;22(24):5156-63. doi: 10.1093/nar/22.24.5156.
3
Identification of human gene functional regions based on oligonucleotide composition.
Proc Int Conf Intell Syst Mol Biol. 1993;1:371-9.
4
Identification of human gene structure using linear discriminant functions and dynamic programming.
Proc Int Conf Intell Syst Mol Biol. 1995;3:367-75.
5
The prediction of exons through an analysis of spliceable open reading frames.
Nucleic Acids Res. 1992 Jul 11;20(13):3453-62. doi: 10.1093/nar/20.13.3453.
6
[Analysis, identification and correction of some errors of model refseqs appeared in NCBI Human Gene Database by in silico cloning and experimental verification of novel human genes].
Yi Chuan Xue Bao. 2004 May;31(5):431-43.
7
Recognizing exons in genomic sequence using GRAIL II.
Genet Eng (N Y). 1994;16:241-53.
8
Determination of eukaryotic protein coding regions using neural networks and information theory.
J Mol Biol. 1992 Jul 20;226(2):471-9. doi: 10.1016/0022-2836(92)90961-i.
9
Characterization of hprt splicing mutations induced by the ultimate carcinogenic metabolite of benzo[a]pyrene in Chinese hamster V-79 cells.
Cancer Res. 1995 Apr 1;55(7):1550-8.
10
Classification of splice-junction sequences via weighted position specific scoring approach.
Comput Biol Chem. 2010 Dec;34(5-6):293-9. doi: 10.1016/j.compbiolchem.2010.10.003. Epub 2010 Oct 14.

Cited by

1
Architecture and Distribution of Introns in Core Genes of Four Species.
G3 (Bethesda). 2017 Nov 6;7(11):3809-3820. doi: 10.1534/g3.117.300344.
2
A beginner's guide to eukaryotic genome annotation.
Nat Rev Genet. 2012 Apr 18;13(5):329-42. doi: 10.1038/nrg3174.
3
Peptide vocabulary analysis reveals ultra-conservation and homonymity in protein sequences.
Bioinform Biol Insights. 2009 Nov 24;1:101-26. doi: 10.4137/bbi.s415.
4
Multiple splicing defects in an intronic false exon.
Mol Cell Biol. 2000 Sep;20(17):6414-25. doi: 10.1128/MCB.20.17.6414-6425.2000.