Predicting internal exons by oligonucleotide composition and discriminant analysis of spliceable open reading frames.

Authors

Solovyev V V, Salamov A A, Lawrence C B

Affiliation

Department of Cell Biology, Baylor College of Medicine, Houston, TX 77030.

Publication

Nucleic Acids Res. 1994 Dec 11;22(24):5156-63. doi: 10.1093/nar/22.24.5156.

Abstract

A new method that predicts internal exon sequences in human DNA has been developed. The method is based on a splice site prediction algorithm that uses a linear discriminant function to combine information about significant triplet frequencies of various functional parts of splice site regions and preferences of oligonucleotides in protein coding and intron regions. The accuracy of our splice site recognition function is 97% for donor splice sites and 96% for acceptor splice sites. For exon prediction, we combine in a discriminant function the characteristics describing the 5'-intron region, donor splice site, coding region, acceptor splice site and 3'-intron region for each open reading frame flanked by GT and AG dinucleotides. The accuracy of precise internal exon recognition on a test set of 451 exon and 246693 pseudoexon sequences is 77% with a specificity of 79%. The recognition quality computed at the level of individual nucleotides is 89% for exon sequences and 98% for intron sequences. This corresponds to a correlation coefficient for exon prediction of 0.87. The precision of this approach is better than that of other methods, and it has been tested on a larger data set. We have also developed a means for predicting exon-exon junctions in cDNA sequences, which can be useful for selecting optimal PCR primers.
