A statistical analytical approach to decipher information from biological sequences: application to murine splice-site analysis and prediction.

Authors

Reddy B V, Pandit M W

Affiliation

Centre for Cellular and Molecular Biology, Hyderabad, India.

Publication

J Biomol Struct Dyn. 1995 Feb;12(4):785-801. doi: 10.1080/07391102.1995.10508776.

DOI: 10.1080/07391102.1995.10508776
PMID: 7779300

Abstract

A simple statistical approach for the analysis of biological sequences, such as splice-sites, promoter regions, helices and extended-structure-forming regions, or any other sequence-dependent functional entities in proteins, is presented. The approach has proved useful for developing a method to predict such entities in newly available sequences. We first search for invariant sequence features of each functional entity in the experimentally available sequences and identify a set of 'like' sequences with similar sequence features. In the next step, concrete features of the sequence entities, expressed as occurrences of smaller subsequences at various positions, are identified and used as a knowledge base to select potential functional entities from the identified 'like' sequences. The third step consists of refining this pattern learning, statistically improving the knowledge-base weight matrices, and finally applying them to predict functional entities in newly available sequences. Such an analysis is operationally described for murine splice-site predictions. Regions comprising -30 to +30 nucleotides from the splice-junction at the murine splice-sites (donors and acceptors), reported earlier, were analyzed. Invariant sequence-specific features in terms of monomer frequency average were used to identify splice-site-like sequences in the EMBL murine DNA sequence database. The frequencies of occurrence of mono-, di-, tri- and tetranucleotides in the known splice-sites were compared with those in the splice-site-like sequences; the significant differences in their occurrences were extracted as statistical knowledge coded in weight matrices for a computer to identify potential splice-sites. The algorithm was refined and a method was developed to predict potential splice-sites in a given murine DNA; the analysis was also extended to human DNA. The success rate of the method in predicting correct splice-sites in these species is found to be 80% and 85%, respectively. The major strength of this method lies in significantly reducing the number of false positives that are normally picked up in such analyses.
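
The weight-matrix idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact statistic, so the log-odds score with a pseudocount below is an assumption, as are the function names and the toy 8-nucleotide sequences (the paper analyzes 61-nucleotide windows from real EMBL data). The sketch counts k-mers at each position in known sites and in site-like background sequences, turns the differences into per-position weights, and sums the weights to score a candidate.

```python
import math
from collections import Counter


def positional_kmer_counts(seqs, k):
    """Count each k-mer at each start position across equal-length sequences."""
    n_pos = len(seqs[0]) - k + 1
    counts = [Counter() for _ in range(n_pos)]
    for s in seqs:
        for i in range(n_pos):
            counts[i][s[i:i + k]] += 1
    return counts


def weight_matrix(known, background, k, pseudo=1.0):
    """Per-position log-odds weights: known sites vs. site-like sequences.

    The log-odds form and the pseudocount are illustrative assumptions,
    not the statistic used in the paper.
    """
    fg = positional_kmer_counts(known, k)
    bg = positional_kmer_counts(background, k)
    n_kmers = 4 ** k  # number of possible DNA k-mers
    weights = []
    for f, b in zip(fg, bg):
        w = {}
        for kmer in set(f) | set(b):
            p = (f[kmer] + pseudo) / (len(known) + pseudo * n_kmers)
            q = (b[kmer] + pseudo) / (len(background) + pseudo * n_kmers)
            w[kmer] = math.log(p / q)
        weights.append(w)
    return weights


def score(seq, weights, k):
    """Sum the weights of the k-mers in seq; unseen k-mers contribute 0."""
    return sum(w.get(seq[i:i + k], 0.0) for i, w in enumerate(weights))


# Toy data (hypothetical): donor-like sequences containing GT vs. look-alikes.
known = ["CAGGTAAG", "AAGGTGAG", "CAGGTAAG", "TAGGTAAG"]
background = ["CAGCTAAG", "AAGCTGAG", "CTGATAAG", "TACGAAAG"]

weights = weight_matrix(known, background, k=2)
site_score = score("CAGGTAAG", weights, k=2)
decoy_score = score("CAGCTAAG", weights, k=2)
print(site_score > decoy_score)
```

Calling `weight_matrix` with `k` from 1 to 4 and adding the resulting scores would mirror the mono- to tetranucleotide analysis described above; how the original method combines or thresholds those matrices is not stated in the abstract.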
