Identification of protein coding regions by database similarity search.

Authors

Gish W, States D J

Affiliation

National Center for Biotechnology Information, National Library of Medicine, Bethesda, Maryland 20894-0001.

Publication

Nat Genet. 1993 Mar;3(3):266-72. doi: 10.1038/ng0393-266.

DOI: 10.1038/ng0393-266
PMID: 8485583

Abstract

Sequence similarity between a translated nucleotide sequence and a known biological protein can provide strong evidence for the presence of a homologous coding region, even between distantly related genes. The computer program BLASTX performed conceptual translation of a nucleotide query sequence followed by a protein database search in one programmatic step. We characterized the sensitivity of BLASTX recognition to the presence of substitution, insertion and deletion errors in the query sequence and to sequence divergence. Reading frames were reliably identified in the presence of 1% query errors, a rate that is typical for primary sequence data. BLASTX is appropriate for use in moderate and large scale sequencing projects at the earliest opportunity, when the data are most prone to containing errors.
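
The "conceptual translation" step can be illustrated with a short Python sketch. It is not the BLASTX implementation: it assumes the standard genetic code and translates a nucleotide query in all six reading frames (three per strand), which is how BLASTX is usually described, and it leaves out the protein database search entirely. The function names and the toy sequences are hypothetical and chosen only for illustration.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), with codons ordered
# TTT, TTC, TTA, TTG, TCT, ... so the third base varies fastest.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Map frame labels +1..+3 (forward) and -1..-3 (reverse) to peptides."""
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(seq[offset:])
        frames[-(offset + 1)] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    clean = "ATGGCCAAATGA"        # hypothetical error-free query
    with_error = "ATGGGCCAAATGA"  # same query with one inserted G
    print(six_frame_translations(clean)[1])       # frame +1 of the clean query
    print(six_frame_translations(with_error)[1])  # frame +1 after the insertion
    print(six_frame_translations(with_error)[2])  # frame +2 after the insertion
```

In the toy query, the single inserted base moves the downstream residues from frame +1 to frame +2. Searching every frame is what allows a protein match to be picked up on either side of such an insertion or deletion error.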


