Comparison of DNA sequences with protein sequences.

Authors

Pearson W R, Wood T, Zhang Z, Miller W

Affiliation

Department of Biochemistry, University of Virginia, Charlottesville 22908, USA.

Publication

Genomics. 1997 Nov 15;46(1):24-36. doi: 10.1006/geno.1997.4995.

DOI:10.1006/geno.1997.4995
PMID:9403055
Abstract

The FASTA package of sequence comparison programs has been expanded to include FASTX and FASTY, which compare a DNA sequence to a protein sequence database, translating the DNA sequence in three frames and aligning the translated DNA sequence to each sequence in the protein database, allowing gaps and frameshifts. Also new are TFASTX and TFASTY, which compare a protein sequence to a DNA sequence database, translating each sequence in the DNA database in six frames and scoring alignments with gaps and frameshifts. FASTX and TFASTX allow only frameshifts between codons, while FASTY and TFASTY allow substitutions or frameshifts within a codon. We examined the performance of FASTX and FASTY using different gap-opening, gap-extension, frameshift, and nucleotide substitution penalties. In general, FASTX and FASTY perform equivalently when query sequences contain 0-10% errors. We also evaluated the statistical estimates reported by FASTX and FASTY. These estimates are quite accurate, except when an out-of-frame translation produces a low-complexity protein sequence. We used FASTX to scan the Mycoplasma genitalium, Haemophilus influenzae, and Methanococcus jannaschii genomes for unidentified or misidentified protein-coding genes. We found at least 9 new protein-coding genes in the three genomes and at least 35 genes with potentially incorrect boundaries.
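
The three-frame and six-frame translation that the abstract describes can be illustrated with a minimal Python sketch. This is not code from the FASTA package: the function names (`translate`, `three_frame`, `six_frame`) are hypothetical, the standard genetic code is assumed, and the gapped, frameshift-aware alignment that FASTX and FASTY perform is not implemented here.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order (assumed here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna, frame=0):
    """Translate one forward frame (0, 1, or 2); unknown codons become 'X'."""
    dna = dna.upper()
    # Stop at len(dna) - 2 so that only complete codons are read.
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def three_frame(dna):
    """Forward-strand translations, as for a FASTX/FASTY-style DNA query."""
    return [translate(dna, frame) for frame in range(3)]


def six_frame(dna):
    """Both strands, as for a TFASTX/TFASTY-style DNA database entry."""
    return three_frame(dna) + three_frame(reverse_complement(dna))
```

For the query `ATGGCCTAA`, `three_frame` returns `MA*`, `WP`, and `GL`. A single inserted or deleted base moves the downstream reading from one of these frames to another, which is why an alignment that permits frameshifts can still follow the correct protein across a sequencing error.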

Similar articles

1
Comparison of DNA sequences with protein sequences.
Genomics. 1997 Nov 15;46(1):24-36. doi: 10.1006/geno.1997.4995.
2
[Analysis, identification and correction of some errors of model refseqs appeared in NCBI Human Gene Database by in silico cloning and experimental verification of novel human genes].
Yi Chuan Xue Bao. 2004 May;31(5):431-43.
3
transAlign: using amino acids to facilitate the multiple alignment of protein-coding DNA sequences.
BMC Bioinformatics. 2005 Jun 22;6:156. doi: 10.1186/1471-2105-6-156.
4
A tool for analyzing and annotating genomic sequences.
Genomics. 1997 Nov 15;46(1):37-45. doi: 10.1006/geno.1997.4984.
5
Detecting frame shifts by amino acid sequence comparison.
J Mol Biol. 1993 Dec 20;234(4):1140-57. doi: 10.1006/jmbi.1993.1666.
6
Using the FASTA program to search protein and DNA sequence databases.
Methods Mol Biol. 1994;25:365-89. doi: 10.1385/0-89603-276-0:365.
7
Lessons from sequenced genomes. Overlapping genes in Methanococcus jannaschii?
IUBMB Life. 2000 Feb;49(2):121-3. doi: 10.1080/15216540050022430.
8
Empirical statistical estimates for sequence similarity searches.
J Mol Biol. 1998 Feb 13;276(1):71-84. doi: 10.1006/jmbi.1997.1525.
9
STORM towards protein function: systematic tailored ORF-data retrieval and management.
Appl Bioinformatics. 2003;2(3):177-9.
10
Evaluation of algorithms used for cross-species proteome characterisation.
Electrophoresis. 1997 Aug;18(8):1410-7. doi: 10.1002/elps.1150180816.

Cited by

1
GIN-CRC-Pareto: A graph-based Pareto-optimal multi-task learning framework to identify miRNA-target interactions in colorectal cancer.
bioRxiv. 2025 Aug 12:2025.08.10.669528. doi: 10.1101/2025.08.10.669528.
2
Comparative genomics of .
J Bacteriol. 2025 Aug 21;207(8):e0014925. doi: 10.1128/jb.00149-25. Epub 2025 Jul 25.
3
De novo gene birth and the conundrum of ORFan genes in bacteria.
Genome Res. 2025 Aug 1;35(8):1679-1688. doi: 10.1101/gr.280157.124.
4
An insight into the draft genome of the Oriental rat flea, Xenopsylla cheopis, together with its Wolbachia endosymbiont.
BMC Genomics. 2025 Jul 1;26(1):621. doi: 10.1186/s12864-025-11759-8.
5
Optimised Ribosome Profiling Reveals New Insights Into Translational Regulation in Synchronised Chlamydomonas reinhardtii Cultures.
Plant Cell Environ. 2025 Sep;48(9):6982-7000. doi: 10.1111/pce.15681. Epub 2025 Jun 11.
6
Community differences and potential function along the particle size spectrum of microbes in the twilight zone.
Microbiome. 2025 May 14;13(1):121. doi: 10.1186/s40168-025-02116-8.
7
Functionally characterizing obesity-susceptibility genes using CRISPR/Cas9, in vivo imaging and deep learning.
Sci Rep. 2025 Feb 13;15(1):5408. doi: 10.1038/s41598-025-89823-2.
8
Co-translational protein aggregation and ribosome stalling as a broad-spectrum antibacterial mechanism.
Nat Commun. 2025 Feb 12;16(1):1561. doi: 10.1038/s41467-025-56873-z.
9
Identification and Analysis of Circular RNAs in Mammary Gland from Yaks Between Lactation and Dry Period.
Animals (Basel). 2025 Jan 3;15(1):89. doi: 10.3390/ani15010089.
10
Gra-CRC-miRTar: The pre-trained nucleotide-to-graph neural networks to identify potential miRNA targets in colorectal cancer.
Comput Struct Biotechnol J. 2024 Jul 18;23:3020-3029. doi: 10.1016/j.csbj.2024.07.014. eCollection 2024 Dec.