Construction of validated, non-redundant composite protein sequence databases.

Authors

Bleasby A J, Wootton J C

Affiliation

Departments of Genetics and Biophysics, University of Leeds, UK.

Publication details

Protein Eng. 1990 Jan;3(3):153-9. doi: 10.1093/protein/3.3.153.

DOI: 10.1093/protein/3.3.153
PMID: 2330366

Abstract

A strategy has been developed for the construction of a validated, comprehensive composite protein sequence database. Entries are amalgamated from primary source databases by a largely automated set of processes in which redundant and trivially different entries are eliminated. A modular approach has been adopted to allow scientific judgement to be used at each stage of database processing and amalgamation. Source databases are assigned a priority depending on the quality of sequence validation and commenting. Rejection of entries from the lower priority database, in each pairwise comparison of databases, is carried out according to optionally defined redundancy criteria based on sequence segment mismatches. Efficient algorithms for this methodology are embodied in the COMPO software system. COMPO has been applied for over 2 years in construction and regular updating of the OWL composite protein sequence database from the source databases NBRF-PIR, SWISS-PROT, a GenBank translation retrieved from the feature tables, NBRF-NEW, NEWAT86, PSD-KYOTO and the sequences contained in the Brookhaven protein structure databank. OWL is part of the ISIS integrated data resource of protein sequence and structure [Akrigg et al. (1988) Nature, 335, 745-746]. The modular nature of the integration process greatly facilitates the frequent updating of OWL following releases of the source databases. The extent of redundancy in these sources is revealed by the comparison process. The advantages of a robust composite database for sequence similarity searching and information retrieval are discussed.
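The priority-based rejection step described in the abstract can be illustrated with a minimal sketch. This is not the COMPO implementation: the abstract does not spell out the redundancy criteria, so the rule used here (equal length and at most `max_mismatches` positional differences) is an assumption, as are the names `Entry`, `is_redundant`, and `build_composite`, and the toy sequences and accessions.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    source: str      # name of the source database
    accession: str   # identifier within that source (made up here)
    sequence: str    # amino-acid sequence in one-letter codes


def mismatches(a: str, b: str) -> int:
    """Count positional mismatches between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def is_redundant(candidate: Entry, kept: Entry, max_mismatches: int) -> bool:
    """Assumed criterion: same length and at most max_mismatches differences."""
    if len(candidate.sequence) != len(kept.sequence):
        return False
    return mismatches(candidate.sequence, kept.sequence) <= max_mismatches


def build_composite(sources: list[list[Entry]], max_mismatches: int = 0) -> list[Entry]:
    """Merge sources listed in descending priority order.

    An entry from a lower-priority source is rejected when it is redundant
    with an entry already kept from a higher-priority source.
    """
    composite: list[Entry] = []
    for entries in sources:
        # Compare only against entries kept from earlier (higher-priority)
        # sources, so entries within one source never reject each other.
        baseline = list(composite)
        for entry in entries:
            if not any(is_redundant(entry, k, max_mismatches) for k in baseline):
                composite.append(entry)
    return composite


# Toy data: the sequences are invented for illustration only.
pir = [Entry("NBRF-PIR", "A1", "MKTAYIAKQR"), Entry("NBRF-PIR", "A2", "GSHMLEDPVA")]
swiss = [Entry("SWISS-PROT", "P1", "MKTAYIAKQK"), Entry("SWISS-PROT", "P2", "MSTNPKPQRK")]

owl = build_composite([pir, swiss], max_mismatches=1)
print([e.accession for e in owl])  # ['A1', 'A2', 'P2']
```

With `max_mismatches=1`, entry `P1` differs from the higher-priority `A1` at a single position and is rejected as trivially different; with `max_mismatches=0` it would be retained. Passing the sources as an ordered list keeps the priority assignment explicit and makes it easy to add a newly released source.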


Similar articles

1
Construction of validated, non-redundant composite protein sequence databases.
Protein Eng. 1990 Jan;3(3):153-9. doi: 10.1093/protein/3.3.153.
2
OWL--a non-redundant composite protein sequence database.
Nucleic Acids Res. 1994 Sep;22(17):3574-7.
3
The Protein Information Resource: an integrated public resource of functional annotation of proteins.
Nucleic Acids Res. 2002 Jan 1;30(1):35-7. doi: 10.1093/nar/30.1.35.
4
The Protein Information Resource.
Nucleic Acids Res. 2003 Jan 1;31(1):345-7. doi: 10.1093/nar/gkg040.
5
The PIR-International Protein Sequence Database.
Nucleic Acids Res. 1999 Jan 1;27(1):39-43. doi: 10.1093/nar/27.1.39.
6
[Analysis, identification and correction of some errors of model refseqs appeared in NCBI Human Gene Database by in silico cloning and experimental verification of novel human genes].
Yi Chuan Xue Bao. 2004 May;31(5):431-43.
7
ProClass Protein Family Database.
Nucleic Acids Res. 1999 Jan 1;27(1):272-4. doi: 10.1093/nar/27.1.272.
8
Protein sequence annotation in the genome era: the annotation concept of SWISS-PROT+TREMBL.
Proc Int Conf Intell Syst Mol Biol. 1997;5:33-43.
9
A cross-reference table between the Protein Data Bank of macromolecular structures and the National Biomedical Research Foundation-Protein Identification Resource amino acid sequence data bank.
Protein Seq Data Anal. 1989 Jul;2(4):295-308.
10
Effective protein sequence comparison.
Methods Enzymol. 1996;266:227-58. doi: 10.1016/s0076-6879(96)66017-0.

Cited by

1
Transcriptome analysis identifies genes regulating self-compatibility, flowering time, and oil biosynthesis in Noug (Guizotia abyssinica).
Sci Rep. 2025 Sep 12;15(1):32475. doi: 10.1038/s41598-025-18728-x.
2
The first two complete mitochondrial genomes of (Hai B. Li & Hai L. Wei) D. Arora & J.L. Frank 2014 (Boletales: Boletaceae) and phylogenetic analysis.
Mitochondrial DNA B Resour. 2025 Jun 27;10(7):646-651. doi: 10.1080/23802359.2025.2511153. eCollection 2025.
3
Characterization and phylogenetic analysis of the complete mitochondrial genome of (Berk. & M.A. Curtis) Gilb. & Ryvarden, 1985 (Polyporales: Fomitopsidaceae).
Mitochondrial DNA B Resour. 2025 May 25;10(6):532-536. doi: 10.1080/23802359.2025.2509806. eCollection 2025.
4
The six whole mitochondrial genomes for the species: features, evolution and phylogeny.
IMA Fungus. 2025 Feb 28;16:e140572. doi: 10.3897/imafungus.16.140572. eCollection 2025.
5
Mitochondrial Genomes from Fungal the Entomopathogenic Genus Reveals Evolutionary History, Intron Dynamics and Phylogeny.
J Fungi (Basel). 2025 Jan 24;11(2):94. doi: 10.3390/jof11020094.
6
Phylogenetic Relationships of Three Species Based on Mitochondrial Genome Analysis.
Ecol Evol. 2025 Feb 12;15(2):e70901. doi: 10.1002/ece3.70901. eCollection 2025 Feb.
7
The first complete mitochondrial genome of var. (Pers.) Pat. 1926 (Hymenochaetales: Hymenochaetaceae) and phylogenetic analysis.
Mitochondrial DNA B Resour. 2024 Dec 8;9(12):1674-1678. doi: 10.1080/23802359.2024.2438275. eCollection 2024.
8
Comparative Mitogenomics Analysis Revealed Evolutionary Divergence among Species and Gene Arrangement and Intron Dynamics of Ophiocordycipitaceae.
Microorganisms. 2024 Oct 11;12(10):2053. doi: 10.3390/microorganisms12102053.
9
Analysis of the plant hormone expression profile during somatic embryogenesis induction in teak ().
Front Plant Sci. 2024 Oct 7;15:1429575. doi: 10.3389/fpls.2024.1429575. eCollection 2024.
10
Characterization of the complete mitochondrial genome of the medical fungus Boud., 1889 (Polyporales: taceae).
Mitochondrial DNA B Resour. 2024 Sep 30;9(10):1291-1297. doi: 10.1080/23802359.2024.2410449. eCollection 2024.