The Protein Information Resource: an integrated public resource of functional annotation of proteins.

Authors

Wu Cathy H, Huang Hongzhan, Arminski Leslie, Castro-Alvear Jorge, Chen Yongxing, Hu Zhang-Zhi, Ledley Robert S, Lewis Kali C, Mewes Hans-Werner, Orcutt Bruce C, Suzek Baris E, Tsugita Akira, Vinayaka C R, Yeh Lai-Su L, Zhang Jian, Barker Winona C

Affiliation

National Biomedical Research Foundation, Georgetown University Medical Center, 3900 Reservoir Road, NW, Washington, DC 20007, USA.

Publication

Nucleic Acids Res. 2002 Jan 1;30(1):35-7. doi: 10.1093/nar/30.1.35.

DOI: 10.1093/nar/30.1.35

PMID: 11752247

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC99125/

Abstract

The Protein Information Resource (PIR) serves as an integrated public resource of functional annotation of protein data to support genomic/proteomic research and scientific discovery. The PIR, in collaboration with the Munich Information Center for Protein Sequences (MIPS) and the Japan International Protein Information Database (JIPID), produces the PIR-International Protein Sequence Database (PSD), the major annotated protein sequence database in the public domain, containing about 250 000 proteins. To improve protein annotation and the coverage of experimentally validated data, a bibliography submission system is developed for scientists to submit, categorize and retrieve literature information. Comprehensive protein information is available from iProClass, which includes family classification at the superfamily, domain and motif levels, structural and functional features of proteins, as well as cross-references to over 40 biological databases. To provide timely and comprehensive protein data with source attribution, we have introduced a non-redundant reference protein database, PIR-NREF. The database consists of about 800 000 proteins collected from PIR-PSD, SWISS-PROT, TrEMBL, GenPept, RefSeq and PDB, with composite protein names and literature data. To promote database interoperability, we provide XML data distribution and open database schema, and adopt common ontologies. The PIR web site (http://pir.georgetown.edu/) features data mining and sequence analysis tools for information retrieval and functional identification of proteins based on both sequence and annotation information. The PIR databases and other files are also available by FTP (ftp://nbrfa.georgetown.edu/pir_databases).
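The idea behind a non-redundant reference database with source attribution can be illustrated with a minimal sketch. The abstract does not state how redundancy is decided, so the code below assumes, purely for illustration, that two entries are redundant when they share the same sequence and the same organism. The function name, the record fields, and the toy identifiers are all hypothetical and are not PIR-NREF's actual schema or data.

```python
from collections import defaultdict


def build_nonredundant(entries):
    """Collapse entries that share a sequence and organism into one record.

    `entries` is an iterable of dicts with the hypothetical keys
    'source', 'id', 'name', 'organism' and 'sequence'.
    Assumption: identical sequence + identical organism means redundant.
    """
    grouped = defaultdict(list)
    for entry in entries:
        key = (entry["sequence"].upper(), entry["organism"].lower())
        grouped[key].append(entry)

    records = []
    for (sequence, organism), members in grouped.items():
        # Keep every distinct name, in first-seen order, as a composite name.
        names = list(dict.fromkeys(m["name"] for m in members))
        records.append({
            "organism": organism,
            "sequence": sequence,
            "composite_name": "; ".join(names),
            # Source attribution: database and identifier of each merged copy.
            "sources": [(m["source"], m["id"]) for m in members],
        })
    return records


# Toy input with made-up identifiers and sequences.
entries = [
    {"source": "PIR-PSD", "id": "X0001", "name": "kinase A",
     "organism": "Homo sapiens", "sequence": "MKTAYIAK"},
    {"source": "SWISS-PROT", "id": "Y0001", "name": "Protein kinase A",
     "organism": "Homo sapiens", "sequence": "MKTAYIAK"},
    {"source": "GenPept", "id": "Z0001", "name": "hypothetical protein",
     "organism": "Mus musculus", "sequence": "MKTAYIAK"},
]

nref = build_nonredundant(entries)
for record in nref:
    print(record["composite_name"], record["sources"])
```

In this sketch the two human entries merge into one record that keeps both names and both source identifiers, while the mouse entry stays separate because the organism differs.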
