SAHG, a comprehensive database of predicted structures of all human proteins.

Authors

Motono Chie, Nakata Junichi, Koike Ryotaro, Shimizu Kana, Shirota Matsuyuki, Amemiya Takayuki, Tomii Kentaro, Nagano Nozomi, Sakaya Naofumi, Misoo Kiyotaka, Sato Miwa, Kidera Akinori, Hiroaki Hidekazu, Shirai Tsuyoshi, Kinoshita Kengo, Noguchi Tamotsu, Ota Motonori

Affiliation

Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo 135-0064, Japan.

Publication information

Nucleic Acids Res. 2011 Jan;39(Database issue):D487-93. doi: 10.1093/nar/gkq1057. Epub 2010 Nov 3.

DOI:10.1093/nar/gkq1057

PMID:21051360

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3013665/

Abstract

Most proteins from higher organisms are known to be multi-domain proteins and contain substantial numbers of intrinsically disordered (ID) regions. To analyse such protein sequences, those from human for instance, we developed a special protein-structure-prediction pipeline and accumulated the products in the Structure Atlas of Human Genome (SAHG) database at http://bird.cbrc.jp/sahg. With the pipeline, human proteins were examined by local alignment methods (BLAST, PSI-BLAST and Smith-Waterman profile-profile alignment), global-local alignment methods (FORTE) and prediction tools for ID regions (POODLE-S) and homology modeling (MODELLER). Conformational changes of protein models upon ligand-binding were predicted by simultaneous modeling using templates of apo and holo forms. When there were no suitable templates for holo forms and the apo models were accurate, we prepared holo models using prediction methods for ligand-binding (eF-seek) and conformational change (the elastic network model and the linear response theory). Models are displayed as animated images. As of July 2010, SAHG contains 42,581 protein-domain models in approximately 24,900 unique human protein sequences from the RefSeq database. Annotation of models with functional information and links to other databases such as EzCatDB, InterPro or HPRD are also provided to facilitate understanding the protein structure-function relationships.
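
As an illustration of the local alignment step named in the abstract, the following is a minimal sketch of plain sequence-to-sequence Smith-Waterman scoring. It is not the SAHG pipeline's implementation: the abstract describes a profile-profile variant, whereas this sketch compares two raw sequences. The function name and the scoring parameters (`match`, `mismatch`, `gap`) are illustrative assumptions.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 2,
                         mismatch: int = -1,
                         gap: int = -2) -> int:
    """Return the best local alignment score between sequences a and b.

    Uses a linear gap penalty and a simple match/mismatch score.
    All scoring values are hypothetical defaults chosen for illustration.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1]
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    # Two sequences sharing the local segment "ACGT"
    print(smith_waterman_score("XXACGTXX", "YYACGTYY"))
```

Resetting negative cells to zero is what makes the alignment local: a well-matching segment, such as a single domain inside a multi-domain protein, can score highly even when the flanking regions do not align.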

Similar articles

MODBASE, a database of annotated comparative protein structure models, and associated resources.

Nucleic Acids Res. 2004 Jan 1;32(Database issue):D217-22. doi: 10.1093/nar/gkh095.

EzCatDB: the enzyme reaction database, 2015 update.

Nucleic Acids Res. 2015 Jan;43(Database issue):D453-8. doi: 10.1093/nar/gku946. Epub 2014 Oct 16.

ProDom: automated clustering of homologous domains.

Brief Bioinform. 2002 Sep;3(3):246-51. doi: 10.1093/bib/3.3.246.

EzCatDB: the Enzyme Catalytic-mechanism Database.

Nucleic Acids Res. 2005 Jan 1;33(Database issue):D407-12. doi: 10.1093/nar/gki080.

MODBASE, a database of annotated comparative protein structure models.

Nucleic Acids Res. 2002 Jan 1;30(1):255-9. doi: 10.1093/nar/30.1.255.

MODBASE: a database of annotated comparative protein structure models and associated resources.

Nucleic Acids Res. 2006 Jan 1;34(Database issue):D291-5. doi: 10.1093/nar/gkj059.

ModBase, a database of annotated comparative protein structure models, and associated resources.

Nucleic Acids Res. 2011 Jan;39(Database issue):D465-74. doi: 10.1093/nar/gkq1091. Epub 2010 Nov 19.

Domain-based small molecule binding site annotation.

BMC Bioinformatics. 2006 Mar 17;7:152. doi: 10.1186/1471-2105-7-152.

Accidental interaction between PDZ domains and diclofenac revealed by NMR-assisted virtual screening.

Molecules. 2013 Aug 9;18(8):9567-81. doi: 10.3390/molecules18089567.

Cited by

Epitranscriptomics and epiproteomics in cancer drug resistance: therapeutic implications.

Signal Transduct Target Ther. 2020 Sep 8;5(1):193. doi: 10.1038/s41392-020-00300-w.

Discovery of Potent Disheveled/Dvl Inhibitors Using Virtual Screening Optimized With NMR-Based Docking Performance Index.

Front Pharmacol. 2018 Sep 5;9:983. doi: 10.3389/fphar.2018.00983. eCollection 2018.

KampoDB, database of predicted targets and functional annotations of natural medicines.

Sci Rep. 2018 Jul 25;8(1):11216. doi: 10.1038/s41598-018-29516-1.

Distinct distributions of genomic features of the 5' and 3' partners of coding somatic cancer gene fusions: arising mechanisms and functional implications.

Oncotarget. 2016 Jul 20;8(40):66769-66783. doi: 10.18632/oncotarget.10734. eCollection 2017 Sep 15.

Proteome-wide prediction of targets for aspirin: new insight into the molecular mechanism of aspirin.

PeerJ. 2016 Mar 10;4:e1791. doi: 10.7717/peerj.1791. eCollection 2016.

SDS, a structural disruption score for assessment of missense variant deleteriousness.

Front Genet. 2014 Apr 21;5:82. doi: 10.3389/fgene.2014.00082. eCollection 2014.

Accidental interaction between PDZ domains and diclofenac revealed by NMR-assisted virtual screening.

Molecules. 2013 Aug 9;18(8):9567-81. doi: 10.3390/molecules18089567.
