

Functional annotation by sequence-weighted structure alignments: statistical analysis and case studies from the Protein 3000 structural genomics project in Japan.

Authors

Standley Daron M, Toh Hiroyuki, Nakamura Haruki

Affiliation

Research Center for Structural and Functional Proteomics, Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita, Osaka 565-0871, Japan.

Publication information

Proteins. 2008 Sep;72(4):1333-51. doi: 10.1002/prot.22015.

DOI: 10.1002/prot.22015
PMID: 18384072

Abstract

A method to functionally annotate structural genomics targets, based on a novel structural alignment scoring function, is proposed. In the proposed score, position-specific scoring matrices are used to weight structurally aligned residue pairs to highlight evolutionarily conserved motifs. The functional form of the score is first optimized for discriminating domains belonging to the same Pfam family from domains belonging to different families but the same CATH or SCOP superfamily. In the optimization stage, we consider four standard weighting functions as well as our own, the "maximum substitution probability," and combinations of these functions. The optimized score achieves an area of 0.87 under the receiver-operating characteristic curve with respect to identifying Pfam families within a sequence-unique benchmark set of domain pairs. Confidence measures are then derived from the benchmark distribution of true-positive scores. The alignment method is next applied to the task of functionally annotating 230 query proteins released to the public as part of the Protein 3000 structural genomics project in Japan. Of these queries, 78 were found to align to templates with the same Pfam family as the query or had sequence identities ≥ 30%. Another 49 queries were found to match more distantly related templates. Within this group, the template predicted by our method to be the closest functional relative was often not the most structurally similar. Several nontrivial cases are discussed in detail. Finally, 103 queries matched templates at the fold level, but not the family or superfamily level, and remain functionally uncharacterized.
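The evaluation described in the abstract (an area under the ROC curve for separating same-family from different-family domain pairs, followed by confidence measures drawn from the true-positive score distribution) can be illustrated with a minimal sketch. This is not the authors' implementation: the scores below are invented toy values, the function names are hypothetical, and the confidence definition (the fraction of benchmark true-positive scores at or below a given score) is an assumption, since the abstract does not state the exact form used.

```python
from bisect import bisect_right


def roc_auc(positives, negatives):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    Equals the probability that a randomly chosen positive pair scores
    higher than a randomly chosen negative pair; ties count as one half.
    """
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def empirical_confidence(score, true_positive_scores):
    """Assumed confidence measure: the fraction of benchmark
    true-positive scores that are less than or equal to `score`."""
    ranked = sorted(true_positive_scores)
    return bisect_right(ranked, score) / len(ranked)


# Toy alignment scores (invented for illustration only).
same_family = [0.9, 0.8, 0.7, 0.4]        # pairs sharing a Pfam family
different_family = [0.6, 0.5, 0.3, 0.2]   # same superfamily, different family

auc = roc_auc(same_family, different_family)
conf = empirical_confidence(0.75, same_family)
print(f"AUC = {auc:.3f}, confidence at score 0.75 = {conf:.2f}")
```

The pairwise form is quadratic in the number of pairs, which is fine for a small illustration; a rank-based computation would be preferable for a benchmark of realistic size.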


Similar articles

1
Functional annotation by sequence-weighted structure alignments: statistical analysis and case studies from the Protein 3000 structural genomics project in Japan.
Proteins. 2008 Sep;72(4):1333-51. doi: 10.1002/prot.22015.
2
Assessing annotation transfer for genomics: quantifying the relations between protein sequence, structure and function through traditional and probabilistic scores.
J Mol Biol. 2000 Mar 17;297(1):233-49. doi: 10.1006/jmbi.2000.3550.
3
Estimating quality of template-based protein models by alignment stability.
Proteins. 2008 May 15;71(3):1255-74. doi: 10.1002/prot.21819.
4
Fast model-based protein homology detection without alignment.
Bioinformatics. 2007 Jul 15;23(14):1728-36. doi: 10.1093/bioinformatics/btm247. Epub 2007 May 8.
5
Accurate domain identification with structure-anchored hidden Markov models, saHMMs.
Proteins. 2009 Aug 1;76(2):343-52. doi: 10.1002/prot.22349.
6
SUPFAM: a database of sequence superfamilies of protein domains.
BMC Bioinformatics. 2004 Mar 15;5:28. doi: 10.1186/1471-2105-5-28.
7
Statistical potential-based amino acid similarity matrices for aligning distantly related protein sequences.
Proteins. 2006 Aug 15;64(3):587-600. doi: 10.1002/prot.21020.
8
Predicted role for the archease protein family based on structural and sequence analysis of TM1083 and MTH1598, two proteins structurally characterized through structural genomics efforts.
Proteins. 2004 Jul 1;56(1):19-27. doi: 10.1002/prot.20141.
9
AutoSCOP: automated prediction of SCOP classifications using unique pattern-class mappings.
Bioinformatics. 2007 May 15;23(10):1203-10. doi: 10.1093/bioinformatics/btm089. Epub 2007 Mar 22.
10
Protein structure mining using a structural alphabet.
Proteins. 2008 May 1;71(2):920-37. doi: 10.1002/prot.21776.

Cited by

1
Some reflections on a career in science and a note of thanks to the contributors of this Special Issue.
Biophys Rev. 2022 Dec 20;14(6):1223-1226. doi: 10.1007/s12551-022-01035-4. eCollection 2022 Dec.
2
Genomes to hits in silico - a country path today, a highway tomorrow: a case study of chikungunya.
Curr Pharm Des. 2013;19(26):4687-700. doi: 10.2174/13816128113199990379.
3
SeSAW: balancing sequence and structural information in protein functional mapping.
Bioinformatics. 2010 May 1;26(9):1258-9. doi: 10.1093/bioinformatics/btq116. Epub 2010 Mar 17.
4
A single polymorphic amino acid on Toxoplasma gondii kinase ROP16 determines the direct and strain-specific activation of Stat3.
J Exp Med. 2009 Nov 23;206(12):2747-60. doi: 10.1084/jem.20091703. Epub 2009 Nov 9.