In silico structural and functional analysis of the human cytomegalovirus (HHV5) genome.

Author information

Novotny J, Rigoutsos I, Coleman D, Shenk T

Affiliation

Victor Chang Cardiac Research Institute, Darlinghurst, NSW, Australia.

Publication information

J Mol Biol. 2001 Jul 27;310(5):1151-66. doi: 10.1006/jmbi.2001.4798.

Abstract

The open reading frames of human cytomegalovirus (human herpesvirus-5, HHV5) encode some 213 unique proteins with mostly unknown functions. Using the threading program ProCeryon, we calculated possible matches between the amino acid sequences of these proteins and the Protein Data Bank library of three-dimensional structures. Thirty-six proteins were fully identified in terms of their structure and, often, function; 65 proteins were recognized as members of narrow structural/functional families (e.g. DNA-binding factors, cytokines, enzymes, signaling particles, cell surface receptors, etc.); and 87 proteins were assigned to broad structural classes (e.g. all-β, 3-layer αβα, multidomain, etc.). Genes encoding proteins with similar folds, or sharing structural traits (extreme sequence length, runs of unstructured (Pro- and/or Gly-rich) residues, transmembrane segments, etc.), often formed tandem clusters throughout the genome. In the course of this work, benchmarks on about 20 known folds were used to optimize adjustable parameters of the threading calculations, i.e. gap penalty weights used in sequence/structure alignments; new scores obtained as simple combinations of existing scoring functions; and the number of threading runs conducive to meaningful results. The introduction of summed, per-residue-normalized scores was essential for the discovery of subdomains (EGF-like, SH2, SH3) in longer protein sequences, such as the eight "open sandwich" cytokine domains, 60-70 amino acids long and having the 3β1α fold with one or two disulfide bridges, present in otherwise unrelated proteins.
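The idea of a summed, per-residue-normalized score can be illustrated with a minimal sketch. This is not ProCeryon's scoring scheme, which the abstract does not specify. The class `ThreadingHit`, the term names `pair` and `surface`, the numeric values, and the convention that higher scores are better are all assumptions made for illustration only. The sketch shows why dividing a summed score by the aligned length can bring a short domain match ahead of a long, diffuse one.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ThreadingHit:
    """One sequence-to-structure match (hypothetical container, not a ProCeryon type)."""
    fold: str                 # label of the matched template fold
    aligned_length: int       # number of query residues aligned to the template
    scores: dict[str, float]  # raw values of individual scoring terms (names are made up)


def summed_score(hit: ThreadingHit) -> float:
    """Plain sum of all scoring terms; assumes higher means a better match."""
    return sum(hit.scores.values())


def summed_per_residue_score(hit: ThreadingHit) -> float:
    """Summed score divided by the aligned length, so hits of different sizes compare fairly."""
    if hit.aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return summed_score(hit) / hit.aligned_length


def rank_hits(hits: list[ThreadingHit]) -> list[ThreadingHit]:
    """Order hits from best to worst by the per-residue-normalized score."""
    return sorted(hits, key=summed_per_residue_score, reverse=True)


# Illustrative values only: a long, weak match and a short, strong domain match.
hits = [
    ThreadingHit("long_multidomain", aligned_length=400,
                 scores={"pair": 80.0, "surface": 40.0}),
    ThreadingHit("short_domain", aligned_length=60,
                 scores={"pair": 30.0, "surface": 15.0}),
]

best_raw = max(hits, key=summed_score)
best_normalized = rank_hits(hits)[0]
print(best_raw.fold, best_normalized.fold)
```

With these made-up numbers the raw sum favors the long match, while the normalized score favors the 60-residue domain, which is the kind of short subdomain signal the abstract says the normalization helped to uncover.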
