Mallick P, Goodwill K E, Fitz-Gibbon S, Miller J H, Eisenberg D
UCLA-DOE Laboratory of Structural Biology and Molecular Medicine, Department of Chemistry and Biochemistry, Molecular Biology Institute, Box 951570, University of California, Los Angeles, CA 90095-1570, USA.
Proc Natl Acad Sci U S A. 2000 Mar 14;97(6):2450-5. doi: 10.1073/pnas.050589297.
Three-dimensional protein folds were assigned to all ORFs of the recently sequenced genome of the hyperthermophilic archaeon Pyrobaculum aerophilum. Binary hypothesis testing was used to estimate a confidence level for each assignment. A separate test was conducted to assign a probability for whether each sequence has a novel fold, i.e., one that is not yet represented in the experimental database of known structures. Of the 2,130 predicted nontransmembrane proteins in this organism, 916 matched a fold at a cumulative 90% confidence level, and 245 could be assigned at a 99% confidence level. Likewise, 286 proteins were predicted to have a previously unobserved fold with a 90% confidence level, and 14 at a 99% confidence level. These statistically based tools are combined with homology searches against the Online Mendelian Inheritance in Man (OMIM) human genetics database and other protein databases for the selection of attractive targets for crystallographic or NMR structure determination. Results of these studies have been collated and placed at http://www.doe-mbi.ucla.edu/people/parag/PA_HOME/, the University of California, Los Angeles-Department of Energy Pyrobaculum aerophilum web site.
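The abstract does not spell out how binary hypothesis testing yields a confidence level, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes that match scores for correct and incorrect fold assignments each follow a normal distribution and that a prior probability of a correct assignment is known; the distribution parameters, the prior, and the function names are all hypothetical. The "cumulative" aspect of the reported confidence levels is not modeled. The last function simply restates the reported counts as fractions of the 2,130 predicted nontransmembrane proteins.

```python
import math

# Hypothetical score models (mean, standard deviation); not taken from the paper.
CORRECT_PARAMS = (8.0, 2.0)     # assumed scores of correct fold assignments
INCORRECT_PARAMS = (3.0, 2.0)   # assumed scores of incorrect fold assignments
PRIOR_CORRECT = 0.3             # assumed prior probability that an assignment is correct

TOTAL_PROTEINS = 2130           # predicted nontransmembrane proteins (from the abstract)
REPORTED_COUNTS = {             # counts reported in the abstract
    "fold assigned, 90%": 916,
    "fold assigned, 99%": 245,
    "novel fold, 90%": 286,
    "novel fold, 99%": 14,
}


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def assignment_confidence(score, prior=PRIOR_CORRECT,
                          correct=CORRECT_PARAMS, incorrect=INCORRECT_PARAMS):
    """Posterior probability that an assignment is correct, given its score.

    Two hypotheses are compared with Bayes' rule:
    H1 = the assignment is correct, H0 = the assignment is incorrect.
    """
    weight_correct = prior * normal_pdf(score, *correct)
    weight_incorrect = (1.0 - prior) * normal_pdf(score, *incorrect)
    return weight_correct / (weight_correct + weight_incorrect)


def reported_fractions():
    """Reported counts expressed as fractions of all nontransmembrane proteins."""
    return {label: count / TOTAL_PROTEINS for label, count in REPORTED_COUNTS.items()}


if __name__ == "__main__":
    for score in (3.0, 5.5, 8.0, 12.0):
        print(f"score {score:5.1f}: confidence {assignment_confidence(score):.4f}")
    for label, fraction in reported_fractions().items():
        print(f"{label}: {fraction:.1%}")
```

Under these assumptions, an assignment would be accepted at the 90% or 99% level when `assignment_confidence` returns at least 0.90 or 0.99. In practice the two score distributions would have to be estimated from benchmark assignments of known correctness.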