Kantardjieff Katherine A, Rupp Bernhard
Department of Chemistry and Biochemistry, California State University (CSU) Fullerton, Fullerton, California 92834-6866, USA.
Protein Sci. 2003 Sep;12(9):1865-71. doi: 10.1110/ps.0350503.
Estimating the number of molecules in the crystallographic asymmetric unit is one of the first steps in a macromolecular structure determination. Based on a survey of 15641 crystallographic Protein Data Bank (PDB) entries, the distribution of V(M), the crystal volume per unit of protein molecular weight, known as the Matthews coefficient, has been reanalyzed. The range of values and their frequencies have changed in the 30 years since Matthews' first analysis of protein crystal solvent content. In the statistical analysis, complexes of proteins and nucleic acids have been treated as a separate group. In addition, the V(M) distribution for nucleic acid crystals has been examined for the first time. Because resolution proves to be a significant discriminator of V(M), an improved estimator for the probabilities of the number of molecules in the crystallographic asymmetric unit has been implemented, using resolution as additional information.
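As a rough illustration of the quantity discussed above, the sketch below computes V(M) as the unit cell volume divided by the product of the number of symmetry operators, the number of copies in the asymmetric unit, and the molecular weight. It also applies the classic approximation for protein crystals, solvent fraction ≈ 1 − 1.23/V(M). This is not the resolution-dependent probability estimator described in the abstract. The function names and the example cell, space group multiplicity, and molecular weight are hypothetical values chosen for illustration, and the 1.23 constant assumes a typical protein partial specific volume, so it would not apply to nucleic acid crystals.

```python
def matthews_coefficient(cell_volume_A3, n_sym_ops, n_copies, mol_weight_Da):
    """V(M) in cubic angstroms per dalton for n_copies molecules per asymmetric unit."""
    return cell_volume_A3 / (n_sym_ops * n_copies * mol_weight_Da)


def solvent_fraction(vm, k=1.23):
    """Approximate solvent fraction for protein crystals (assumed constant k = 1.23)."""
    return 1.0 - k / vm


def candidate_copies(cell_volume_A3, n_sym_ops, mol_weight_Da, max_copies=4):
    """List (n, V(M), solvent fraction) for each candidate copy number n."""
    rows = []
    for n in range(1, max_copies + 1):
        vm = matthews_coefficient(cell_volume_A3, n_sym_ops, n, mol_weight_Da)
        rows.append((n, vm, solvent_fraction(vm)))
    return rows


if __name__ == "__main__":
    # Hypothetical orthorhombic cell: a, b, c in angstroms, all angles 90 degrees.
    a, b, c = 60.0, 70.0, 80.0
    volume = a * b * c   # valid only for cells with all angles equal to 90 degrees
    n_sym_ops = 4        # e.g. a space group with four symmetry operators
    mol_weight = 30000.0  # hypothetical molecular weight in daltons

    for n, vm, solv in candidate_copies(volume, n_sym_ops, mol_weight):
        print(f"n={n}  V(M)={vm:.2f} A^3/Da  solvent={solv:.1%}")
```

A copy number whose solvent fraction comes out very low or negative is physically implausible and can be ruled out. Ranking the remaining candidates by probability, as the abstract describes, would additionally require the empirical V(M) distributions and the resolution of the data.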