Department of Computer Science, Valdosta State University, 1500 N Patterson St, Valdosta, GA 31698, USA.
Department of Computer Science, Old Dominion University, 5115 Hampton Blvd, Norfolk, VA 23529, USA.
Molecules. 2020 Mar 28;25(7):1540. doi: 10.3390/molecules25071540.
As more protein atomic structures are determined from cryo-electron microscopy (cryo-EM) density maps, validation of such structures has become an important task. We applied a histogram-based outlier score (HBOS) to six sets of cryo-EM atomic structures and five sets of X-ray atomic structures, including one derived from X-ray data with better than 1.5 Å resolution. The cryo-EM data sets contain structures released by December 2016 and those released between 2017 and 2019, derived from the resolution ranges 0-4 Å and 4-6 Å, respectively. The distribution of HBOS values in the five sets of X-ray structures shows that HBOS is sensitive enough to distinguish sets of X-ray structures derived from different resolution ranges: better than 1.5 Å, 1.5-2.0 Å, 2.0-2.5 Å, 2.5-3.0 Å, and 3.0-3.5 Å. The overall quality of cryo-EM structures has likely improved, as shown by a comparison of cryo-EM structures released before the end of 2016, those released between 2017 and 2018, and those released between 2018 and 2019. Our investigation shows that, in the cryo-EM data sets, leucine (LEU) has a significantly higher rate of HBOS outliers than it does in the reference data set (X-ray-1.5) and than other residue types do. HBOS was able to detect outliers among residues that are currently marked as green in PDB validation reports. The HBOS profile of a data set is a potential method to characterize the overall structural quality of the set. Residue LEU deserves special attention since it has a significantly higher HBOS outlier rate in sets of cryo-EM structures and in X-ray structures derived from X-ray data of lower than 2.5 Å resolution. For most residue types, most HBOS outlier residues in the EM-0-4-2019 set are located on loops.
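For readers unfamiliar with HBOS, the following is a minimal sketch of the general technique in its static bin-width form: one histogram is built per feature, each histogram is normalized so that its tallest bin has height 1, and a sample's score is the sum over features of the log of the inverse bin height. The abstract does not state which per-residue features, bin counts, or outlier thresholds were used, so the function name `hbos_scores`, the `n_bins` default, and the toy input below are illustrative assumptions and not the authors' implementation.

```python
import numpy as np


def hbos_scores(X, n_bins=10, eps=1e-9):
    """Static bin-width HBOS sketch: higher score = more outlying sample.

    X is an (n_samples, n_features) array of numeric features.
    n_bins and eps are illustrative defaults, not values from the paper.
    """
    X = np.asarray(X, dtype=float)
    n_samples, n_features = X.shape
    scores = np.zeros(n_samples)
    for j in range(n_features):
        col = X[:, j]
        counts, edges = np.histogram(col, bins=n_bins)
        # Normalize so the tallest bin of this feature has height 1.0.
        heights = counts / counts.max()
        # Assign each value to its bin using the interior edges,
        # which yields indices in the range 0 .. n_bins - 1.
        idx = np.digitize(col, edges[1:-1], right=False)
        # Rare bins (small height) contribute a large positive term.
        scores += np.log(1.0 / (heights[idx] + eps))
    return scores


if __name__ == "__main__":
    # Toy data (hypothetical): 100 tightly clustered samples plus one far away.
    inliers = np.column_stack([np.linspace(0.0, 0.9, 100),
                               np.linspace(0.0, 0.9, 100)])
    data = np.vstack([inliers, [[10.0, 10.0]]])
    s = hbos_scores(data)
    print("index of highest HBOS score:", int(np.argmax(s)))
```

In a study like this one, each row would correspond to a residue and each column to a geometric feature of that residue; flagging outliers then amounts to choosing a cutoff on the score, which is a separate design decision not covered by this sketch.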