Section for Structural and Synthetic Biology, Department of Infectious Disease, Faculty of Medicine, Imperial College Road, South Kensington, London SW7 2BB, United Kingdom.
Scientific Computing Department, Science and Technology Facilities Council, Research Complex at Harwell, Didcot OX11 0FA, United Kingdom.
J Struct Biol. 2020 Aug 1;211(2):107545. doi: 10.1016/j.jsb.2020.107545. Epub 2020 Jun 10.
Single particle analysis has become a key structural biology technique. Experimental images are extremely noisy, and during iterative refinement it is possible to stably incorporate noise into the reconstruction. Such "over-fitting" can lead to misinterpretation of the structure and flawed biological results. Several strategies are routinely used to prevent over-fitting, the most common being independent refinement of two sides of a split dataset. In this study, we show that over-fitting remains an issue within regions of low local signal-to-noise, despite independent refinement of half datasets. We propose a modification of the refinement process through the application of a local signal-to-noise filter: SIDESPLITTER. We show that our approach can reduce over-fitting for both idealised and experimental data while maintaining independence between the two sides of a split refinement. SIDESPLITTER refinement leads to improved density, and can also lead to improvement of the final resolution in extreme cases where datasets are prone to severe over-fitting, such as small membrane proteins.
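To make the idea of a local signal-to-noise filter concrete, the following toy sketch estimates a windowed signal-to-noise weight from two noisy "half maps" of the same one-dimensional signal and applies it to each half. It is not the SIDESPLITTER algorithm: the function names, the box window, the Wiener-style weight S/(S+N), and the noise model (each half is the signal plus independent zero-mean noise of equal variance) are all assumptions made for illustration. In particular, this sketch shares one weight map between the halves and makes no attempt to address the independence of the two sides, which the abstract identifies as a central concern.

```python
import random
from statistics import fmean


def local_snr_weights(half1, half2, window=31):
    """Return one weight in [0, 1] per sample from a windowed SNR estimate.

    Assumed model (illustrative only): half_k = signal + noise_k, where the
    two noise terms are independent, zero-mean, and of equal variance.
    """
    n = len(half1)
    r = window // 2
    weights = []
    for i in range(n):
        lo, hi = max(0, i - r), min(n, i + r + 1)
        a, b = half1[lo:hi], half2[lo:hi]
        # The mean product of the halves estimates the signal power, because
        # the independent noise terms average out of the cross term.
        signal_power = max(fmean(x * y for x, y in zip(a, b)), 0.0)
        # Half the mean squared difference estimates the noise power of one half.
        noise_power = fmean((x - y) ** 2 for x, y in zip(a, b)) / 2
        total = signal_power + noise_power
        # Wiener-style weight: near 1 where signal dominates, near 0 where noise does.
        weights.append(signal_power / total if total > 0 else 0.0)
    return weights


def apply_weights(half, weights):
    """Down-weight each sample of a half map by its local weight."""
    return [w * x for w, x in zip(weights, half)]


def make_toy_halves(n=400, sigma=0.5, seed=0):
    """Build two noisy copies of a step signal: 1.0 in the first half, 0.0 after."""
    rng = random.Random(seed)
    signal = [1.0 if i < n // 2 else 0.0 for i in range(n)]
    half1 = [s + rng.gauss(0.0, sigma) for s in signal]
    half2 = [s + rng.gauss(0.0, sigma) for s in signal]
    return half1, half2


if __name__ == "__main__":
    h1, h2 = make_toy_halves()
    w = local_snr_weights(h1, h2)
    filtered1 = apply_weights(h1, w)
    filtered2 = apply_weights(h2, w)
    print("mean weight, signal region:", round(fmean(w[:150]), 2))
    print("mean weight, empty region: ", round(fmean(w[250:]), 2))
```

In this toy, regions that contain only noise receive weights close to zero, so noise there is suppressed rather than carried forward, while regions with real signal are largely preserved.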