Department of Electronic Engineering, Tsinghua University, Beijing, 100084, China.
Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing, 100084, China.
Nat Commun. 2022 May 5;13(1):2468. doi: 10.1038/s41467-022-29994-y.
Deep learning is a popular method for facilitating particle picking in single-particle cryo-electron microscopy (cryo-EM), which is essential for developing automated processing pipelines. Most existing deep learning algorithms for particle picking rely on supervised learning where the features to be identified must be provided through a training procedure. However, the generalization performance of these algorithms on unseen datasets with different features is often unpredictable. In addition, while they perform well on the latest training datasets, these algorithms often fail to maintain the knowledge of old particles. Here, we report an exemplar-based continual learning approach, which can accumulate knowledge from the new dataset into the model by training an existing model on only a few new samples without catastrophic forgetting of old knowledge, implemented in a program called EPicker. Therefore, the ability of EPicker to identify bio-macromolecules can be expanded by continuously learning new knowledge during routine particle picking applications. Powered by the improved training strategy, EPicker is designed to pick not only protein particles but also general biological objects such as vesicles and fibers.
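The exemplar idea described above can be illustrated with a minimal, generic rehearsal sketch. It is not EPicker's implementation: the names `ExemplarMemory`, `build_training_set`, and `budget` are hypothetical, and random exemplar selection is an assumption made only for illustration. The sketch shows one common way to limit forgetting: keep a small, fixed-budget set of samples from earlier datasets and mix them with the few new samples whenever the existing model is fine-tuned.

```python
import random
from typing import Dict, List, Sequence


class ExemplarMemory:
    """Hypothetical fixed-budget store of a few samples per past dataset."""

    def __init__(self, budget: int, seed: int = 0) -> None:
        if budget <= 0:
            raise ValueError("budget must be positive")
        self.budget = budget
        self._rng = random.Random(seed)
        self._store: Dict[str, list] = {}

    def add_dataset(self, name: str, samples: Sequence) -> None:
        """Register a dataset, then shrink every dataset to an equal share."""
        self._store[name] = list(samples)
        # At least one exemplar per dataset is kept, so the total can exceed
        # the budget if the number of datasets grows beyond it.
        per_dataset = max(1, self.budget // len(self._store))
        for key, items in self._store.items():
            if len(items) > per_dataset:
                # Assumption: random choice stands in for a smarter
                # exemplar-selection rule.
                self._store[key] = self._rng.sample(items, per_dataset)

    def exemplars(self) -> List:
        """Return all stored exemplars from every past dataset."""
        return [s for items in self._store.values() for s in items]


def build_training_set(new_samples: Sequence, memory: ExemplarMemory) -> List:
    """Mix the few new samples with stored exemplars of old datasets."""
    return list(new_samples) + memory.exemplars()
```

In use, each finished dataset would be registered with `add_dataset`, and the list returned by `build_training_set` would be passed to whatever fine-tuning routine updates the existing model, so that old particle types remain represented in every update.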