FSCC: Few-Shot Learning for Macromolecule Classification Based on Contrastive Learning and Distribution Calibration in Cryo-Electron Tomography.

Author information

Gao Shan, Zeng Xiangrui, Xu Min, Zhang Fa

Affiliations

High Performance Computer Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China.

University of Chinese Academy of Sciences, Beijing, China.

Publication information

Front Mol Biosci. 2022 Jul 5;9:931949. doi: 10.3389/fmolb.2022.931949. eCollection 2022.

DOI:10.3389/fmolb.2022.931949

PMID:35865006

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9294403/

Abstract

Cryo-electron tomography (Cryo-ET) is an emerging technology for three-dimensional (3D) visualization of macromolecular structures in a near-native state. Recovering these structures requires that the millions of diverse macromolecules captured in tomograms be accurately classified into structurally homogeneous subsets. Existing supervised deep learning methods have improved classification accuracy, but the trained models have limited ability to classify novel macromolecules that were not seen during training. Adapting a trained model to a novel class requires a large number of labeled macromolecules of that class, and data labeling is time-consuming and labor-intensive. In this work, we propose FSCC, a few-shot learning method for classifying novel macromolecules. FSCC uses a two-stage training strategy to improve the model's generalization to novel macromolecules. First, FSCC pre-trains the model with contrastive learning on a sufficient number of labeled macromolecules. Second, FSCC re-trains the classifier with distribution calibration, enabling the model to classify macromolecules of novel classes (classes unseen during pre-training). Distribution calibration transfers the knowledge learned in the pre-training stage to novel classes for which only a few labeled macromolecules are available. Experiments were performed on both synthetic and real datasets. On the synthetic datasets, FSCC achieves performance competitive with the state-of-the-art (SOTA) supervised deep learning method while needing only five labeled macromolecules per novel class, whereas the SOTA method needs 1,100 to 1,500 labeled macromolecules per novel class. On the real datasets, FSCC improves accuracy by 5% to 16% over the baseline model. These results demonstrate that contrastive learning and distribution calibration generalize well to novel macromolecules when very few labeled examples are available.
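
The abstract does not spell out how the calibration step works, so the following is only a minimal sketch of one common formulation of distribution calibration, not the authors' implementation. It assumes that per-class feature means and covariances have been collected from the pre-training (base) classes, that each novel-class example borrows statistics from its nearest base classes, and that synthetic features drawn from the calibrated Gaussian are then used to re-train a classifier. The function names `calibrate` and `sample_features` and the parameters `k` and `alpha` are hypothetical.

```python
import numpy as np


def calibrate(support_feat, base_means, base_covs, k=2, alpha=0.2):
    """Estimate a Gaussian for a novel class from one labeled feature.

    support_feat: (d,) feature vector of a labeled novel-class example
    base_means:   (C, d) per-class feature means from the base classes
    base_covs:    (C, d, d) per-class feature covariances
    k, alpha:     assumed hyperparameters (neighbors used, extra spread)
    """
    # Pick the k base classes whose means lie closest to the support feature.
    dists = np.linalg.norm(base_means - support_feat, axis=1)
    nearest = np.argsort(dists)[:k]
    # Calibrated mean: average the support feature with the borrowed means.
    mean = (base_means[nearest].sum(axis=0) + support_feat) / (k + 1)
    # Calibrated covariance: average the borrowed covariances and widen the
    # diagonal by alpha (an assumption of this sketch).
    cov = base_covs[nearest].mean(axis=0) + alpha * np.eye(support_feat.shape[0])
    return mean, cov


def sample_features(support_feat, base_means, base_covs,
                    n_samples=100, rng=None, **kwargs):
    """Draw synthetic novel-class features from the calibrated Gaussian."""
    rng = np.random.default_rng() if rng is None else rng
    mean, cov = calibrate(support_feat, base_means, base_covs, **kwargs)
    return rng.multivariate_normal(mean, cov, size=n_samples)


# Toy usage: three base classes in a 4-D feature space, one novel-class shot.
demo_rng = np.random.default_rng(0)
demo_means = demo_rng.normal(size=(3, 4))
demo_covs = np.stack([0.5 * np.eye(4)] * 3)
demo_support = demo_rng.normal(size=4)
synthetic = sample_features(demo_support, demo_means, demo_covs,
                            n_samples=50, rng=demo_rng)
print(synthetic.shape)  # (50, 4)
```

In a five-shot setting, this sampling would be repeated for each of the five labeled features of a novel class, and the pooled real and synthetic features would be passed to a simple classifier such as logistic regression.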


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a02f/9294403/117cbbddea30/fmolb-09-931949-g001.jpg

Similar articles

FSCC: Few-Shot Learning for Macromolecule Classification Based on Contrastive Learning and Distribution Calibration in Cryo-Electron Tomography.

Front Mol Biosci. 2022 Jul 5;9:931949. doi: 10.3389/fmolb.2022.931949. eCollection 2022.

One-Shot Learning With Attention-Guided Segmentation in Cryo-Electron Tomography.

Front Mol Biosci. 2021 Jan 12;7:613347. doi: 10.3389/fmolb.2020.613347. eCollection 2020.

Few-shot learning for classification of novel macromolecular structures in cryo-electron tomograms.

PLoS Comput Biol. 2020 Nov 11;16(11):e1008227. doi: 10.1371/journal.pcbi.1008227. eCollection 2020 Nov.

SCL: Self-supervised contrastive learning for few-shot image classification.

Neural Netw. 2023 Aug;165:19-30. doi: 10.1016/j.neunet.2023.05.037. Epub 2023 May 24.

Few shot domain adaptation for in situ macromolecule structural classification in cryoelectron tomograms.

Bioinformatics. 2021 Apr 19;37(2):185-191. doi: 10.1093/bioinformatics/btaa671.

Macromolecules Structural Classification With a 3D Dilated Dense Network in Cryo-Electron Tomography.

IEEE/ACM Trans Comput Biol Bioinform. 2022 Jan-Feb;19(1):209-219. doi: 10.1109/TCBB.2021.3065986. Epub 2022 Feb 3.

Few-shot disease recognition algorithm based on supervised contrastive learning.

Front Plant Sci. 2024 Feb 7;15:1341831. doi: 10.3389/fpls.2024.1341831. eCollection 2024.

Self-supervised learning for macromolecular structure classification based on cryo-electron tomograms.

Front Physiol. 2022 Aug 30;13:957484. doi: 10.3389/fphys.2022.957484. eCollection 2022.

VP-Detector: A 3D multi-scale dense convolutional neural network for macromolecule localization and classification in cryo-electron tomograms.

Comput Methods Programs Biomed. 2022 Jun;221:106871. doi: 10.1016/j.cmpb.2022.106871. Epub 2022 May 11.

Boosting few-shot rare skin disease classification via self-supervision and distribution calibration.

Biomed Eng Lett. 2024 May 20;14(4):877-889. doi: 10.1007/s13534-024-00383-2. eCollection 2024 Jul.

Cited by

Few-shot classification of Cryo-ET subvolumes with deep Brownian distance covariance.

Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbae643.

An Unsupervised Classification Algorithm for Heterogeneous Cryo-EM Projection Images Based on Autoencoders.

Int J Mol Sci. 2023 May 6;24(9):8380. doi: 10.3390/ijms24098380.
