
Convolutions are competitive with transformers for protein sequence pretraining.

Affiliations

Microsoft Research New England, Cambridge, MA 02139, USA.

Publication Information

Cell Syst. 2024 Mar 20;15(3):286-294.e2. doi: 10.1016/j.cels.2024.01.008. Epub 2024 Feb 29.

DOI: 10.1016/j.cels.2024.01.008
PMID: 38428432
Abstract

Pretrained protein sequence language models have been shown to improve the performance of many prediction tasks and are now routinely integrated into bioinformatics tools. However, these models largely rely on the transformer architecture, which scales quadratically with sequence length in both run-time and memory. Therefore, state-of-the-art models have limitations on sequence length. To address this limitation, we investigated whether convolutional neural network (CNN) architectures, which scale linearly with sequence length, could be as effective as transformers in protein language models. With masked language model pretraining, CNNs are competitive with, and occasionally superior to, transformers across downstream applications while maintaining strong performance on sequences longer than those allowed in the current state-of-the-art transformer models. Our work suggests that computational efficiency can be improved without sacrificing performance, simply by using a CNN architecture instead of a transformer, and emphasizes the importance of disentangling pretraining task and model architecture. A record of this paper's transparent peer review process is included in the supplemental information.
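The pretraining objective the abstract describes, masked language modeling, is architecture-agnostic: a fraction of residues is hidden and the encoder (transformer or CNN) is trained to restore them. A minimal sketch of the masking step in plain Python, where the mask symbol and the 15% default are illustrative assumptions rather than this paper's exact recipe:

```python
import random

MASK = "#"  # stand-in mask symbol (assumption; real models use a dedicated <mask> token id)

def mask_sequence(seq, mask_frac=0.15, rng=None):
    """BERT-style masked-LM corruption of a protein sequence.

    Returns (corrupted sequence, set of masked positions). The pretraining
    objective is to predict the original residue at each masked position;
    nothing here depends on whether the encoder is a transformer or a CNN.
    """
    rng = rng or random.Random()
    n_mask = max(1, int(len(seq) * mask_frac))
    positions = set(rng.sample(range(len(seq)), n_mask))
    corrupted = "".join(MASK if i in positions else aa
                        for i, aa in enumerate(seq))
    return corrupted, positions
```

Because the corruption step is independent of the encoder, swapping the quadratic-cost attention layers for linear-cost convolutions changes only the architecture, not the training signal — which is the disentanglement of pretraining task and model architecture the authors emphasize.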


Similar Articles

1. Convolutions are competitive with transformers for protein sequence pretraining.
Cell Syst. 2024 Mar 20;15(3):286-294.e2. doi: 10.1016/j.cels.2024.01.008. Epub 2024 Feb 29.
2. Do it the transformer way: A comprehensive review of brain and vision transformers for autism spectrum disorder diagnosis and classification.
Comput Biol Med. 2023 Dec;167:107667. doi: 10.1016/j.compbiomed.2023.107667. Epub 2023 Nov 3.
3. Toward Foundational Deep Learning Models for Medical Imaging in the New Era of Transformer Networks.
Radiol Artif Intell. 2022 Nov 2;4(6):e210284. doi: 10.1148/ryai.210284. eCollection 2022 Nov.
4. A transformer architecture based on BERT and 2D convolutional neural network to identify DNA enhancers from sequence information.
Brief Bioinform. 2021 Sep 2;22(5). doi: 10.1093/bib/bbab005.
5. Vision Transformer-based recognition of diabetic retinopathy grade.
Med Phys. 2021 Dec;48(12):7850-7863. doi: 10.1002/mp.15312. Epub 2021 Nov 16.
6. Deep learning for mango leaf disease identification: A vision transformer perspective.
Heliyon. 2024 Aug 22;10(17):e36361. doi: 10.1016/j.heliyon.2024.e36361. eCollection 2024 Sep 15.
7. A comparative study on deep learning models for text classification of unstructured medical notes with various levels of class imbalance.
BMC Med Res Methodol. 2022 Jul 2;22(1):181. doi: 10.1186/s12874-022-01665-y.
8. Hybrid transformer-CNN model for accurate prediction of peptide hemolytic potential.
Sci Rep. 2024 Jun 20;14(1):14263. doi: 10.1038/s41598-024-63446-5.
9. Masked inverse folding with sequence transfer for protein representation learning.
Protein Eng Des Sel. 2023 Jan 21;36. doi: 10.1093/protein/gzad015.
10. ConTraNet: A hybrid network for improving the classification of EEG and EMG signals with limited training data.
Comput Biol Med. 2024 Jan;168:107649. doi: 10.1016/j.compbiomed.2023.107649. Epub 2023 Nov 2.

Cited By

1. Understanding Language Model Scaling on Protein Fitness Prediction.
bioRxiv. 2025 Jul 23:2025.04.25.650688. doi: 10.1101/2025.04.25.650688.
2. Artificial intelligence-driven computational methods for antibody design and optimization.
MAbs. 2025 Dec;17(1):2528902. doi: 10.1080/19420862.2025.2528902. Epub 2025 Jul 18.
3. Sequence modeling tools to decode the biosynthetic diversity of the human microbiome.
mSystems. 2025 Jul 22;10(7):e0033325. doi: 10.1128/msystems.00333-25. Epub 2025 Jun 30.
4. ProtMamba: a homology-aware but alignment-free protein state space model.
Bioinformatics. 2025 Jun 2;41(6). doi: 10.1093/bioinformatics/btaf348.
5. Ultrafast classical phylogenetic method beats large protein language models on variant effect prediction.
Adv Neural Inf Process Syst. 2024;37:130265-130290.
6. VenusMutHub: A systematic evaluation of protein mutation effect predictors on small-scale experimental data.
Acta Pharm Sin B. 2025 May;15(5):2454-2467. doi: 10.1016/j.apsb.2025.03.028. Epub 2025 Mar 14.
7. Multiobjective learning and design of bacteriophage specificity.
bioRxiv. 2025 May 19:2025.05.19.654895. doi: 10.1101/2025.05.19.654895.
8. AMPGen: an evolutionary information-reserved and diffusion-driven generative model for de novo design of antimicrobial peptides.
Commun Biol. 2025 May 30;8(1):839. doi: 10.1038/s42003-025-08282-7.
9. Semantical and geometrical protein encoding toward enhanced bioactivity and thermostability.
Elife. 2025 May 2;13:RP98033. doi: 10.7554/eLife.98033.
10. In-Context Learning can distort the relationship between sequence likelihoods and biological fitness.
ArXiv. 2025 Apr 23:arXiv:2504.17068v1.