TransMHCII: a novel MHC-II binding prediction model built using a protein language model and an image classifier.

Authors

Yu Xin, Negron Christopher, Huang Lili, Veldman Geertruida

Affiliation

Biotherapeutics Discovery, AbbVie Bioresearch Center, 100 Research Drive, Worcester, MA 01605, USA.

Publication information

Antib Ther. 2023 May 14;6(2):137-146. doi: 10.1093/abt/tbad011. eCollection 2023 Apr.

Abstract

The emergence of deep learning models such as AlphaFold2 has revolutionized the structure prediction of proteins. Nevertheless, much remains unexplored, especially on how we utilize structure models to predict biological properties. Herein, we present a method using features extracted from protein language models (PLMs) to predict the major histocompatibility complex class II (MHC-II) binding affinity of peptides. Specifically, we evaluated a novel transfer learning approach where the backbone of our model was interchanged with architectures designed for image classification tasks. Features extracted from several PLMs (ESM1b, ProtXLNet or ProtT5-XL-UniRef) were passed into image models (EfficientNet v2b0, EfficientNet v2m or ViT-16). The optimal pairing of the PLM and image classifier resulted in the final model TransMHCII, outperforming NetMHCIIpan 3.2 and NetMHCIIpan 4.0-BA on the receiver operating characteristic area under the curve, balanced accuracy and Jaccard scores. The architecture innovation may facilitate the development of other deep learning models for biological problems.
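
The three evaluation metrics named in the abstract (receiver operating characteristic area under the curve, balanced accuracy and Jaccard score) can be illustrated with a minimal pure-Python sketch. This is not the authors' evaluation code: the labels, scores and the 0.5 decision threshold below are invented for illustration, and the function names are hypothetical.

```python
from typing import Sequence, Tuple


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC computed as the fraction of (binder, non-binder) pairs
    in which the binder receives the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def confusion(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of sensitivity and specificity, robust to class imbalance."""
    tp, fp, tn, fn = confusion(y_true, y_pred)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def jaccard(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Jaccard score for the positive class: tp / (tp + fp + fn)."""
    tp, fp, _tn, fn = confusion(y_true, y_pred)
    if tp + fp + fn == 0:
        raise ValueError("no positive labels or predictions")
    return tp / (tp + fp + fn)


# Invented example: 1 = binder, 0 = non-binder; scores are hypothetical model outputs.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.7, 0.2, 0.1, 0.4, 0.05]
THRESHOLD = 0.5  # assumed cut-off, not taken from the paper
y_pred = [1 if s >= THRESHOLD else 0 for s in scores]

print("ROC AUC:", roc_auc(y_true, scores))
print("Balanced accuracy:", balanced_accuracy(y_true, y_pred))
print("Jaccard:", jaccard(y_true, y_pred))
```

ROC AUC is computed from the raw scores and does not depend on the threshold, whereas balanced accuracy and the Jaccard score are computed from thresholded predictions, so a comparison between models on those two metrics also depends on how the cut-off is chosen.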
