

CELL-E 2: Translating Proteins to Pictures and Back with a Bidirectional Text-to-Image Transformer.

Authors

Khwaja Emaad, Song Yun S, Agarunov Aaron, Huang Bo

Affiliations

UC Berkeley - UCSF Joint Bioengineering Graduate Program.

Computer Science Division, UC Berkeley, CA 94720.

Publication

Adv Neural Inf Process Syst. 2023 Dec;36:4899-4914.

Abstract

We present CELL-E 2, a novel bidirectional transformer that can generate images depicting protein subcellular localization from amino acid sequences (and vice versa). Protein localization is a challenging problem that requires integrating sequence and image information, which most existing methods ignore. CELL-E 2 extends the work of CELL-E: it not only captures the spatial complexity of protein localization and produces probability estimates of localization atop a nucleus image, but can also generate sequences from images, enabling protein design. We train and finetune CELL-E 2 on two large-scale datasets of human proteins. We also demonstrate how to use CELL-E 2 to create hundreds of novel nuclear localization signals (NLS). Results and interactive demos are featured at https://bohuanglab.github.io/CELL-E_2/.




