

ARCADE: Controllable Codon Design from Foundation Models via Activation Engineering.

Authors

Li Jiayi, Liang Litian, Du Shiyi, Tang Shijie, Lai Hong-Sheng, Kingsford Carl

Affiliation

Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, 15217, US.

Publication information

bioRxiv. 2025 Aug 23:2025.08.19.668819. doi: 10.1101/2025.08.19.668819.

Abstract

Codon sequence design is crucial for generating mRNA sequences with desired functional properties for tasks such as creating novel mRNA vaccines or gene editing therapies. Yet existing methods lack flexibility and controllability to adapt to various design objectives. We propose a novel framework, ARCADE, that enables flexible control over generated codon sequences. ARCADE is based on activation engineering and leverages inherent knowledge from pretrained genomic foundation models. Our approach extends activation engineering techniques beyond discrete feature manipulation to continuous biological metrics. Specifically, we define biologically meaningful semantic steering vectors in the model's activation space, which directly modulate continuous-valued properties such as the codon adaptation index, minimum free energy, and GC content without retraining. Experimental results demonstrate the superior performance and far greater flexibility of ARCADE compared to existing codon optimization approaches, underscoring its potential for advancing programmable biological sequence design.
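As a rough illustration of the steering idea described in the abstract, the sketch below computes GC content for a sequence and builds a toy steering vector. The abstract does not say how ARCADE constructs its vectors; the difference-of-means construction used here is a common activation-steering recipe and is an assumption, as are the function names `gc_content`, `steering_vector`, and `steer` and the scaling parameter `alpha`. The activations are plain NumPy arrays standing in for a real model's hidden states.

```python
import numpy as np


def gc_content(seq: str) -> float:
    """Fraction of nucleotides in `seq` that are G or C."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def steering_vector(acts_high: np.ndarray, acts_low: np.ndarray) -> np.ndarray:
    """Assumed construction: mean activation of sequences scoring high on a
    property minus mean activation of sequences scoring low on it.

    Both inputs have shape (n_sequences, hidden_dim).
    """
    return acts_high.mean(axis=0) - acts_low.mean(axis=0)


def steer(hidden: np.ndarray, vector: np.ndarray, alpha: float) -> np.ndarray:
    """Shift every hidden state along the steering direction.

    `alpha` is a hypothetical strength knob: larger values push further
    toward the high-property group, negative values push away from it.
    """
    return hidden + alpha * vector


# Toy stand-ins for hidden states of high-GC and low-GC sequences.
rng = np.random.default_rng(0)
acts_high = rng.normal(loc=1.0, size=(8, 16))
acts_low = rng.normal(loc=-1.0, size=(8, 16))

v = steering_vector(acts_high, acts_low)
hidden = rng.normal(size=(5, 16))      # (positions, hidden_dim)
steered = steer(hidden, v, alpha=0.5)  # same shape as `hidden`
```

In a real setting the shift would be applied inside the model's forward pass at a chosen layer, and a continuous `alpha` is what would allow graded rather than on/off control of a metric such as GC content.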

