Pareto-optimal sampling for multi-objective protein sequence design.

Author information

Luo Jiaqi, Ding Kerr, Luo Yunan

Affiliation

School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA 30308, USA.

Publication information

iScience. 2025 Feb 27;28(3):112119. doi: 10.1016/j.isci.2025.112119. eCollection 2025 Mar 21.

Abstract

Supervised machine learning (ML) has significantly advanced sequence-based protein property prediction. However, its inverse application, designing protein sequences with desired properties, remains under-explored. The challenges in sequence design stem from the vast search space and the rugged protein fitness landscape. In this work, we present MosPro, an efficient ML algorithm for property-guided protein sequence design. We frame sequence design as a discrete sampling problem. Utilizing a pre-trained differentiable ML model that predicts properties of sequences, MosPro shapes a distribution that assigns high probability mass to regions for high-property sequences. To generate designs, MosPro efficiently samples sequences from this constructed distribution. We further develop a Pareto optimization algorithm to propose sequences that are simultaneously optimized for multiple properties. Evaluations on experimental fitness landscapes demonstrated that MosPro generates sequences that optimally trade off multiple desiderata. Our results suggested an unparalleled potential of generative ML for efficient and controllable design for functional proteins.
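
The notion of sequences that "optimally trade off multiple desiderata" can be illustrated with a generic non-dominated (Pareto front) filter. The following is a minimal sketch, not MosPro's algorithm: the candidate sequences, their score values, and the function names `dominates` and `pareto_front` are hypothetical, and every property is assumed to be maximized.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is at least as good as `b` on every objective
    and strictly better on at least one (all objectives maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Return the indices of score vectors that no other vector dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical candidates with made-up predicted scores (property A, property B).
candidates = ["MKTA", "MKTV", "MRTA", "MKSA"]
scores = [(0.9, 0.2), (0.5, 0.5), (0.4, 0.4), (0.1, 0.8)]

# Keep only the candidates that are not dominated by any other candidate.
front = [candidates[i] for i in pareto_front(scores)]
print(front)
```

Here "MRTA" is dropped because "MKTV" scores higher on both properties, while the other three each represent a different trade-off and are kept. The pairwise comparison is quadratic in the number of candidates, which is acceptable for small candidate pools.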

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3883/11952807/f6e0e16ab580/fx1.jpg
