Rapid in silico directed evolution by a protein language model with EVOLVEpro.

Author information

Jiang Kaiyi, Yan Zhaoqing, Di Bernardo Matteo, Sgrizzi Samantha R, Villiger Lukas, Kayabolen Alisan, Kim B J, Carscadden Josephine K, Hiraizumi Masahiro, Nishimasu Hiroshi, Gootenberg Jonathan S, Abudayyeh Omar O

Affiliations

Department of Medicine, Division of Engineering in Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA.

Gene and Cell Therapy Institute, Mass General Brigham, Cambridge, MA, USA.

Publication information

Science. 2025 Jan 24;387(6732):eadr6006. doi: 10.1126/science.adr6006.

Abstract

Directed protein evolution is central to biomedical applications but faces challenges such as experimental complexity, inefficient multiproperty optimization, and local maxima traps. Although in silico methods that use protein language models (PLMs) can provide modeled fitness landscape guidance, they struggle to generalize across diverse protein families and map to protein activity. We present EVOLVEpro, a few-shot active learning framework that combines PLMs and regression models to rapidly improve protein activity. EVOLVEpro surpasses current methods, yielding up to 100-fold improvements in desired properties. We demonstrate its effectiveness across six proteins in RNA production, genome editing, and antibody binding applications. These results highlight the advantages of few-shot active learning with minimal experimental data over zero-shot predictions. EVOLVEpro opens new possibilities for artificial intelligence-guided protein engineering in biology and medicine.
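The few-shot active learning loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract names neither the PLM nor the regression model, so random vectors stand in for PLM embeddings, closed-form ridge regression stands in for the regression model, and a synthetic linear function stands in for the experimental assay. All function names (`fit_ridge`, `predict`, `active_learning`) are hypothetical.

```python
import numpy as np


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression (assumed stand-in for the regression model).

    A constant column is appended so the last weight acts as a bias term;
    for simplicity the bias is regularized along with the other weights.
    """
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    A = Xb.T @ Xb + alpha * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)


def predict(w, X):
    """Predict activity for each row of X using weights from fit_ridge."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ w


def active_learning(embeddings, assay, n_rounds=5, batch_size=8, seed=0):
    """Run n_rounds of measure -> fit -> pick-next-batch.

    embeddings: (n_variants, dim) array, one row per candidate variant.
    assay: callable mapping a variant index to a measured activity
           (hypothetical stand-in for a wet-lab experiment).
    Returns a dict {variant_index: measured_activity}.
    """
    rng = np.random.default_rng(seed)
    n = embeddings.shape[0]
    measured = {}
    # The first batch is chosen at random because no labels exist yet.
    picks = rng.choice(n, size=batch_size, replace=False)
    for _ in range(n_rounds):
        for i in picks:
            measured[int(i)] = assay(int(i))
        idx = np.array(sorted(measured))
        y = np.array([measured[i] for i in idx])
        w = fit_ridge(embeddings[idx], y)
        scores = predict(w, embeddings)
        scores[idx] = -np.inf  # never re-test an already measured variant
        # Greedy selection: take the variants with the highest predictions.
        picks = np.argsort(scores)[::-1][:batch_size]
    return measured


# Synthetic demo: 200 candidate variants with 16-dimensional embeddings.
rng = np.random.default_rng(1)
emb = rng.normal(size=(200, 16))
fitness = emb @ rng.normal(size=16)  # hidden "true" activity
measured = active_learning(emb, lambda i: float(fitness[i]))
best = max(measured, key=measured.get)
print(f"tested {len(measured)} of {len(emb)} variants; best index = {best}")
```

In this sketch each round tests only a small batch, refits the model on everything measured so far, and proposes the next batch from the model's top predictions, which is the general pattern the abstract contrasts with zero-shot prediction.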
