Department of Chemical and Biomolecular Engineering, University of California, Los Angeles, California, USA.
Department of Chemistry, The University of Chicago, Chicago, Illinois, USA.
Annu Rev Biophys. 2024 Jul;53(1):109-125. doi: 10.1146/annurev-biophys-030822-025038.
The relationship between genotype and phenotype, or the fitness landscape, is the foundation of genetic engineering and evolution. However, mapping fitness landscapes poses a major technical challenge due to the amount of quantifiable data that is required. Catalytic RNA is a special topic in the study of fitness landscapes due to its relatively small sequence space combined with its importance in synthetic biology. The combination of in vitro selection and high-throughput sequencing has recently provided empirical maps of both complete and local RNA fitness landscapes, but the astronomical size of sequence space limits purely experimental investigations. Next steps are likely to involve data-driven interpolation and extrapolation over sequence space using various machine learning techniques. We discuss recent progress in understanding RNA fitness landscapes, particularly with respect to protocells and machine representations of RNA. The confluence of technical advances may significantly impact synthetic biology in the near future.
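As a rough illustration of why sequence space outgrows experiment, the sketch below counts the RNA sequences of a given length over the four-letter alphabet (4^L) and finds the longest length that a single pool could cover exhaustively. The pool size of 10^15 molecules is an assumed order of magnitude chosen for illustration, not a figure from this review, and the function names are hypothetical.

```python
# Sketch: growth of RNA sequence space with length, under stated assumptions.

# ASSUMPTION: illustrative pool size, not a value reported in the review.
ASSUMED_POOL_SIZE = 10 ** 15


def sequence_space_size(length: int) -> int:
    """Number of distinct RNA sequences of the given length over {A, C, G, U}."""
    if length < 0:
        raise ValueError("length must be non-negative")
    return 4 ** length


def max_fully_covered_length(pool_size: int) -> int:
    """Largest length L such that 4**L does not exceed pool_size.

    This ignores sampling statistics: real coverage would need many
    copies of each sequence, so the practical limit is lower.
    """
    length = 0
    while 4 ** (length + 1) <= pool_size:
        length += 1
    return length


if __name__ == "__main__":
    for length in (10, 20, 40, 80):
        print(f"L = {length:3d}: {sequence_space_size(length):.3e} sequences")
    print("Longest fully covered length:",
          max_fully_covered_length(ASSUMED_POOL_SIZE))
```

Under this assumed pool size, exhaustive coverage stops in the mid-twenties of nucleotides. This is consistent with the abstract's point that complete landscapes are feasible only for short RNAs, while longer sequences require interpolation and extrapolation.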
Phenotype here refers to an organism's observable traits, including its morphology, structure, physiological properties, and behavior.