Department of Molecular Sciences, Macquarie University, Sydney, NSW, Australia.
Cecil H. and Ida Green Center for Reproductive Biology Sciences, University of Texas Southwestern Medical Center, Dallas, TX, USA.
Nat Rev Genet. 2022 Mar;23(3):154-168. doi: 10.1038/s41576-021-00417-w. Epub 2021 Oct 5.
Modern genome-scale methods that identify new genes, such as proteogenomics and ribosome profiling, have revealed, to the surprise of many, that overlap in genes, open reading frames and even coding sequences is widespread and functionally integrated into prokaryotic, eukaryotic and viral genomes. In parallel, the constraints that overlapping regions place on genome sequences and their evolution can be harnessed in bioengineering to build more robust synthetic strains and constructs. With a focus on overlapping protein-coding and RNA-coding genes, this Review examines their discovery, topology and biogenesis in the context of their genome biology. We highlight exciting new uses for sequence overlap to control translation, compress synthetic genetic constructs, and protect against mutation.