Błażej Paweł, Wnetrzak Małgorzata, Mackiewicz Dorota, Mackiewicz Paweł
Department of Bioinformatics and Genomics, Faculty of Biotechnology, University of Wrocław, ul. Joliot-Curie 14a, Wrocław, Poland.
R Soc Open Sci. 2020 Feb 5;7(2):191384. doi: 10.1098/rsos.191384. eCollection 2020 Feb.
Compounds that include non-canonical amino acids (ncAAs) or other artificially designed molecules have many applications in medicine, industry and biotechnology. They can be produced by modifying or extending the standard genetic code (SGC). Organisms with a customized genetic code can then supply peptides or proteins containing ncAAs continuously and stably. Among several methods of engineering the code, the use of non-canonical base pairs is especially promising, because it generates many new codons that can be used to encode any new amino acid. Since even one pair of new bases yields a six-letter nucleotide alphabet and thus extends the SGC to as many as 216 codons, the extension can be achieved in many ways. Here, we propose a stepwise procedure for extending the SGC with one pair of non-canonical bases that minimizes the consequences of point mutations. We describe the relationships between codons in the framework of graph theory. All 216 codons are represented as nodes of a graph, whose edges are induced by all possible single-nucleotide mutations between codons. Every set of canonical and newly added codons therefore induces a specific subgraph. We characterized the properties of the induced subgraphs generated by selected sets of codons. This allowed us to describe a procedure for incrementally adding sets of meaningful codons up to the full coding system consisting of three pairs of bases. Gradual extension of the SGC makes the whole system robust to changes in genetic information caused by mutations, and it is compatible with the view that codons and amino acids were added successively to the primordial SGC, which evolved to minimize the harmful consequences of mutations or mistranslations of encoded proteins.
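The graph construction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the letters `X` and `Y` are hypothetical labels for the non-canonical base pair, and the helper names (`codons`, `build_graph`, `induced_subgraph`, `edge_count`) are assumptions made for this example. It builds the 216-codon graph whose edges join codons that differ by a single nucleotide, then takes the subgraph induced by the 64 canonical codons.

```python
from itertools import product

CANONICAL = "ACGU"             # the four standard RNA bases
EXTRA = "XY"                   # hypothetical labels for one non-canonical base pair
ALPHABET = CANONICAL + EXTRA   # six-letter alphabet


def codons(alphabet):
    """Return all triplets over the given alphabet."""
    return ["".join(p) for p in product(alphabet, repeat=3)]


def is_point_mutation(a, b):
    """True if codons a and b differ at exactly one position."""
    return sum(x != y for x, y in zip(a, b)) == 1


def build_graph(nodes):
    """Adjacency sets of the graph whose edges are single-nucleotide changes."""
    adj = {c: set() for c in nodes}
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            if is_point_mutation(a, b):
                adj[a].add(b)
                adj[b].add(a)
    return adj


def induced_subgraph(adj, subset):
    """Keep only the chosen codons and the edges between them."""
    keep = set(subset)
    return {c: adj[c] & keep for c in keep}


def edge_count(adj):
    """Each undirected edge is stored twice, once per endpoint."""
    return sum(len(neighbours) for neighbours in adj.values()) // 2


full = build_graph(codons(ALPHABET))
canonical = induced_subgraph(full, codons(CANONICAL))

# Edges that lead from a canonical codon to a codon containing X or Y.
boundary = sum(len(full[c] - set(canonical)) for c in canonical)

print(len(full), edge_count(full))            # 216 1620
print(len(canonical), edge_count(canonical))  # 64 288
print(boundary)                               # 384
```

In the full graph every codon has 15 neighbours (3 positions times 5 alternative bases), while inside the canonical subgraph every codon has 9. The boundary count gives the number of point mutations that turn a canonical codon into one containing a new base, which is the kind of quantity a stepwise extension would aim to control.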