The Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, 166 10 Prague, Czech Republic.
Department of Low-Temperature Physics, Faculty of Mathematics and Physics, Charles University in Prague, 180 00 Prague, Czech Republic.
Molecules. 2021 Mar 17;26(6):1671. doi: 10.3390/molecules26061671.
Methods of artificial evolution such as SELEX and in vitro selection have made it possible to isolate RNA and DNA motifs with a wide range of functions from large random sequence libraries. Once the primary sequence of a functional motif is known, the sequence space around it can be comprehensively explored using a combination of random mutagenesis and selection. However, methods to explore the sequence space of a secondary structure are not as well characterized. Here we address this question by describing a method to construct, in a single synthesis, libraries that are enriched for sequences with the potential to form a specific secondary structure, such as that of an aptamer, ribozyme, or deoxyribozyme. Although interactions such as base pairs cannot be encoded in a library using conventional DNA synthesizers, it is possible to modulate the probability that two positions will have the potential to pair by biasing the nucleotide composition at these positions. Here we show how to maximize this probability for each of the possible ways to encode a pair (defined in this study as A-U, U-A, C-G, G-C, G·U, or U·G). We then use these optimized coding schemes to calculate the number of different variants of model stems and secondary structures expected to occur in a library, for a series of structures in which the number of pairs and the extent of conservation of unpaired positions are systematically varied. Our calculations reveal a tradeoff between maximizing the probability of forming a pair and maximizing the number of possible variants of a desired secondary structure that can occur in the library. They also indicate that the optimal coding strategy for a library depends on the complexity of the motif being characterized. Because this approach provides a simple way to generate libraries enriched for sequences with the potential to form a specific secondary structure, we anticipate that it should be useful for the optimization and structural characterization of functional nucleic acid motifs.
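The tradeoff described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`pair_probability`, `pair_variants`, `stem_probability`) and the example compositions are hypothetical, and it assumes that the nucleotides at the two positions are drawn independently and that the pairs of a stem are independent of one another. Only the set of allowed pairs is taken from the abstract.

```python
# Pairs as defined in the abstract: Watson-Crick pairs plus G.U wobble pairs.
PAIRS = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C"), ("G", "U"), ("U", "G")}


def pair_probability(p, q):
    """Probability that nucleotides drawn independently from the
    compositions p and q (dicts mapping nucleotide -> frequency) can pair."""
    return sum(p.get(a, 0.0) * q.get(b, 0.0) for a, b in PAIRS)


def pair_variants(p, q):
    """Distinct pairs that can occur at all under compositions p and q."""
    return sorted((a, b) for a, b in PAIRS if p.get(a, 0.0) > 0 and q.get(b, 0.0) > 0)


def stem_probability(p, q, n_pairs):
    """Probability that all n_pairs positions of a stem can pair, assuming
    every pair uses the same compositions and pairs are independent."""
    return pair_probability(p, q) ** n_pairs


# Illustrative compositions (assumptions, not the paper's optimized schemes).
uniform = {n: 0.25 for n in "ACGU"}   # fully random position
fixed_g = {"G": 1.0}                  # position fixed as G
c_or_u = {"C": 0.5, "U": 0.5}         # equal mix of C and U

print(pair_probability(uniform, uniform), len(pair_variants(uniform, uniform)))
print(pair_probability(fixed_g, c_or_u), len(pair_variants(fixed_g, c_or_u)))
print(stem_probability(uniform, uniform, 4))
```

Under these assumptions, two fully random positions can form all six pairs but do so with probability 6/16 = 0.375, whereas fixing one position as G and mixing C and U at the other guarantees a pair but allows only two variants. The probability that a whole stem can form falls geometrically with the number of pairs, which is why biasing the composition matters more for longer stems.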