Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, New York 14642, United States.
Department of Chemistry & Biochemistry, University of Maryland, College Park, Maryland 20742, United States.
ACS Synth Biol. 2023 Sep 15;12(9):2750-2763. doi: 10.1021/acssynbio.3c00358. Epub 2023 Sep 6.
We show that design of DNA secondary structures is improved by extending the base pairing alphabet beyond A-T and G-C to include the pair between 2-amino-8-(1'-β-D-2'-deoxyribofuranosyl)-imidazo-[1,2-a]-1,3,5-triazin-(8H)-4-one and 6-amino-3-(1'-β-D-2'-deoxyribofuranosyl)-5-nitro-(1H)-pyridin-2-one, abbreviated as P and Z. To obtain the thermodynamic parameters needed to include P-Z pairs in the designs, we performed 47 optical melting experiments and combined the results with previous work to fit free energy and enthalpy nearest neighbor folding parameters for P-Z pairs and G-Z wobble pairs. We find G-Z pairs have stability comparable to that of A-T pairs and should therefore be included as base pairs in structure prediction and design algorithms. Additionally, we extrapolated the set of loop, terminal mismatch, and dangling end parameters to include the P and Z nucleotides. These parameters were incorporated into the RNAstructure software package for secondary structure prediction and analysis. Using the RNAstructure Design program, we solved 99 of the 100 design problems posed by Eterna using the ACGT alphabet or supplementing it with P-Z pairs. Extending the alphabet reduced the propensity of sequences to fold into off-target structures, as evaluated by the normalized ensemble defect (NED). The NED values were improved relative to those from the Eterna example solutions in 91 of 99 cases in which Eterna-player solutions were provided. P-Z-containing designs had average NED values of 0.040, significantly below the 0.074 of standard-DNA-only designs, and inclusion of the P-Z pairs decreased the time needed to converge on a design. This work provides a sample pipeline for inclusion of any expanded alphabet nucleotides into prediction and design workflows.
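The abstract uses the normalized ensemble defect (NED) as its measure of off-target folding. The following is a minimal Python sketch of how such a value can be computed, assuming the commonly used definition: the expected fraction of nucleotides whose pairing state differs from the target structure, averaged over the equilibrium ensemble. The function names, the dot-bracket input, and the dictionary of pair probabilities are illustrative assumptions, not the RNAstructure implementation. In practice the probabilities would come from a partition function calculation.

```python
def parse_dot_bracket(structure):
    """Return a list where entry i is the 0-based partner of nucleotide i, or None if unpaired."""
    partner = [None] * len(structure)
    stack = []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            j = stack.pop()
            partner[i], partner[j] = j, i
        elif ch != ".":
            raise ValueError("unexpected character %r" % ch)
    if stack:
        raise ValueError("unbalanced '(' in structure")
    return partner


def normalized_ensemble_defect(target, pair_prob):
    """Compute NED for a target dot-bracket structure.

    pair_prob is a hypothetical input format: a dict mapping (i, j), with
    i < j and 0-based indices, to the equilibrium probability of that pair.
    """
    partner = parse_dot_bracket(target)
    n = len(target)

    # Total probability that each nucleotide is paired with any partner.
    paired = [0.0] * n
    for (i, j), p in pair_prob.items():
        paired[i] += p
        paired[j] += p

    # Sum, over nucleotides, of the probability of being in the target state.
    correct = 0.0
    for i, j in enumerate(partner):
        if j is None:
            correct += 1.0 - paired[i]  # target says unpaired
        else:
            correct += pair_prob.get((min(i, j), max(i, j)), 0.0)

    return (n - correct) / n


if __name__ == "__main__":
    # Toy probabilities for a 6-nucleotide hairpin; these numbers are made up.
    probs = {(0, 5): 0.9, (1, 4): 0.8, (2, 3): 0.1}
    print(normalized_ensemble_defect("((..))", probs))
```

A value of 0 means every nucleotide is in its target state with probability 1, and larger values mean the sequence spends more of the ensemble in off-target structures. The averages reported above (0.040 versus 0.074) are on this scale.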