State Key Laboratory of Physical Chemistry of Solid Surface, Key Laboratory of Chemical Biology of Fujian Province, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, PR China.
School of Informatics, Xiamen University, Xiamen 361005, PR China.
Nucleic Acids Res. 2024 Sep 9;52(16):9407-9416. doi: 10.1093/nar/gkae652.
Precisely modulating the kinetics of toehold-mediated DNA strand displacement (TMSD) is essential for its application in DNA nanotechnology. The sequence of the toehold region significantly influences the kinetics of TMSD. However, because the many possible arrangements of bases create a large sample space and give rise to complex secondary structures, this correlation is not intuitive. Herein, machine learning was employed to reveal the relationship between the kinetics of TMSD and the toehold sequence, as well as the associated secondary structure of the invader strands. Key factors that influence the rate constant of TMSD were identified, such as the number of free hydrogen bonding sites in the invader, the number of free bases in the toehold, and the number of hydrogen bonds in intermediates. Moreover, a predictive model was constructed that achieved semi-quantitative prediction of TMSD rate constants, even for toehold sequences that differ only subtly.
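As a rough illustration of how two of the descriptors named above might be computed, the sketch below counts free bases and free hydrogen bonding sites from an invader sequence and its dot-bracket secondary structure. It is not the authors' feature pipeline: the function name `toehold_features`, the placement of the toehold-binding domain at the start of the invader string, and the Watson-Crick counts of 2 hydrogen bonds for A/T and 3 for G/C are assumptions made for this example.

```python
# Illustrative only: a hypothetical descriptor calculation, not the published method.
# Assumed Watson-Crick hydrogen bond counts per base (A/T: 2, G/C: 3).
HBONDS_PER_BASE = {"A": 2, "T": 2, "G": 3, "C": 3}


def toehold_features(invader: str, structure: str, toehold_len: int) -> dict:
    """Count free bases and free H-bonding sites of an invader strand.

    invader     -- DNA sequence of the invader strand
    structure   -- dot-bracket string of the same length ('.' = unpaired base)
    toehold_len -- number of leading bases assumed to form the toehold-binding domain
    """
    if len(invader) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    if not 0 <= toehold_len <= len(invader):
        raise ValueError("toehold_len must lie within the invader length")

    seq = invader.upper()
    # Bases marked '.' are not paired within the invader's own secondary structure.
    free_bases = [base for base, mark in zip(seq, structure) if mark == "."]
    free_in_toehold = sum(1 for mark in structure[:toehold_len] if mark == ".")

    return {
        "free_hbond_sites": sum(HBONDS_PER_BASE[base] for base in free_bases),
        "free_toehold_bases": free_in_toehold,
    }


# Hypothetical invader with a small hairpin that partly occludes the toehold domain.
features = toehold_features("GCATCGGCTC", "((....))..", toehold_len=6)
print(features)
```

Descriptors of this kind could then serve as inputs to a regression model for the rate constant; the choice of model and the full feature set are beyond what the abstract specifies.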