Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA.
Department of Computer Science & Software Engineering, University of Western Australia, Western Australia, Australia.
Nucleic Acids Res. 2023 Apr 24;51(7):e40. doi: 10.1093/nar/gkad097.
An RNA design algorithm takes a target RNA structure and finds a sequence that folds into that structure. This capability is fundamentally important for engineering RNA therapeutics. Computational RNA design algorithms are guided by fitness functions, but little research has been done on the merits of these functions. We survey current RNA design approaches with a particular focus on the fitness functions used. We experimentally compare the most widely used fitness functions in RNA design algorithms on both synthetic and natural sequences. It has been almost 20 years since the last comparison was published; our results are similar, with one major new finding: maximizing probability outperforms minimizing ensemble defect. The probability is the likelihood of a structure at equilibrium, and the ensemble defect is the weighted average number of incorrect positions in the ensemble. We find that maximizing probability leads to better results on synthetic RNA design puzzles and agrees more often than other fitness functions with natural sequences and structures, which were designed by evolution. We also observe that many recently published approaches minimize structure distance to the minimum free energy prediction, which we find to be a poor fitness function.
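The three fitness functions named in the abstract can be illustrated on a toy ensemble. The sketch below is not the authors' code: it assumes a hypothetical, hand-written set of three dot-bracket structures with made-up free energies (in kcal/mol) standing in for the full ensemble that a folding package would enumerate. It assumes Boltzmann weighting at 37 °C for the equilibrium probability, and it counts an "incorrect position" as a nucleotide whose pairing partner (or unpaired state) differs from the target. All function names are illustrative.

```python
import math

# Thermal energy RT in kcal/mol at 37 C (gas constant 1.9872e-3 kcal/(mol K)).
RT = 0.0019872 * 310.15


def pair_table(dot_bracket):
    """Return each position's pairing partner, or -1 if it is unpaired."""
    partners = [-1] * len(dot_bracket)
    stack = []
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            j = stack.pop()
            partners[i], partners[j] = j, i
    return partners


def boltzmann_probabilities(energies):
    """Map each structure to its equilibrium probability exp(-E/RT) / Z."""
    weights = {s: math.exp(-e / RT) for s, e in energies.items()}
    z = sum(weights.values())  # partition function over the toy ensemble
    return {s: w / z for s, w in weights.items()}


def incorrect_positions(structure, target):
    """Count positions whose partner differs between structure and target."""
    return sum(1 for a, b in zip(pair_table(structure), pair_table(target)) if a != b)


def ensemble_defect(probs, target):
    """Probability-weighted average number of incorrect positions."""
    return sum(p * incorrect_positions(s, target) for s, p in probs.items())


def mfe_distance(energies, target):
    """Incorrect positions between the lowest-energy structure and the target."""
    mfe = min(energies, key=energies.get)
    return incorrect_positions(mfe, target)


# Hypothetical ensemble for one candidate sequence (energies are invented).
toy_energies = {
    "((...))": -3.0,
    "(.....)": -1.0,
    ".......": 0.0,
}
target = "((...))"

probs = boltzmann_probabilities(toy_energies)
p_target = probs[target]                 # fitness to maximize
defect = ensemble_defect(probs, target)  # fitness to minimize
dist = mfe_distance(toy_energies, target)

print(f"probability={p_target:.3f} defect={defect:.3f} mfe_distance={dist}")
```

In this toy case the target is also the minimum free energy structure, so the MFE distance is zero, while the probability and the ensemble defect still reflect the competing structures. A real comparison would use a thermodynamic folding package to compute these quantities over the complete ensemble.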