Department of Biological Engineering, Massachusetts Institute of Technology (MIT), Cambridge, MA, 02139, USA.
Institute for Medical Engineering and Science (IMES), MIT, Cambridge, MA, 02139, USA.
Nat Commun. 2020 Oct 7;11(1):5057. doi: 10.1038/s41467-020-18677-1.
Engineered RNA elements are programmable tools capable of detecting small molecules, proteins, and nucleic acids. Predicting the behavior of these synthetic biology components remains a challenge, a situation that could be addressed through enhanced pattern recognition from deep learning. Here, we investigate Deep Neural Networks (DNN) to predict toehold switch function as a canonical riboswitch model in synthetic biology. To facilitate DNN training, we synthesize and characterize in vivo a dataset of 91,534 toehold switches spanning 23 viral genomes and 906 human transcription factors. DNNs trained on nucleotide sequences outperform (R² = 0.43–0.70) previous state-of-the-art thermodynamic and kinetic models (R² = 0.04–0.15) and allow for human-understandable attention-visualizations (VIS4Map) to identify success and failure modes. This work shows that deep learning approaches can be used for functionality predictions and insight generation in RNA synthetic biology.