Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA, 02115, USA.
Institute for Medical Engineering and Science and Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.
Nat Commun. 2020 Oct 7;11(1):5058. doi: 10.1038/s41467-020-18676-2.
While synthetic biology has revolutionized our approaches to medicine, agriculture, and energy, the design of completely novel biological circuit components beyond naturally-derived templates remains challenging due to poorly understood design rules. Toehold switches, which are programmable nucleic acid sensors, face an analogous design bottleneck; our limited understanding of how sequence impacts functionality often necessitates expensive, time-consuming screens to identify effective switches. Here, we introduce Sequence-based Toehold Optimization and Redesign Model (STORM) and Nucleic-Acid Speech (NuSpeak), two orthogonal and synergistic deep learning architectures to characterize and optimize toeholds. Applying techniques from computer vision and natural language processing, we 'un-box' our models using convolutional filters, attention maps, and in silico mutagenesis. Through transfer-learning, we redesign sub-optimal toehold sensors, even with sparse training data, experimentally validating their improved performance. This work provides sequence-to-function deep learning frameworks for toehold selection and design, augmenting our ability to construct potent biological circuit components and precision diagnostics.
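The in silico mutagenesis mentioned above can be illustrated with a minimal sketch: every single-nucleotide substitution of a sequence is scored and compared with the unmutated baseline. The names `toy_score` and `in_silico_mutagenesis` are hypothetical, and the GC-fraction scorer is only a placeholder for a trained sequence-to-function model; it is not STORM or NuSpeak and does not reflect the authors' implementation.

```python
from typing import Callable, Dict, Tuple

NUCLEOTIDES = "ACGU"  # RNA alphabet assumed for illustration


def toy_score(seq: str) -> float:
    """Hypothetical stand-in for a trained model: fraction of G/C bases."""
    return sum(base in "GC" for base in seq) / len(seq)


def in_silico_mutagenesis(
    seq: str, score_fn: Callable[[str], float]
) -> Dict[Tuple[int, str], float]:
    """Return the score change for every single-base substitution.

    Keys are (position, substituted base); values are the mutant score
    minus the score of the original sequence.
    """
    baseline = score_fn(seq)
    effects: Dict[Tuple[int, str], float] = {}
    for pos, ref in enumerate(seq):
        for alt in NUCLEOTIDES:
            if alt == ref:
                continue  # skip the unmutated base
            mutant = seq[:pos] + alt + seq[pos + 1:]
            effects[(pos, alt)] = score_fn(mutant) - baseline
    return effects


# Scan a short, made-up sequence and pick the substitution with the largest gain.
effects = in_silico_mutagenesis("AUGCAU", toy_score)
best_position, best_base = max(effects, key=effects.get)
```

With a real predictor substituted for `toy_score`, the resulting table of score changes is the kind of position-by-base map that can highlight which positions a model treats as most important.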