Department of Fuel Synthesis, Joint BioEnergy Institute, 5885 Hollis St., Fourth Floor, Emeryville CA 94608, USA.
Nucleic Acids Res. 2010 May;38(8):2607-16. doi: 10.1093/nar/gkq165. Epub 2010 Mar 23.
Generating a defined set of genetic constructs within a large combinatorial space provides a powerful method for engineering novel biological functions. However, the process of assembling more than a few specific DNA sequences can be costly, time-consuming and error-prone. Even if a correct theoretical construction scheme is developed manually, it is likely to be suboptimal by any number of cost metrics. Modular, robust and formal approaches are needed for exploring these vast design spaces. By automating the design of DNA fabrication schemes using computational algorithms, we can eliminate human error while reducing redundant operations, thus minimizing the time and cost required for conducting biological engineering experiments. Here, we provide algorithms that optimize the simultaneous assembly of a collection of related DNA sequences. We compare our algorithms to an exhaustive search on a small synthetic dataset, and the results show that our algorithms can quickly find an optimal solution. Comparison with random search approaches on two real-world datasets shows that our algorithms can also quickly find lower-cost solutions for large datasets.
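To illustrate what "reducing redundant operations" can mean when several related constructs are assembled together, the following is a minimal sketch and not the algorithms of the paper. It assumes a simplified cost model that is not taken from the abstract: each construct is an ordered tuple of part names, one assembly step joins two adjacent fragments, and an intermediate that has been built once can be reused for free. The function names (`naive_cost`, `binary_plans`, `exhaustive_cost`, `greedy_cost`) and the pair-frequency greedy heuristic are hypothetical choices made only for this illustration.

```python
from collections import Counter
from itertools import product


def naive_cost(targets):
    """Steps needed if every target is built independently (no sharing)."""
    return sum(len(t) - 1 for t in targets)


def binary_plans(target):
    """Yield every binary assembly plan for `target`.

    A plan is the frozenset of intermediates it produces; an intermediate is a
    tuple of two or more adjacent parts and costs one step (assumed model).
    """
    if len(target) == 1:
        yield frozenset()
        return
    for split in range(1, len(target)):
        for left in binary_plans(target[:split]):
            for right in binary_plans(target[split:]):
                yield left | right | {target}


def exhaustive_cost(targets):
    """Fewest distinct steps over all combinations of plans (small inputs only)."""
    all_plans = [list(binary_plans(t)) for t in targets]
    return min(len(frozenset().union(*combo)) for combo in product(*all_plans))


def greedy_cost(targets):
    """Heuristic: repeatedly join the adjacent pair that occurs in the most targets."""
    # Each working sequence is a list of fragments; a fragment is a tuple of parts.
    work = [[(part,) for part in t] for t in targets]
    steps = set()
    while any(len(seq) > 1 for seq in work):
        counts = Counter()
        for seq in work:
            # set() so a pair repeated inside one target is counted once
            counts.update(set(zip(seq, seq[1:])))
        # Highest count wins; ties are broken by tuple order to stay deterministic.
        a, b = min(counts, key=lambda pair: (-counts[pair], pair))
        merged = a + b
        steps.add(merged)
        for seq in work:
            i = 0
            while i < len(seq) - 1:
                if seq[i] == a and seq[i + 1] == b:
                    seq[i:i + 2] = [merged]
                i += 1
    return len(steps)


if __name__ == "__main__":
    # Hypothetical part names, chosen only for the demonstration.
    targets = [("A", "B", "C", "D"), ("A", "B", "C", "E"), ("B", "C", "D")]
    print(naive_cost(targets), exhaustive_cost(targets), greedy_cost(targets))
```

On this toy input the exhaustive search and the greedy heuristic both need fewer steps than building each construct separately, because the shared fragment B-C is made only once. The exhaustive search grows combinatorially with the number and length of the targets, which is why a faster heuristic is of interest for larger collections; the greedy rule here carries no optimality guarantee.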