Mittal Sneha, Jena Milan Kumar, Pathak Biswarup
Department of Chemistry, Indian Institute of Technology (IIT) Indore, Indore, Madhya Pradesh 453552, India
Chem Sci. 2024 Jul 8;15(31):12169-12188. doi: 10.1039/d4sc01714e. eCollection 2024 Aug 7.
The pursuit of ultra-rapid, cost-effective, and accurate DNA sequencing is a central goal in the development of personalized medicine. With recent advancements, mainstream machine learning (ML) algorithms hold immense promise for high-throughput DNA sequencing at the single-nucleotide level. While ML has revolutionized multiple domains of nanoscience and nanotechnology, its implementation in DNA sequencing is still at a preliminary stage. ML-aided DNA sequencing is especially appealing because ML has the potential to decipher intricate patterns and extract knowledge from complex datasets. Herein, we present a holistic framework for ML-aided next-generation DNA sequencing, informed by domain knowledge, to set directions toward the development of artificially intelligent DNA sequencers. This perspective focuses on the current state of the art in ML-aided DNA sequencing, exploring the opportunities as well as the future challenges in this field. In addition, we provide our personal viewpoints on the critical issues that require attention in ML-aided DNA sequencing.