Department of Electrical and Computer Engineering, Democritus University of Thrace, Xanthi 67100 Greece.
Department of Electrical and Computer Engineering, Democritus University of Thrace, Xanthi 67100 Greece; National Centre for Scientific Research Demokritos, Athens 15342 Greece.
Comput Biol Chem. 2023 Dec;107:107959. doi: 10.1016/j.compbiolchem.2023.107959. Epub 2023 Sep 14.
Reference-guided DNA sequencing and alignment is an important process in computational molecular biology. The amount of DNA data is growing very fast: many new genomes are waiting to be sequenced, while millions of private genomes need to be re-sequenced. Each human genome has 3.2 billion base pairs, and each base pair can be stored with 2 bits of information, so one human genome takes 6.4 billion bits, or ∼760 MB of storage (National Institute of General Medical Sciences, n.d.). Today's most powerful tensor processing units cannot handle this volume of DNA data, which necessitates a major leap in computing power. It is therefore important to investigate the usefulness of quantum computers in genomic data analysis, especially in DNA sequence alignment. Quantum computers are expected to be involved in DNA sequencing, initially as parts of classical systems, acting as quantum accelerators. The number of available qubits is increasing annually, and future quantum computers could conduct DNA sequencing, taking the place of classical computing systems. We present a novel quantum algorithm for reference-guided DNA sequence alignment, modeled with gate-based quantum computing. The algorithm is scalable, can be integrated into existing classical DNA sequencing systems, and is intentionally structured to limit computational errors. The quantum algorithm has been tested using the quantum processing units and simulators provided by IBM Quantum, and its correctness has been confirmed.
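The storage estimate in the abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration and is not taken from the paper: the base-to-bit mapping in `ENCODING` and the helper names `pack_bases` and `genome_storage` are assumptions made here. It shows that 3.2 billion base pairs at 2 bits each give 6.4 billion bits, which is 800 million bytes, or roughly 763 MiB, consistent with the ∼760 MB figure if that figure is read in binary megabytes.

```python
# Hypothetical 2-bit encoding of the four bases. This particular mapping is an
# assumption for illustration; the abstract only states "2 bits" per base pair.
ENCODING = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}


def pack_bases(seq: str) -> bytes:
    """Pack a DNA string into bytes, 4 bases per byte (last byte zero-padded)."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | ENCODING[base]
        # Left-align a short final chunk so every byte holds 4 two-bit slots.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


def genome_storage(base_pairs: int, bits_per_base: int = 2):
    """Return (bits, bytes, MiB) needed to store `base_pairs` bases."""
    bits = base_pairs * bits_per_base
    n_bytes = bits // 8
    return bits, n_bytes, n_bytes / 2**20


bits, n_bytes, mib = genome_storage(3_200_000_000)
print(f"{bits:,} bits = {n_bytes:,} bytes = {mib:.1f} MiB")
```

Packing four bases per byte is the simplest layout that reaches the 2-bit-per-base bound; real sequencing formats also store quality scores and metadata, so actual files are larger than this lower bound.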