
Decoding Genetic Variations: Communications-Inspired Haplotype Assembly.

Author information

Puljiz Zrinka, Vikalo Haris

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2016 May-Jun;13(3):518-30. doi: 10.1109/TCBB.2015.2462367.

Abstract

High-throughput DNA sequencing technologies allow fast and affordable sequencing of individual genomes and thus enable unprecedented studies of genetic variations. Information about variations in the genome of an individual is provided by haplotypes, ordered collections of single nucleotide polymorphisms. Knowledge of haplotypes is instrumental in finding genes associated with diseases, drug development, and evolutionary studies. Haplotype assembly from high-throughput sequencing data is challenging due to errors and limited lengths of sequencing reads. The key observation made in this paper is that the minimum error-correction formulation of the haplotype assembly problem is identical to the task of deciphering a coded message received over a noisy channel, a classical problem in the mature field of communication theory. Exploiting this connection, we develop novel haplotype assembly schemes that rely on the bit-flipping and belief propagation algorithms often used in communication systems. The latter algorithm is then adapted to the haplotype assembly of polyploids. We demonstrate on both simulated and experimental data that the proposed algorithms compare favorably with state-of-the-art haplotype assembly methods in terms of accuracy, while being scalable and computationally efficient.
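To make the minimum error-correction (MEC) objective concrete, the sketch below scores a candidate diploid haplotype against a set of reads and improves it with a simple greedy bit-flipping search. It is an illustration only and not the authors' algorithm: the read encoding (0/1 alleles, `None` for uncovered sites), the function names `mismatches`, `mec_score`, and `greedy_bit_flip`, and the first-improvement flipping rule are all assumptions made for this example.

```python
from typing import List, Optional, Tuple

# Assumed encoding: one entry per SNP site, 0/1 for the observed allele,
# None where the read does not cover the site.
Read = List[Optional[int]]


def mismatches(read: Read, hap: List[int]) -> int:
    """Count covered sites where the read disagrees with the haplotype."""
    return sum(1 for r, h in zip(read, hap) if r is not None and r != h)


def mec_score(reads: List[Read], hap: List[int]) -> int:
    """MEC cost for a diploid: each read is charged against the closer
    of the haplotype and its complement."""
    comp = [1 - h for h in hap]
    return sum(min(mismatches(r, hap), mismatches(r, comp)) for r in reads)


def greedy_bit_flip(reads: List[Read], hap: List[int]) -> Tuple[List[int], int]:
    """Hypothetical local search: flip one site at a time and keep the
    flip only if it strictly lowers the MEC score."""
    hap = list(hap)
    best = mec_score(reads, hap)
    improved = True
    while improved:
        improved = False
        for j in range(len(hap)):
            hap[j] ^= 1                  # tentatively flip site j
            score = mec_score(reads, hap)
            if score < best:
                best = score
                improved = True
            else:
                hap[j] ^= 1              # undo a flip that did not help
    return hap, best


# Toy, error-free reads drawn from the haplotype pair 0011 / 1100.
reads: List[Read] = [
    [0, 0, 1, None],
    [1, 1, 0, None],
    [None, 0, 1, 1],
    [None, 1, 0, 0],
]
phased, cost = greedy_bit_flip(reads, [0, 0, 0, 0])
print(phased, cost)
```

Because a haplotype and its complement describe the same phasing, either member of the pair is an equally valid result. A greedy search of this kind can stall in a local minimum on noisier data, which is one reason message-passing decoders such as belief propagation are attractive for this problem.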

