Department of Data Science, Dana-Farber Cancer Institute, Boston, MA, USA.
Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
Nat Biotechnol. 2022 Sep;40(9):1332-1335. doi: 10.1038/s41587-022-01261-x. Epub 2022 Mar 24.
Routine haplotype-resolved genome assembly from single samples remains an unresolved problem. Here we describe an algorithm that combines PacBio HiFi reads and Hi-C chromatin interaction data to produce a haplotype-resolved assembly without the sequencing of parents. Applied to human and other vertebrate samples, our algorithm consistently outperforms existing single-sample assembly pipelines and generates assemblies of similar quality to the best pedigree-based assemblies.