Institute for Medical Biometry and Bioinformatics, Medical Faculty and University Hospital Düsseldorf, Heinrich Heine University Düsseldorf, Düsseldorf, Germany.
Center for Digital Medicine, Heinrich Heine University Düsseldorf, Düsseldorf, Germany.
Genome Biol. 2024 Oct 10;25(1):265. doi: 10.1186/s13059-024-03409-1.
Haplotype information is crucial for biomedical and population genetics research. However, current strategies to produce de novo haplotype-resolved assemblies often require either difficult-to-acquire parental data or an intermediate haplotype-collapsed assembly. Here, we present Graphasing, a workflow which synthesizes the global phase signal of Strand-seq with assembly graph topology to produce chromosome-scale de novo haplotypes for diploid genomes. Graphasing readily integrates with any assembly workflow that both outputs an assembly graph and has a haplotype assembly mode. Graphasing performs comparably to trio phasing in contiguity, phasing accuracy, and assembly quality, outperforms Hi-C in phasing accuracy, and generates human assemblies with over 18 chromosome-spanning haplotypes.