Henglin Mir, Ghareghani Maryam, Harvey William, Porubsky David, Koren Sergey, Eichler Evan E, Ebert Peter, Marschall Tobias
Institute for Medical Biometry and Bioinformatics, Medical Faculty and University Hospital Düsseldorf, Heinrich Heine University Düsseldorf, Germany.
Center for Digital Medicine, Heinrich Heine University Düsseldorf, Germany.
bioRxiv. 2024 Jun 20:2024.02.15.580432. doi: 10.1101/2024.02.15.580432.
Haplotype information is crucial for biomedical and population genetics research. However, current strategies to produce haplotype-resolved assemblies often require either difficult-to-acquire parental data or an intermediate haplotype-collapsed assembly. Here, we present Graphasing, a workflow which synthesizes the global phase signal of Strand-seq with assembly graph topology to produce chromosome-scale haplotypes for diploid genomes. Graphasing readily integrates with any assembly workflow that both outputs an assembly graph and has a haplotype assembly mode. Graphasing performs comparably to trio-phasing in contiguity, phasing accuracy, and assembly quality, outperforms Hi-C in phasing accuracy, and generates human assemblies with over 18 chromosome-spanning haplotypes.