Myers Matthew A, Arnold Brian J, Bansal Vineet, Mullen Katelyn M, Zaccaria Simone, Raphael Benjamin J
Department of Computer Science, Princeton University, Princeton, USA.
Center for Statistics and Machine Learning, Princeton University, Princeton, USA.
bioRxiv. 2023 Jul 15:2023.07.13.548855. doi: 10.1101/2023.07.13.548855.
Multi-region DNA sequencing of primary tumors and metastases from individual patients helps identify somatic aberrations driving cancer development. However, most methods to infer copy-number aberrations (CNAs) analyze individual samples. We introduce HATCHet2 to identify haplotype- and clone-specific CNAs simultaneously from multiple bulk samples. HATCHet2 introduces a novel statistic, the mirrored haplotype B-allele frequency (mhBAF), to identify mirrored-subclonal CNAs having different numbers of copies of parental haplotypes in different tumor clones. HATCHet2 also has high accuracy in identifying focal CNAs and extends the earlier HATCHet method in several directions. We demonstrate HATCHet2's improved accuracy using simulations and a single-cell sequencing dataset. HATCHet2 analysis of 50 prostate cancer samples from 10 patients reveals previously unreported mirrored-subclonal CNAs affecting cancer genes.