Leeds Institute of Medical Research, Faculty of Medicine and Health, University of Leeds, St James's University Hospital, Beckett Street, Leeds, West Yorkshire, LS9 7TF, UK.
School of Molecular and Cellular Biology, University of Leeds, Leeds, West Yorkshire, LS2 9JT, UK.
Nat Commun. 2021 Nov 4;12(1):6396. doi: 10.1038/s41467-021-26698-7.
Intratumour heterogeneity provides tumours with the ability to adapt and acquire treatment resistance. The development of more effective and personalised treatments for cancers, therefore, requires accurate characterisation of the clonal architecture of tumours, enabling evolutionary dynamics to be tracked. Many methods exist for achieving this from bulk tumour sequencing data, involving identifying mutations and performing subclonal deconvolution, but there is a lack of systematic benchmarking to inform researchers on which are most accurate, and how dataset characteristics impact performance. To address this, we use the most comprehensive tumour genome simulation tool available for such purposes to create 80 bulk tumour whole exome sequencing datasets of differing depths, tumour complexities, and purities, and use these to benchmark subclonal deconvolution pipelines. We conclude that i) tumour complexity does not impact accuracy, ii) increasing either purity or purity-corrected sequencing depth improves accuracy, and iii) the optimal pipeline consists of Mutect2, FACETS and PyClone-VI. We have made our benchmarking datasets publicly available for future use.
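As a rough illustration of two quantities the abstract relies on, the sketch below computes a purity-corrected sequencing depth and a cancer cell fraction (CCF) for a single mutation. It is a minimal sketch under stated assumptions, not the authors' pipeline: the abstract does not define purity-corrected depth, so it is assumed here to be depth multiplied by purity, and the CCF formula is the commonly used one for a variant with known local copy number. The function names, parameter names, and default copy numbers are hypothetical.

```python
def purity_corrected_depth(depth: float, purity: float) -> float:
    """Approximate depth attributable to tumour cells.

    ASSUMPTION: purity-corrected depth = depth * purity. The abstract does
    not give a definition, and this ignores copy-number differences between
    tumour and normal cells.
    """
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    if depth < 0:
        raise ValueError("depth must be non-negative")
    return depth * purity


def cancer_cell_fraction(
    vaf: float,
    purity: float,
    tumour_cn: int = 2,      # total copy number at the locus in tumour cells (assumed)
    normal_cn: int = 2,      # total copy number in normal cells (assumed diploid)
    multiplicity: int = 1,   # mutated copies per mutated tumour cell (assumed)
) -> float:
    """Estimate the fraction of tumour cells carrying a mutation.

    Uses the standard relation
        VAF = purity * multiplicity * CCF
              / (purity * tumour_cn + (1 - purity) * normal_cn)
    solved for CCF. Values above 1 suggest sampling noise or a wrong
    copy-number or multiplicity assumption, so they are not clipped here.
    """
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    if not 0.0 <= vaf <= 1.0:
        raise ValueError("vaf must be in [0, 1]")
    if multiplicity < 1:
        raise ValueError("multiplicity must be at least 1")
    average_cn = purity * tumour_cn + (1.0 - purity) * normal_cn
    return vaf * average_cn / (purity * multiplicity)


if __name__ == "__main__":
    # A 100x exome at 50% purity, under the assumption stated above.
    print(purity_corrected_depth(100, 0.5))
    # A heterozygous mutation in a diploid region at 80% purity.
    print(cancer_cell_fraction(vaf=0.10, purity=0.8))
```

Under these assumptions, a clonal heterozygous mutation in a diploid region is expected at a VAF of half the purity, which is why low purity or low depth makes subclones harder to separate from clonal mutations and from noise.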