Lai Jiaying, Liu Yunzhou, Scharpf Robert B, Karchin Rachel
Institute for Computational Medicine, Johns Hopkins University, Baltimore, MD.
Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD.
ArXiv. 2024 Feb 14:arXiv:2402.09599v1.
Most neoplastic tumors originate from a single cell, and their evolution can be genetically traced through lineages characterized by common alterations such as small somatic mutations (SSMs), copy number alterations (CNAs), structural variants (SVs), and aneuploidies. Due to the complexity of these alterations in most tumors and the errors introduced by sequencing protocols and calling algorithms, tumor subclonal reconstruction algorithms are necessary to recapitulate the DNA sequence composition and tumor evolution. With a growing number of these algorithms available, there is a pressing need for consistent and comprehensive benchmarking, which relies on realistic tumor sequencing data generated by simulation tools. Here, we examine the current simulation methods, identify their strengths and weaknesses, and provide recommendations for their improvement. Our review also explores potential new directions for research in this area. This work aims to serve as a resource for understanding and enhancing tumor genomic simulations, contributing to the advancement of the field.