Division of Genome Analysis Platform Development, National Cancer Center Research Institute, Tokyo, Japan.
Division of Molecular Oncology, National Cancer Center Research Institute, Tokyo, Japan.
Nucleic Acids Res. 2023 Aug 11;51(14):e74. doi: 10.1093/nar/gkad526.
We present nanomonsv, a novel software tool for detecting somatic structural variations (SVs) at single-base resolution using long-read sequencing data from a tumor and its matched control. The current version of nanomonsv includes two detection modules: the Canonical SV module and the Single breakend SV module. Using tumor/control paired long-read sequencing data from three cancer cell lines and their matched lymphoblastoid cell lines, we demonstrate that the Canonical SV module identifies somatic SVs that are also detectable by short-read technologies, with higher precision and recall than existing methods. In addition, we have developed a workflow that classifies mobile element insertions and characterizes their detailed properties, such as 5' truncations, internal inversions, and the source sites of 3' transductions. Furthermore, the Single breakend SV module enables the detection of complex SVs that can only be identified with long reads, such as SVs involving highly repetitive centromeric sequences and LINE1- and virus-mediated rearrangements. In summary, our approaches, applied to cancer long-read sequencing data, can reveal various features of somatic SVs and will lead to a better understanding of the mutational processes and functional consequences of somatic SVs.
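To make the precision and recall comparison concrete, the following is a minimal sketch of how a set of SV calls could be scored against a reference set of breakpoint pairs. It is not taken from nanomonsv or from the paper's evaluation: the `Breakpoint` type, the `matches` and `precision_recall` functions, and the position tolerance `tol` are all hypothetical names and assumptions introduced here for illustration.

```python
from typing import NamedTuple, Sequence, Tuple


class Breakpoint(NamedTuple):
    """Hypothetical SV record: one pair of breakpoint coordinates."""
    chrom1: str
    pos1: int
    chrom2: str
    pos2: int


def matches(a: Breakpoint, b: Breakpoint, tol: int) -> bool:
    """True if both breakpoints agree on chromosomes and lie within tol bases."""
    return (
        a.chrom1 == b.chrom1
        and a.chrom2 == b.chrom2
        and abs(a.pos1 - b.pos1) <= tol
        and abs(a.pos2 - b.pos2) <= tol
    )


def precision_recall(
    calls: Sequence[Breakpoint], truth: Sequence[Breakpoint], tol: int = 0
) -> Tuple[float, float]:
    """Precision = matched calls / all calls; recall = recovered truth / all truth."""
    matched_calls = sum(1 for c in calls if any(matches(c, t, tol) for t in truth))
    found_truth = sum(1 for t in truth if any(matches(c, t, tol) for c in calls))
    precision = matched_calls / len(calls) if calls else 0.0
    recall = found_truth / len(truth) if truth else 0.0
    return precision, recall


# Illustrative (made-up) coordinates, assumed for this example only.
calls = [
    Breakpoint("chr1", 100, "chr1", 500),
    Breakpoint("chr2", 1000, "chr5", 2000),
    Breakpoint("chr3", 50, "chr3", 90),
]
truth = [
    Breakpoint("chr1", 102, "chr1", 500),
    Breakpoint("chr2", 1000, "chr5", 2000),
]

exact = precision_recall(calls, truth, tol=0)
relaxed = precision_recall(calls, truth, tol=5)
```

With `tol=0` only exact, single-base matches count as true positives, while a larger `tol` also accepts calls whose breakpoints are slightly shifted. A real benchmark would additionally need to account for strand orientation and inserted sequence, which this sketch ignores.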