Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA 02115, United States.
Broad Institute of MIT and Harvard, Cambridge, MA 02142, United States.
Bioinformatics. 2024 Feb 1;40(2). doi: 10.1093/bioinformatics/btae029.
Copy-number variations (CNVs) are common genetic alterations in cancer, and their detection may impact tumor classification and therapeutic decisions. However, detection of clinically relevant large and focal CNVs remains challenging when sample material or resources are limited. This motivated us to create a software tool that infers CNVs from DNA methylation arrays, which are often generated as part of clinical routine and in research settings.
We present our R package, conumee 2.0, which combines tangent normalization, an adjustable genomic binning heuristic, and weighted circular binary segmentation to utilize DNA methylation arrays for CNV analysis while mitigating technical biases and batch effects. Segmentation results were validated in a lung squamous cell carcinoma dataset from TCGA (n = 367 samples) by comparison to segmentations derived from genotyping arrays (Pearson's correlation coefficient of 0.91). We further introduce a segmented block bootstrapping approach to detect focal alterations; in a low-grade glioma cohort from TCGA (n = 239 samples), it achieved 60.9% sensitivity and 98.6% specificity for deletions affecting CDKN2A/B, and 60.0% sensitivity and 96.9% specificity for deletions affecting RB1. Finally, our tool provides functionality to detect and summarize CNVs across large sample cohorts.
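To illustrate the general idea behind tangent normalization, the following minimal Python sketch treats each control (normal) sample as a vector of log-scale intensities across genomic bins, projects a tumor profile onto the linear span of those controls, and keeps the residual as the candidate copy-number signal. This is an assumption-laden illustration of the projection concept only, not the conumee 2.0 implementation: the function names (`orthonormal_basis`, `tangent_normalize`) and the toy data are hypothetical, and the actual package is written in R and may differ in preprocessing and fitting details.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthonormal_basis(vectors, tol=1e-12):
    """Gram-Schmidt: return an orthonormal basis for the span of `vectors`.

    Vectors that are (numerically) linear combinations of earlier ones
    are skipped.
    """
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:
            basis.append([wi / norm for wi in w])
    return basis


def tangent_normalize(tumor, normals):
    """Remove from `tumor` its projection onto the span of `normals`.

    `tumor` is one log-scale intensity profile; `normals` is a list of
    control profiles of the same length. The returned residual is
    orthogonal to every control profile.
    """
    residual = list(tumor)
    for b in orthonormal_basis(normals):
        c = dot(residual, b)  # component of the profile along this direction
        residual = [r - c * bi for r, bi in zip(residual, b)]
    return residual


if __name__ == "__main__":
    # Hypothetical toy data: two control profiles over four genomic bins.
    normals = [
        [1.0, 1.0, 1.0, 1.0],
        [0.2, -0.2, 0.2, -0.2],
    ]
    # Hypothetical tumor profile: technical pattern plus some extra signal.
    tumor = [1.3, 0.2, 1.5, 0.9]
    print(tangent_normalize(tumor, normals))
```

In this sketch, any pattern shared with the controls (for example a probe-specific bias or batch effect) is removed by the projection, so only variation that the controls cannot explain remains in the residual. A larger and more diverse control set spans more technical patterns, which is the intuition for why this kind of normalization can reduce batch effects.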
Conumee 2.0 is available under an open-source license at: https://github.com/hovestadtlab/conumee2.