Department of Computer Science, City University of Hong Kong, Kowloon, 999077, Hong Kong.
School of Management Science and Engineering, Dongbei University of Finance and Economics, Dalian 116025, China.
Bioinformatics. 2023 Jul 1;39(7). doi: 10.1093/bioinformatics/btad422.
Chromothripsis, which is associated with poor clinical outcomes, is an important prognostic factor in multiple myeloma. This catastrophic event is reported to be detectable before multiple myeloma progresses, so chromothripsis detection can contribute to risk estimation and early treatment guidelines for multiple myeloma patients. However, the gold standard for detecting chromothripsis events remains manual diagnosis, which relies on whole-genome sequencing to retrieve both copy number variation (CNV) and structural variation data. CNV data are much easier to obtain than structural variation data. Hence, to reduce the reliance on human experts' effort and on structural variation data extraction, a reliable and accurate chromothripsis detection method based on CNV data is needed.
To address these issues, we propose a method that detects chromothripsis based solely on CNV data. Using structure learning, the intrinsic relationships among CNV features are inferred as a directed acyclic graph, from which a CNV embedding graph (i.e. CNV-DAG) is derived. A neural network based on a Graph Transformer, local feature extraction, and non-linear feature interaction then takes the embedding graph as input to determine whether a chromothripsis event has occurred. Ablation experiments, clustering, and feature importance analysis are also conducted to explain the proposed model and capture mechanistic insights.
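As a rough illustration of how a feature DAG can constrain attention, the following Python sketch checks that a small graph is acyclic and then runs one self-attention step in which each feature node attends only to itself and its parents. It is a generic sketch under stated assumptions, not the authors' implementation: the four-node graph, the random embeddings and weights, and the function names `is_dag` and `graph_masked_attention` are all hypothetical, and the abstract does not specify which structure learning algorithm or Graph Transformer variant is used.

```python
import numpy as np


def is_dag(adj: np.ndarray) -> bool:
    """Return True if the directed graph has no cycle (Kahn's algorithm).

    adj[i, j] != 0 means there is an edge i -> j.
    """
    adj = adj.astype(bool)
    indegree = adj.sum(axis=0)  # number of parents of each node
    queue = [i for i in range(len(adj)) if indegree[i] == 0]
    visited = 0
    while queue:
        node = queue.pop()
        visited += 1
        for child in np.flatnonzero(adj[node]):
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    # Every node is visited exactly once only when the graph is acyclic.
    return visited == len(adj)


def graph_masked_attention(x, adj, w_q, w_k, w_v):
    """One self-attention step restricted to the graph structure.

    Node j may attend to node i only if i == j or i is a parent of j.
    Returns the updated node embeddings and the attention weights.
    """
    n = len(adj)
    mask = adj.T.astype(bool) | np.eye(n, dtype=bool)  # mask[j, i]
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = (q @ k.T) / np.sqrt(k.shape[1])
    scores = np.where(mask, scores, -np.inf)
    # Numerically stable softmax; the diagonal keeps every row finite.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ v, weights


# Hypothetical 4-feature DAG (assumption): 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3
adj = np.zeros((4, 4), dtype=int)
adj[0, 1] = adj[0, 2] = adj[1, 3] = adj[2, 3] = 1

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))  # placeholder node embeddings, not real CNV data
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))

assert is_dag(adj)
out, weights = graph_masked_attention(x, adj, w_q, w_k, w_v)
```

In this toy graph, node 0 has no parents, so it attends only to itself, while node 3 spreads its attention over itself and nodes 1 and 2. A full model in the spirit of the abstract would add local feature extraction and non-linear feature interaction on top of such a layer.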
The source code and data are freely available at https://github.com/luvyfdawnYu/CNV_chromothripsis.