Ibrahim Taghreed S, Saraya M S, Saleh Ahmed I, Rabie Asmaa H
Computers and Control Dept., Faculty of Engineering, Mansoura University, Mansoura, Egypt.
Sci Rep. 2025 Apr 1;15(1):11127. doi: 10.1038/s41598-025-93059-5.
Bladder (BL) cancer is the 10th most common cancer worldwide; in the United States it ranks 9th in males and 13th in females. BL cancer is a fast-growing tumor compared with other cancer types. Because these tumors are highly malignant, rapid prediction of metastasis and accurate treatment are critical. The most significant drivers of the intricate genesis of cancer are complex genetic alterations, including deoxyribonucleic acid (DNA) insertions and deletions, structural abnormalities, copy number variations (CNVs), and single nucleotide variations (SNVs). The proposed method enhances the identification of driver genes at the individual patient level by employing attention mechanisms to extract features of both coding and non-coding genes, and it predicts BL cancer based on personalized driver gene (PDG) detection. The embedded vectors are propagated through three dense blocks for the binary classification of PDGs. The novel graph neural network (GNN) architecture with an attention mechanism, called Multi Stacked-Layered GAT (MSL-GAT), leverages graph attention mechanisms (GAT) to identify and predict critical driver genes associated with BL cancer progression. It selects and extracts essential features from both coding and non-coding genes, including long non-coding RNAs (lncRNAs), which are known to be crucial to the advancement of BL cancer. By concentrating on PDGs, the approach analyzes key genetic changes (such as SNVs, CNVs, and structural abnormalities) that lead to tumorigenesis and metastasis. The model's precise classification of PDGs makes it possible to discover genes crucial for the survival and proliferation of cancer cells. Through its attention mechanism, MSL-GAT highlights certain lncRNAs and other non-coding elements that control carcinogenic pathways.
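To illustrate how a graph attention layer weights a gene's neighbors, the following is a minimal single-head sketch in plain Python. It follows the standard graph attention formulation (a LeakyReLU score on concatenated projected features, followed by a softmax over each neighborhood) and is not the authors' MSL-GAT implementation. The gene names, the graph, the projection matrix `W`, and the attention vector `a` are all made-up assumptions; stacking, multiple heads, and the dense blocks are omitted.

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def matvec(W, h):
    """Multiply matrix W (a list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def gat_layer(features, neighbors, W, a):
    """One single-head graph attention layer.

    features:  {node: feature vector}
    neighbors: {node: list of neighbor nodes, including the node itself}
    W:         shared linear projection (out_dim x in_dim)
    a:         attention vector of length 2 * out_dim
    Returns (new_features, attention), where attention[i][j] is the
    weight that node i assigns to neighbor j.
    """
    projected = {n: matvec(W, h) for n, h in features.items()}
    new_features, attention = {}, {}
    for i, nbrs in neighbors.items():
        # Unnormalized score: LeakyReLU(a . [W h_i || W h_j])
        scores = [
            leaky_relu(sum(x * y for x, y in zip(a, projected[i] + projected[j])))
            for j in nbrs
        ]
        # Softmax over the neighborhood (shifted by the max for stability)
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        attention[i] = dict(zip(nbrs, alphas))
        # Attention-weighted sum of the projected neighbor features
        dim = len(projected[i])
        new_features[i] = [
            sum(alpha * projected[j][k] for alpha, j in zip(alphas, nbrs))
            for k in range(dim)
        ]
    return new_features, attention


# Toy gene graph; all names and numbers are hypothetical.
features = {"geneA": [1.0, 0.0], "geneB": [0.0, 1.0], "lncRNA1": [1.0, 1.0]}
neighbors = {
    "geneA": ["geneA", "lncRNA1"],
    "geneB": ["geneB", "lncRNA1"],
    "lncRNA1": ["lncRNA1", "geneA", "geneB"],
}
W = [[0.5, -0.2], [0.1, 0.3]]
a = [0.4, -0.1, 0.2, 0.3]

embeddings, attention = gat_layer(features, neighbors, W, a)
```

In this sketch, a stacked design would pass `embeddings` into further `gat_layer` calls with their own parameters, and `attention` is the quantity one would inspect to see which neighbors (for example, a lncRNA) a gene attends to most.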
The lncRNAs that MSL-GAT highlights are frequently overexpressed or dysregulated in BL cancer and facilitate tumor development, metastasis, and drug resistance. To reduce cancer cell survival, the model's predictions can guide targeted treatment approaches, such as RNA interference (RNAi), that silence or suppress the expression of these important genes. MSL-GAT is followed by three dense blocks that propagate the embedded vectors to classify PDGs, making it possible to determine which genes are more likely to cause BL cancer in a given patient. By integrating multi-omics data, encompassing genomic, transcriptomic, and epigenomic metadata, the model offers a thorough view of the molecular landscape of BL cancer and facilitates the identification of new treatment targets. We compared the approach with classical machine learning methods and other deep learning-based methods on the benchmark TCGA-BLCA dataset, and the leave-one-out experimental results showed that MSL-GAT outperformed the competing methods. The approach achieves an accuracy of 97.72% and improves specificity and sensitivity. It can potentially aid physicians in the early prediction of BL cancer.
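The evaluation protocol named above (leave-one-out, reported as accuracy, sensitivity, and specificity) can be sketched as follows. This is an illustrative outline only: the 1-nearest-neighbor classifier is a stand-in for MSL-GAT, and the feature values and labels are made-up toy data, not TCGA-BLCA.

```python
def leave_one_out(xs, ys, predict):
    """Hold out each sample once, fit on the rest, and predict the held-out one."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        preds.append(predict(train_x, train_y, xs[i]))
    return preds


def nearest_neighbor_predict(train_x, train_y, x):
    """Stand-in classifier (assumption): label of the closest training sample."""
    idx = min(range(len(train_x)), key=lambda k: abs(train_x[k] - x))
    return train_y[idx]


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity (true positive rate), specificity (true negative rate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }


# Hypothetical one-dimensional scores; label 1 = driver gene, 0 = non-driver.
xs = [0.1, 0.2, 0.3, 0.45, 0.7, 0.8, 0.9, 0.55]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

predictions = leave_one_out(xs, ys, nearest_neighbor_predict)
results = binary_metrics(ys, predictions)
```

Reporting sensitivity and specificity alongside accuracy matters for driver gene detection because the two classes are usually imbalanced, so accuracy alone can hide poor recall of true drivers.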