Department of Computer and Systems Sciences, Stockholm University, Stockholm, Sweden; Science for Life Laboratory, Stockholm, Sweden.
Department of Mathematics, Qazvin Branch, Islamic Azad University, Qazvin, Iran.
Comput Biol Med. 2024 Mar;171:108234. doi: 10.1016/j.compbiomed.2024.108234. Epub 2024 Feb 29.
Breast cancer is a severe public health concern and one of the leading causes of cancer-related death in women worldwide. Although its exact cause remains unknown, several genes, and mutations within those genes, have been linked to breast cancer using sophisticated techniques. A commonly used feature for identifying driver mutations is the recurrence of a mutation across patients. However, some mutations are more likely to occur than others for various reasons, so recurrence alone is an imperfect signal. Sequencing analysis has shown that cancer-driving genes operate across complex networks, often with mutations appearing in a modular pattern. In this retrospective study, we used TCGA data gathered from breast cancer patients. We introduce a new machine-learning approach that examines gene functionality in networks derived from mutation associations, gene-gene interactions, and graph clustering. These networks uncovered crucial biological components in critical pathways, particularly those that exhibit low-frequency mutations. Evaluating the network as a whole, rather than single-gene effects alone, significantly boosts the statistical power of the study. Our method identified essential driver genes with diverse mutation frequencies, and we then explored the functions of these potential driver genes and their related pathways. By uncovering low-frequency genes, we shed light on understudied pathways associated with breast cancer. Additionally, we present a novel Monte Carlo-based algorithm to identify driver modules in breast cancer. Our findings highlight the role of these modules in critical signaling pathways and contribute to a more comprehensive understanding of breast cancer development. Materials and implementations are available at: https://github.com/MahnazHabibi/BreastCancer.
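To make the network and Monte Carlo ideas in the abstract concrete, the following is a minimal sketch and not the authors' algorithm (their implementation is in the repository linked above). It assumes a toy mapping from patients to mutated genes, builds co-mutation edge counts, and uses random gene sets of equal size to estimate how unusual a candidate module's patient coverage is. All names (`patients`, `co_mutation_edges`, `coverage`, `monte_carlo_p_value`) and the placeholder genes `G1` to `G6` are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations

# Toy data (assumption): patient id -> set of mutated genes. Not TCGA data.
patients = {
    "P1": {"G1", "G2"},
    "P2": {"G1", "G3"},
    "P3": {"G2", "G3"},
    "P4": {"G4"},
    "P5": {"G1", "G2", "G5"},
}
all_genes = {"G1", "G2", "G3", "G4", "G5", "G6"}  # G6 is never mutated


def co_mutation_edges(patients):
    """Count, per gene pair, the patients in which both genes are mutated."""
    edges = Counter()
    for genes in patients.values():
        for a, b in combinations(sorted(genes), 2):
            edges[(a, b)] += 1
    return edges


def coverage(module, patients):
    """Fraction of patients with at least one mutated gene in the module."""
    hit = sum(1 for genes in patients.values() if genes & module)
    return hit / len(patients)


def monte_carlo_p_value(module, patients, all_genes, n_samples=2000, seed=0):
    """Empirical p-value: how often a random gene set of the same size
    covers at least as many patients as the candidate module."""
    rng = random.Random(seed)
    observed = coverage(module, patients)
    pool = sorted(all_genes)
    at_least = sum(
        1
        for _ in range(n_samples)
        if coverage(set(rng.sample(pool, len(module))), patients) >= observed
    )
    # Add-one correction keeps the estimate strictly above zero.
    return (at_least + 1) / (n_samples + 1)


edges = co_mutation_edges(patients)
module = {"G1", "G2"}
module_coverage = coverage(module, patients)
p_value = monte_carlo_p_value(module, patients, all_genes)
```

Scoring a whole module by coverage illustrates why low-frequency genes can still matter: a gene mutated in few patients may raise the module's coverage even though it would not stand out on recurrence alone.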