Dipartimento Interateneo di Fisica M. Merlin, Università degli Studi di Bari Aldo Moro, Via G. Amendola 173, 70125 Bari, Italy.
Sezione di Bari, Istituto Nazionale di Fisica Nucleare (INFN), Via A. Orabona 4, 70125 Bari, Italy.
Int J Mol Sci. 2023 Oct 18;24(20):15286. doi: 10.3390/ijms242015286.
Hepatocellular carcinoma (HCC) is one of the most common cancers worldwide, and the number of cases continues to rise. Early and accurate diagnosis of HCC is crucial to improving treatment effectiveness. This study aims to develop a supervised learning framework, based on hierarchical community detection and artificial intelligence, that classifies patients and controls using publicly available microarray data. With this methodology, we identified 20 gene communities that discriminated between healthy and cancerous samples with an accuracy exceeding 90%. We validated the performance of these communities on an independent dataset, where two of them reached an accuracy exceeding 80%. We then focused on two communities, selected because they were enriched in relevant biological functions, and applied an explainable artificial intelligence (XAI) approach to them to analyze the contribution of each gene to the classification task. In conclusion, the proposed framework provides an effective methodological and quantitative tool for finding gene communities that may uncover pivotal mechanisms responsible for HCC and thus lead to the discovery of new biomarkers.
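The general shape of such a pipeline (group genes into communities, train one classifier per community, score it on held-out samples, then attribute the prediction to individual genes) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the study's actual method: the data are synthetic, connected components of a thresholded correlation graph stand in for hierarchical community detection, a nearest-centroid rule stands in for the classifier, and permutation importance stands in for the XAI step. All function names, the 0.6 threshold, and the split sizes are hypothetical.

```python
import numpy as np

def make_toy_expression(n_samples=120, n_genes=30, seed=0):
    """Synthetic stand-in for a microarray matrix (samples x genes)."""
    rng = np.random.default_rng(seed)
    y = np.repeat([0, 1], n_samples // 2)  # 0 = control, 1 = tumour
    X = rng.normal(size=(n_samples, n_genes))
    # Genes 0-4: co-expressed module whose level shifts with the label.
    latent_a = rng.normal(size=n_samples) + 3.0 * y
    X[:, 0:5] = latent_a[:, None] + 0.5 * rng.normal(size=(n_samples, 5))
    # Genes 5-9: co-expressed module unrelated to the label.
    latent_b = rng.normal(size=n_samples)
    X[:, 5:10] = latent_b[:, None] + 0.5 * rng.normal(size=(n_samples, 5))
    return X, y

def correlation_communities(X, threshold=0.6, min_size=2):
    """Simplified community detection: connected components of the graph
    whose edges link genes with |Pearson r| >= threshold."""
    adj = np.abs(np.corrcoef(X, rowvar=False)) >= threshold
    np.fill_diagonal(adj, False)
    seen = np.zeros(adj.shape[0], dtype=bool)
    communities = []
    for start in range(adj.shape[0]):
        if seen[start]:
            continue
        stack, members = [start], []
        seen[start] = True
        while stack:  # depth-first traversal of one component
            g = stack.pop()
            members.append(int(g))
            for nb in np.flatnonzero(adj[g]):
                if not seen[nb]:
                    seen[nb] = True
                    stack.append(nb)
        if len(members) >= min_size:
            communities.append(sorted(members))
    return communities

def fit_centroids(X, y):
    """Nearest-centroid classifier: one mean expression profile per class."""
    return np.stack([X[y == c].mean(axis=0) for c in (0, 1)])

def accuracy(centroids, X, y):
    dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float((dist.argmin(axis=1) == y).mean())

def permutation_importance(centroids, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when one gene's values are shuffled."""
    rng = np.random.default_rng(seed)
    base = accuracy(centroids, X, y)
    drops = []
    for j in range(X.shape[1]):
        scores = []
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])
            scores.append(accuracy(centroids, Xp, y))
        drops.append(base - float(np.mean(scores)))
    return drops

X, y = make_toy_expression()
idx = np.random.default_rng(1).permutation(len(y))
train, test = idx[:80], idx[80:]  # held-out split mimics independent validation

communities = correlation_communities(X[train])
results, importances = {}, {}
for genes in communities:
    cents = fit_centroids(X[train][:, genes], y[train])
    results[tuple(genes)] = accuracy(cents, X[test][:, genes], y[test])
    importances[tuple(genes)] = permutation_importance(
        cents, X[test][:, genes], y[test])

for genes, acc in results.items():
    print(f"community {genes}: held-out accuracy = {acc:.2f}")
```

In this toy setting, only the community built from label-dependent genes should classify well on the held-out samples, which mirrors the idea of ranking communities by discriminative accuracy before interpreting individual genes. With real data, a dedicated community detection algorithm and an established attribution method would replace the simplified stand-ins.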