

Gene Expression Analysis for Uterine Cervix and Corpus Cancer Characterization.

Affiliation

Department of Computer Engineering, Automatics and Robotics, C.I.T.I.C., University of Granada, Periodista Rafael Gómez Montero, 2, 18014 Granada, Spain.

Publication information

Genes (Basel). 2024 Feb 28;15(3):312. doi: 10.3390/genes15030312.

Abstract

The analysis of gene expression quantification data is a powerful and widely used approach in cancer research. This work provides new insights into the transcriptomic changes that occur in healthy uterine tissue compared to those in cancerous tissues and explores the differences associated with uterine cancer localizations and histological subtypes. To achieve this, RNA-Seq data from the TCGA database were preprocessed and analyzed using the KnowSeq package. Firstly, a kNN model was applied to classify uterine cervix cancer, uterine corpus cancer, and healthy uterine samples. Through variable selection, a three-gene signature was identified (, , ), achieving consistent 100% test accuracy across 20 repetitions of a 5-fold cross-validation. A supplementary similar analysis using miRNA-Seq data from the same samples identified an optimal two-gene miRNA-coding signature potentially regulating the three-gene signature previously mentioned, which attained optimal classification performance with an 82% F1-macro score. Subsequently, a kNN model was implemented for the classification of cervical cancer samples into their two main histological subtypes (adenocarcinoma and squamous cell carcinoma). A uni-gene signature () was identified, achieving 100% test accuracy through 20 repetitions of a 5-fold cross-validation and externally validated through the CGCI program. Finally, an examination of six cervical adenosquamous carcinoma (mixed) samples revealed a pattern where the gene expression value in the mixed class aligned closer to the histological subtype with lower expression, prompting a reconsideration of the diagnosis for these mixed samples. In summary, this study provides valuable insights into the molecular mechanisms of uterine cervix and corpus cancers. The newly identified gene signatures demonstrate robust predictive capabilities, guiding future research in cancer diagnosis and treatment methodologies.
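The evaluation protocol named in the abstract (a kNN classifier scored over 20 repetitions of 5-fold cross-validation, with accuracy and F1-macro as metrics) can be illustrated with a minimal sketch. This is not the authors' KnowSeq pipeline: it uses only the Python standard library, the data are synthetic, and every name in it (`knn_predict`, `stratified_folds`, `macro_f1`, `repeated_cv`, `make_synthetic`, the value k=3, and the three placeholder features) is an assumption made for illustration.

```python
import random
from collections import Counter, defaultdict


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def stratified_folds(y, n_folds, rng):
    """Split sample indices into n_folds folds, keeping class proportions similar."""
    by_class = defaultdict(list)
    for i, label in enumerate(y):
        by_class[label].append(i)
    folds = [[] for _ in range(n_folds)]
    for label in sorted(by_class):
        idx = by_class[label]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % n_folds].append(i)
    return folds


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def repeated_cv(X, y, k=3, n_folds=5, n_repeats=20, seed=0):
    """Return one (accuracy, macro-F1) pair per repetition of stratified CV."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_repeats):
        y_true, y_pred = [], []
        for fold in stratified_folds(y, n_folds, rng):
            held_out = set(fold)
            train = [i for i in range(len(y)) if i not in held_out]
            train_X = [X[i] for i in train]
            train_y = [y[i] for i in train]
            for i in fold:
                y_true.append(y[i])
                y_pred.append(knn_predict(train_X, train_y, X[i], k))
        acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
        results.append((acc, macro_f1(y_true, y_pred)))
    return results


def make_synthetic(n_per_class=30, seed=1):
    """Synthetic three-feature data (not real expression values):
    each class has one placeholder feature shifted upward."""
    rng = random.Random(seed)
    classes = ["cervix", "corpus", "healthy"]
    X, y = [], []
    for c, label in enumerate(classes):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(3)]
            row[c] += 10.0
            X.append(row)
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic()
    results = repeated_cv(X, y)
    mean_acc = sum(a for a, _ in results) / len(results)
    mean_f1 = sum(f for _, f in results) / len(results)
    print(f"mean accuracy={mean_acc:.3f}, mean macro-F1={mean_f1:.3f}")
```

Each repetition reshuffles the samples before splitting, so the 20 scores show how stable the classifier is across different partitions of the same data. The sketch covers only the evaluation loop. In a real analysis, variable selection would also need to be run inside each training fold so that no information from the held-out samples leaks into the chosen signature.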


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2ff2/10970626/68559209a97b/genes-15-00312-g001.jpg
