Anthropology and Health Informatics Laboratory, Department of Bioinformatics, Bharathiar University, Coimbatore, Tamil Nadu, India.
Mamm Genome. 2024 Dec;35(4):683-710. doi: 10.1007/s00335-024-10060-5. Epub 2024 Aug 17.
Prostate cancer (PCa) ranks as the second leading cause of cancer-related deaths in men. Diagnosing PCa relies on molecular markers known as diagnostic biomarkers, while prognostic biomarkers are used to identify key proteins involved in PCa treatments. This study aims to gather PCa-associated genes and assess their potential as either diagnostic or prognostic biomarkers for PCa. A corpus of 152,064 PCa-related records from PubMed, spanning May 1936 to December 2020, was compiled. Additionally, 4199 genes associated with PCa terms were collected from the National Center for Biotechnology Information (NCBI) database. The PubMed corpus was mined with pubmed.mineR to identify PCa-associated genes. Network and pathway analyses were conducted using STRING, DAVID, KEGG, MCODE 2.0, the cytoHubba app, CluePedia, and the ClueGO app. Significant marker genes were identified using Random Forest, Support Vector Machine, and Neural Network algorithms, together with the Cox Proportional Hazards model. This study reports 3062 unique PCa-associated genes along with 2518 corresponding unique PMIDs. Diagnostic markers such as IL6, MAPK3, JUN, FOS, ACTB, MYC, and TGFB1 were identified, while prognostic markers such as ACTB and HDAC1 were highlighted in the PubMed data. This suggests that the potential target genes provided by PubMed data outweigh those in the NCBI database.
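The text-mining step, in which gene symbols are pulled from a corpus of abstracts and tallied as unique genes and unique PMIDs, can be illustrated with a minimal sketch. This is not the pubmed.mineR workflow used in the study: it is plain Python, and the toy corpus, the symbol list, and the function name `map_genes_to_pmids` are all hypothetical, chosen only to show the idea of exact-match symbol lookup.

```python
import re
from collections import defaultdict

# Hypothetical toy corpus of (PMID, abstract text) pairs; not real PubMed records.
CORPUS = [
    ("1001", "IL6 and MYC expression were elevated in prostate tumours."),
    ("1002", "Loss of HDAC1 altered MYC signalling."),
    ("1003", "No gene symbols are mentioned in this abstract."),
]

# Hypothetical symbol list; a real run would need a full gene nomenclature table.
GENE_SYMBOLS = {"IL6", "MYC", "HDAC1", "ACTB"}

# Split text into alphanumeric tokens so punctuation does not hide a symbol.
TOKEN = re.compile(r"[A-Za-z0-9]+")


def map_genes_to_pmids(corpus, symbols):
    """Return {gene symbol: set of PMIDs whose text mentions it exactly}."""
    gene_to_pmids = defaultdict(set)
    for pmid, text in corpus:
        # Deduplicate tokens so each abstract is counted once per gene.
        for token in set(TOKEN.findall(text)):
            if token in symbols:
                gene_to_pmids[token].add(pmid)
    return dict(gene_to_pmids)


gene_to_pmids = map_genes_to_pmids(CORPUS, GENE_SYMBOLS)

# Unique genes found anywhere in the corpus.
unique_genes = sorted(gene_to_pmids)

# Unique PMIDs that mention at least one gene.
unique_pmids = sorted(set().union(*gene_to_pmids.values()))

print(len(unique_genes), "unique genes;", len(unique_pmids), "unique PMIDs")
```

Exact, case-sensitive matching is the simplest possible choice here; it ignores gene aliases and symbols that collide with ordinary words, which a dedicated text-mining package would need to handle.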