Liu Yining, Zhao Min, Qu Hong
The School of Public Health, Institute for Chemical Carcinogenesis, Guangzhou Medical University, Guangzhou 510180, China.
School of Science, Technology and Engineering, University of the Sunshine Coast, Maroochydore, QLD 4558, Australia.
Biology (Basel). 2023 Feb 24;12(3):357. doi: 10.3390/biology12030357.
The molecular subtype is critical for accurate treatment and follow-up in patients with lung cancer; however, information regarding subtype-associated genes is dispersed among thousands of published studies. Systematic curation and cross-validation of the scientific literature would provide a solid foundation for comparative genetic studies of the major molecular subtypes of lung cancer. Here, we constructed a literature-based lung cancer gene database (LCGene). In the current release, we collected and curated 2507 unique human genes, including 2267 protein-coding and 240 non-coding genes from comprehensive manual examination of 10,960 PubMed article abstracts. Extensive annotations were added to aid identification of differentially expressed genes, potential gene editing sites, and non-coding gene regulation. For instance, we prepared 607 curated genes with CRISPR knockout information in 43 lung cancer cell lines. Further comparison of these implicated genes among different subtypes identified several subtype-specific genes with high mutational frequencies. Common tumor suppressors and oncogenes shared by lung adenocarcinoma and lung squamous cell carcinoma, for example, exhibited different mutational frequencies and prognostic features, suggesting the presence of subtype-specific biomarkers. Our retrospective analysis revealed 43 small cell lung cancer-specific genes. Moreover, 52 tumor suppressors and oncogenes shared by lung adenocarcinoma and squamous cell carcinoma confirmed the different molecular mechanisms of these two cancer subtypes. The subtype-based genetic differences, when combined, may provide insight into subtype-specific biomarkers for genetic testing.