WGCNA combined with machine learning to find potential biomarkers of liver cancer.

Affiliation

Key Laboratory of Basic and Application Research of Beiyao, Heilongjiang University of Chinese Medicine, Ministry of Education, Harbin, China.

Publication information

Medicine (Baltimore). 2023 Dec 15;102(50):e36536. doi: 10.1097/MD.0000000000036536.

Abstract

The incidence of hepatocellular carcinoma (HCC) has been increasing in recent years. With the development of various detection technologies, machine learning has become an effective method for screening disease characteristic genes. This study combines weighted gene co-expression network analysis (WGCNA) with machine learning to find potential biomarkers of liver cancer, offering a new approach to future prediction, prevention, and personalized treatment. Differentially expressed genes were screened with the "limma" software package using P < .05 and |log2 fold-change| > 1 as the criteria, and were then intersected with the module genes obtained by WGCNA to obtain the key module genes. Gene Ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were performed on the key module genes, and 3 machine learning methods, namely least absolute shrinkage and selection operator (LASSO), support vector machine-recursive feature elimination (SVM-RFE), and random forest, were used to screen feature genes. Finally, a validation set was used to verify the feature genes, the GeneMANIA (http://www.genemania.org) database was used to perform protein-protein interaction network analysis on the feature genes, and the SPIED3 database was used to find potential small-molecule drugs. In total, 187 genes associated with HCC were identified using "limma" and WGCNA. Subsequently, 6 feature genes (AADAT, APOF, GPC3, LPA, MASP1, and NAT2) were selected by the random forest, LASSO, and SVM-RFE algorithms. These genes also differed significantly in the external dataset and followed the same trend as in the training set. These findings may provide new insights into targets for the diagnosis, prevention, and treatment of HCC: AADAT, APOF, GPC3, LPA, MASP1, and NAT2 may be potential genes for the prediction, prevention, and treatment of liver cancer.
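
The screening logic in the abstract (threshold the differential-expression results, intersect them with the WGCNA module genes, then compare the picks of the three selectors) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the study used the R "limma" package, all statistics and the placeholder genes `GENE_A` and `GENE_B` below are made up, the helper names `screen_degs` and `intersect_all` are hypothetical, and the sketch assumes the final feature genes are those chosen by all three selectors, which the abstract does not state explicitly.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical differential-expression results: gene -> (log2 fold-change, P value).
# The numbers are invented for illustration and are not taken from the study.
DE_RESULTS: Dict[str, Tuple[float, float]] = {
    "GPC3": (3.2, 0.0001),
    "NAT2": (-1.8, 0.003),
    "APOF": (-2.1, 0.0005),
    "GENE_A": (0.4, 0.01),  # fails the fold-change cutoff
    "GENE_B": (1.5, 0.20),  # fails the P-value cutoff
}


def screen_degs(
    results: Dict[str, Tuple[float, float]],
    p_cutoff: float = 0.05,
    lfc_cutoff: float = 1.0,
) -> Set[str]:
    """Keep genes with P < p_cutoff and |log2 fold-change| > lfc_cutoff."""
    return {
        gene
        for gene, (log2_fc, p_value) in results.items()
        if p_value < p_cutoff and abs(log2_fc) > lfc_cutoff
    }


def intersect_all(*gene_sets: Iterable[str]) -> Set[str]:
    """Return the genes present in every supplied collection."""
    sets = [set(s) for s in gene_sets]
    return set.intersection(*sets) if sets else set()


# Step 1: differential genes that pass both thresholds.
degs = screen_degs(DE_RESULTS)

# Step 2: intersect with a (hypothetical) WGCNA module to get key module genes.
module_genes = {"GPC3", "NAT2", "APOF", "GENE_A"}
key_module_genes = intersect_all(degs, module_genes)

# Step 3: hypothetical picks from the three selectors; keep the consensus.
lasso_pick = {"GPC3", "NAT2", "APOF"}
svm_rfe_pick = {"GPC3", "NAT2"}
rf_pick = {"GPC3", "NAT2", "APOF"}
feature_genes = intersect_all(lasso_pick, svm_rfe_pick, rf_pick, key_module_genes)

print(sorted(feature_genes))
```

Taking the consensus of several selectors is a conservative design choice: a gene survives only if every method ranks it as informative, which trades sensitivity for robustness against the quirks of any single algorithm.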
