Establishment and Analysis of a Combined Diagnostic Model of Polycystic Ovary Syndrome with Random Forest and Artificial Neural Network.

Affiliations

Women's Hospital, School of Medicine, Zhejiang University, Hangzhou 310006, China.

College of Food Science and Biotechnology, Zhejiang Gongshang University, Hangzhou 310018, China.

Publication information

Biomed Res Int. 2020 Aug 20;2020:2613091. doi: 10.1155/2020/2613091. eCollection 2020.

Abstract

Polycystic ovary syndrome (PCOS) is one of the most common metabolic and reproductive endocrinopathies. However, few studies have tried to develop a diagnostic model based on gene biomarkers. In this study, we applied a computational method that combines two machine learning algorithms, random forest (RF) and artificial neural network (ANN), to identify gene biomarkers and construct a diagnostic model. We collected gene expression data from the Gene Expression Omnibus (GEO) database, comprising 76 PCOS samples and 57 normal samples; five datasets were used: one for screening differentially expressed genes (DEGs), two for training, and two for validation. First, RF identified 12 key genes among 264 DEGs as vital for classifying PCOS and normal samples. Next, the weights of these key genes were calculated using ANN with the microarray and RNA-seq training datasets, respectively. The resulting diagnostic models for the two data types were named neuralPCOS. Finally, the two validation datasets were used to test neuralPCOS and compare its performance with two other sets of marker genes by area under the curve (AUC). Our model achieved an AUC of 0.7273 on the microarray dataset and 0.6488 on the RNA-seq dataset. In conclusion, we uncovered gene biomarkers and developed a novel diagnostic model of PCOS, which may aid diagnosis.
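
The AUC values reported above summarize how well a model's scores separate PCOS samples from normal samples. As a minimal sketch of how such a value can be computed, the Python snippet below uses the pairwise (Mann-Whitney) definition of AUC. The function name `auc_score` and the toy labels and scores are hypothetical illustrations, not the authors' code or data.

```python
def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample receives the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = PCOS, 0 = normal; scores are made-up model outputs.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]
toy_auc = auc_score(labels, scores)
print(f"toy AUC = {toy_auc:.4f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives a scale for reading the reported 0.7273 and 0.6488.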

