Zhejiang Chinese Medical University, Hangzhou, China.
The Second Affiliated Hospital of Zhejiang Chinese Medical University, Hangzhou, China.
Medicine (Baltimore). 2024 May 10;103(19):e38076. doi: 10.1097/MD.0000000000038076.
Nonalcoholic fatty liver disease (NAFLD) is a common liver disease affecting the global population, and its impact on human health will continue to increase. Genetic susceptibility is an important factor influencing its onset and progression, yet reliable methods for predicting the susceptibility of normal populations to NAFLD from appropriate genes are lacking.
RNA sequencing data relating to NAFLD were analyzed using the "limma" package in R. Differentially expressed genes were obtained through preliminary intersection screening. Core genes were identified by building and comparing 4 machine learning models, and a prediction model for NAFLD was then constructed. The effectiveness of the model was evaluated, and its applicability and reliability were verified. Finally, we conducted gene correlation analysis, biological function analysis, and immune infiltration analysis.
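The screening step can be illustrated with a minimal sketch. It is not the limma workflow used in the study: limma fits linear models with moderated t-statistics in R, whereas this stand-in uses a plain Welch t statistic and a log2 fold-change cutoff from the Python standard library. The gene names, sample names, expression values, and thresholds are all hypothetical, and treating "intersection screening" as the overlap of hits from two datasets is an assumption, since the abstract does not define it.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples (assumes non-zero variance)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def screen_genes(expr, cases, controls, min_abs_lfc=1.0, min_abs_t=2.0):
    """Return genes whose log2 fold change and t statistic both pass the cutoffs.

    expr maps gene -> {sample: log2 expression}. The cutoffs are illustrative,
    not the thresholds used in the study.
    """
    selected = []
    for gene, values in expr.items():
        a = [values[s] for s in cases]
        b = [values[s] for s in controls]
        lfc = mean(a) - mean(b)  # values are already on the log2 scale
        if abs(lfc) >= min_abs_lfc and abs(welch_t(a, b)) >= min_abs_t:
            selected.append(gene)
    return sorted(selected)


def intersect_hits(*hit_lists):
    """Keep only genes that were selected in every dataset (assumed meaning of intersection screening)."""
    common = set(hit_lists[0])
    for hits in hit_lists[1:]:
        common &= set(hits)
    return sorted(common)


# Hypothetical toy data: three case samples (n1-n3) and three controls (c1-c3).
toy_expr = {
    "GENE_A": {"n1": 8.0, "n2": 8.4, "n3": 8.2, "c1": 5.9, "c2": 6.1, "c3": 6.0},
    "GENE_B": {"n1": 7.0, "n2": 7.2, "n3": 6.9, "c1": 7.1, "c2": 6.8, "c3": 7.0},
}
toy_hits = screen_genes(toy_expr, ["n1", "n2", "n3"], ["c1", "c2", "c3"])
```

On this toy input only GENE_A passes both cutoffs. A real analysis would also correct for multiple testing, which the sketch omits.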
By comparing 4 machine learning algorithms, we identified the support vector machine (SVM) as the optimal model, with its 6 top-ranked genes (CD247, S100A9, CSF3R, DIP2C, OXCT2, and PRAMEF16) as predictive genes. The nomogram showed good reliability and effectiveness. The receiver operating characteristic (ROC) curves of the 6 genes indicated high predictive value and suggested an essential role in NAFLD pathogenesis. Further immunological analysis demonstrated that these 6 genes were closely connected to various immune cells and pathways.
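As an illustration of how the predictive value of a single marker is typically summarized from an ROC curve, the following sketch computes the area under the curve (AUC) using the rank-based (Mann-Whitney) formulation. The scores and labels are hypothetical and are not results from the study; the function name is likewise illustrative.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve.

    Computed as the probability that a randomly chosen positive sample
    scores higher than a randomly chosen negative one; ties count one half.
    labels use 1 for disease and 0 for control.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical expression-based scores for six samples.
toy_scores = [0.9, 0.8, 0.4, 0.35, 0.3, 0.1]
toy_labels = [1, 1, 0, 1, 0, 0]
toy_auc = roc_auc(toy_scores, toy_labels)
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of cases from controls.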
This study constructed a reliable prediction model based on 6 diagnostic gene markers for predicting the susceptibility of normal populations to NAFLD, and it provides insights for potential targeted therapies.