Machine Learning-Based Identification of Potentially Novel Non-Alcoholic Fatty Liver Disease Biomarkers.

Authors

Shafiha Roshan, Bahcivanci Basak, Gkoutos Georgios V, Acharjee Animesh

Affiliations

Centre for Computational Biology, Institute of Cancer and Genomic Sciences, University of Birmingham, Birmingham B15 2TT, UK.

Institute of Translational Medicine, University of Birmingham, Birmingham B15 2TT, UK.

Publication information

Biomedicines. 2021 Nov 7;9(11):1636. doi: 10.3390/biomedicines9111636.

Abstract

Non-alcoholic fatty liver disease (NAFLD) is a chronic liver disease that presents a great challenge for treatment and prevention. This study aims to implement a machine learning approach that uses omics datasets to identify potential biomarker targets. We developed a pipeline to identify potential biomarkers for NAFLD that comprises five major steps: pre-processing, feature selection, generation of a random forest model, downstream feature analysis and biological interpretation. The pre-processing step includes data normalisation and variable extraction accompanied by appropriate annotations. Feature selection based on a differential gene expression analysis is then conducted to identify significant features, which are used to generate a random forest model whose performance is assessed using a receiver operating characteristic curve. Next, the features are subjected to downstream analyses, such as univariate analysis, pathway enrichment analysis, network analysis and the generation of correlation plots, boxplots and heatmaps. Once the results are obtained, biological interpretation and literature validation are conducted on the identified features and results. We applied this pipeline to transcriptomic and lipidomic datasets and concluded that the C4BPA gene could play a role in the development of NAFLD. Activation of the complement pathway, due to downregulation of the C4BPA gene, leads to an increase in triglyceride content, which might further affect lipid metabolism. This approach identified the C4BPA gene, an inhibitor of the complement pathway, as a potential biomarker for the development of NAFLD.
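The feature selection and ROC assessment steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the random forest step is omitted (in practice a library classifier such as scikit-learn's `RandomForestClassifier` would sit between the two steps), and Welch's t statistic stands in for a full differential expression analysis. All names here (`welch_t`, `roc_auc`, `data`, `labels`) are hypothetical.

```python
import math
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples with unequal variances."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def roc_auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Count case/control pairs that are ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic data (an assumption, not the study's data): 40 samples x 5 features.
random.seed(0)
labels = [0] * 20 + [1] * 20  # 0 = control, 1 = case (hypothetical coding)
n_features = 5
data = [[random.gauss(0.0, 1.0) for _ in range(n_features)] for _ in labels]
for row, y in zip(data, labels):
    if y == 1:
        row[2] -= 3.0  # feature 2 is made "downregulated" in cases

# Feature screening: rank features by the absolute Welch t statistic.
t_stats = []
for j in range(n_features):
    cases = [row[j] for row, y in zip(data, labels) if y == 1]
    controls = [row[j] for row, y in zip(data, labels) if y == 0]
    t_stats.append(welch_t(cases, controls))
top_feature = max(range(n_features), key=lambda j: abs(t_stats[j]))

# Use the top feature as a score, flipping its sign if it is lower in cases,
# so that higher scores always point towards the case group.
sign = 1.0 if t_stats[top_feature] > 0 else -1.0
scores = [sign * row[top_feature] for row in data]
auc = roc_auc(scores, labels)
```

Because the feature is selected and scored on the same samples, the AUC from this sketch is optimistic; a real analysis would evaluate on held-out data or with cross-validation.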


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b8e1/8615894/b2613ce7a185/biomedicines-09-01636-g001.jpg
