Identification of integrated proteomics and transcriptomics signature of alcohol-associated liver disease using machine learning.

Author information

Listopad Stanislav, Magnan Christophe, Day Le Z, Asghar Aliya, Stolz Andrew, Tayek John A, Liu Zhang-Xu, Jacobs Jon M, Morgan Timothy R, Norden-Krichmar Trina M

Affiliations

Department of Computer Science, University of California, Irvine, California, United States of America.

Biological Sciences Division and Environmental and Molecular Sciences Division, Pacific Northwest National Laboratory, Richland, Washington, United States of America.

Publication information

PLOS Digit Health. 2024 Feb 9;3(2):e0000447. doi: 10.1371/journal.pdig.0000447. eCollection 2024 Feb.

Abstract

Distinguishing between alcohol-associated hepatitis (AH) and alcohol-associated cirrhosis (AC) remains a diagnostic challenge. In this study, we used machine learning with transcriptomics and proteomics data from liver tissue and peripheral blood mononuclear cells (PBMCs) to classify patients with alcohol-associated liver disease. The study groups were AH, AC, and healthy controls. We processed 98 PBMC RNAseq samples, 55 PBMC proteomic samples, 48 liver RNAseq samples, and 53 liver proteomic samples. First, we built separate classification and feature selection pipelines for transcriptomics and proteomics data. The liver tissue models were validated in independent liver tissue datasets. Next, we built integrated gene and protein expression models that allowed us to identify combined gene-protein biomarker panels. For liver tissue, using transcriptomic data, we attained 90% nested cross-validation accuracy in our dataset and 82% accuracy in the independent validation dataset. Using proteomic data, we attained 100% nested cross-validation accuracy in our dataset and 61% accuracy in the independent validation dataset. For PBMCs, we attained 83% and 89% accuracy with transcriptomic and proteomic data, respectively. Integrating the two data types improved classification accuracy for PBMCs, but not for liver tissue. We also identified the following gene-protein matches within the gene-protein biomarker panels: CLEC4M-CLC4M and GSTA1-GSTA2 for liver tissue, and SELENBP1-SBP1 for PBMCs. In this study, machine learning models had high classification accuracy for both transcriptomics and proteomics data, across liver tissue and PBMCs. The integration of transcriptomics and proteomics into a multi-omics model yielded improvement in classification accuracy for the PBMC data. The set of integrated gene-protein biomarkers for PBMCs shows promise toward developing a liquid biopsy for alcohol-associated liver disease.
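The abstract reports nested cross-validation accuracy for pipelines that combine feature selection with classification. The following is a minimal sketch of that general idea, not the authors' pipeline: the nearest-centroid classifier, the mean-difference feature score, the fold counts, the candidate feature counts, and the synthetic data are all assumptions chosen to keep the example dependency-free. The point it illustrates is that feature selection and tuning happen only inside the training portion of each outer fold, so the outer test samples never influence the selected panel.

```python
import random
from statistics import mean

def make_toy_data(n_per_class=20, n_features=20, seed=1):
    """Synthetic data (NOT study data): 3 classes, first 3 features informative."""
    rng = random.Random(seed)
    offsets = {"AH": 0.0, "AC": 3.0, "CT": 6.0}  # hypothetical class shifts
    X, y = [], []
    for label, shift in offsets.items():
        for _ in range(n_per_class):
            row = [rng.gauss(shift if j < 3 else 0.0, 1.0) for j in range(n_features)]
            X.append(row)
            y.append(label)
    return X, y

def make_folds(n, k, rng):
    """Shuffle indices 0..n-1 and deal them into k folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def subset(X, y, idx):
    return [X[i] for i in idx], [y[i] for i in idx]

def rank_features(X, y):
    """Rank features by the spread of their class means (a simple filter score)."""
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        means = [mean(r[j] for r, t in zip(X, y) if t == c) for c in classes]
        scores.append(max(means) - min(means))
    return sorted(range(len(scores)), key=lambda j: -scores[j])

def accuracy(X_tr, y_tr, X_te, y_te, k_feats):
    """Select features and fit centroids on training data only, then score test data."""
    feats = rank_features(X_tr, y_tr)[:k_feats]
    centroids = {
        c: [mean(r[j] for r, t in zip(X_tr, y_tr) if t == c) for j in feats]
        for c in sorted(set(y_tr))
    }
    def predict(row):
        return min(
            centroids,
            key=lambda c: sum((row[j] - m) ** 2 for j, m in zip(feats, centroids[c])),
        )
    return sum(predict(r) == t for r, t in zip(X_te, y_te)) / len(y_te)

def nested_cv(X, y, outer_k=5, inner_k=3, grid=(1, 2, 5), seed=0):
    """Outer loop estimates accuracy; inner loop picks the number of features."""
    rng = random.Random(seed)
    outer = make_folds(len(y), outer_k, rng)
    outer_scores = []
    for f, test_idx in enumerate(outer):
        train_idx = [i for g, fold in enumerate(outer) if g != f for i in fold]
        X_tr, y_tr = subset(X, y, train_idx)
        X_te, y_te = subset(X, y, test_idx)
        inner = make_folds(len(y_tr), inner_k, rng)

        def inner_score(k_feats):
            accs = []
            for h, val_idx in enumerate(inner):
                fit_idx = [i for g, fold in enumerate(inner) if g != h for i in fold]
                accs.append(accuracy(*subset(X_tr, y_tr, fit_idx),
                                     *subset(X_tr, y_tr, val_idx), k_feats))
            return mean(accs)

        best_k = max(grid, key=inner_score)  # tuned without touching the outer test fold
        outer_scores.append(accuracy(X_tr, y_tr, X_te, y_te, best_k))
    return mean(outer_scores)

if __name__ == "__main__":
    X, y = make_toy_data()
    print(f"nested CV accuracy on toy data: {nested_cv(X, y):.2f}")
```

Because selection is repeated inside every outer fold, the selected feature set can differ between folds. The gap the abstract reports between internal and independent-validation accuracy (for example, 100% versus 61% for liver proteomics) shows why a check on external data is still needed even with nested cross-validation.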

