Patel Maitray A, Daley Mark, Van Nynatten Logan R, Slessarev Marat, Cepinskas Gediminas, Fraser Douglas D
Epidemiology and Biostatistics, Western University, London, ON, N6A 3K7, Canada.
Computer Science, Western University, London, ON, N6A 3K7, Canada.
Clin Proteomics. 2024 May 17;21(1):33. doi: 10.1186/s12014-024-09488-3.
COVID-19 is a complex, multi-system disease with varying severity and symptoms. Identifying changes in the proteomes of critically ill COVID-19 patients enables a better understanding of markers associated with susceptibility, symptoms, and treatment. We performed plasma antibody microarray and machine learning analyses to identify novel proteins associated with COVID-19.
We conducted a case-control study comparing the concentrations of 2000 plasma proteins in age- and sex-matched COVID-19 inpatients, non-COVID-19 sepsis controls, and healthy control subjects. Machine learning was used to identify a unique proteome signature in COVID-19 patients. Protein expression was correlated with clinically relevant variables and analyzed for temporal changes over hospitalization days 1, 3, 7, and 10. Expert-curated protein expression information was analyzed with natural language processing (NLP) to determine organ- and cell-specific expression.
Machine learning identified a 28-protein model that accurately differentiated COVID-19 patients from ICU non-COVID-19 patients (accuracy = 0.89, AUC = 1.00, F1 = 0.89) and healthy controls (accuracy = 0.89, AUC = 1.00, F1 = 0.88). An optimal nine-protein model (PF4V1, NUCB1, CrkL, SerpinD1, Fen1, GATA-4, ProSAAS, PARK7, and NET1) maintained high classification ability. Specific proteins correlated with hemoglobin, coagulation factors, hypertension, and high-flow nasal cannula intervention (P < 0.01). Time-course analysis of the 28 leading proteins demonstrated no significant temporal changes within the COVID-19 cohort. NLP analysis identified multi-system expression of the key proteins, with the digestive and nervous systems being the leading systems.
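The three reported metrics measure different things: accuracy and F1 depend on the decision threshold applied to a classifier's scores, whereas AUC depends only on how the scores rank cases above controls. The following is a minimal sketch of how each is computed, using made-up labels and scores rather than study data. The threshold of 0.5 and all function and variable names are assumptions for illustration, and nothing here reproduces the authors' model. It shows one way an AUC of 1.00 can coexist with an accuracy below 1.00.

```python
# Toy illustration only: labels and scores are invented, not taken from the study.

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """Probability that a random positive scores higher than a random negative.

    Ties count as half a win (the Mann-Whitney formulation of AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical hold-out set: 4 cases (1) and 5 controls (0).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.90, 0.80, 0.70, 0.45, 0.40, 0.30, 0.20, 0.10, 0.05]

# Assumed decision threshold of 0.5; the fourth case falls just below it.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(f"accuracy = {accuracy(y_true, y_pred):.2f}")
print(f"AUC      = {roc_auc(y_true, scores):.2f}")
print(f"F1       = {f1_score(y_true, y_pred):.2f}")
```

In this toy set every case scores higher than every control, so the ranking is perfect, yet one case sits below the assumed threshold and is misclassified. Whether the same mechanism explains the figures reported above cannot be determined from the abstract alone.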
The plasma proteome of critically ill COVID-19 patients was distinguishable from that of non-COVID-19 sepsis controls and healthy control subjects. The leading 28 proteins and their subset of 9 proteins yielded accurate classification models and are expressed in multiple organ systems. The identified COVID-19 proteomic signature helps elucidate COVID-19 pathophysiology and may guide future COVID-19 treatment development.