Zhang Fangfei, Luna Augustin, Tan Tingting, Chen Yingdan, Sander Chris, Guo Tiannan
bioRxiv. 2022 Sep 30:2022.09.27.509819. doi: 10.1101/2022.09.27.509819.
The ongoing coronavirus disease 2019 (COVID-19) pandemic, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), still has limited treatment options, partly because the molecular dysregulation in COVID-19 patients is incompletely understood. We aimed to build a repository and data analysis tools for examining the proteins modulated in COVID-19 patients, to support the discovery of potential therapeutic targets and diagnostic biomarkers.
We built a web server containing proteomic expression data from COVID-19 patients, together with a toolset for user-friendly data analysis and visualization. The web resource covers expert-curated proteomic data from COVID-19 patients published before May 2022. The data were collected from ProteomeXchange and from selected publications identified via PubMed searches, and were aggregated into a comprehensive dataset. Protein expression was compared between disease subgroups across projects by examining differentially expressed proteins. Differentially expressed pathways and proteins were also visualized. Moreover, circulating proteins that differentiated severe cases were nominated as predictive biomarkers.
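As an illustration of what a subgroup comparison for a single protein can look like, the following minimal sketch computes a log2 fold change and Welch's t statistic between two groups. It is not the COVIDpro implementation: the function names and the intensity values are hypothetical, and the choice of Welch's t statistic is an assumption made here for illustration only.

```python
import math
from statistics import mean, variance


def log2_fold_change(case, control):
    """Difference of mean log2 intensities (case minus control)."""
    return mean(case) - mean(control)


def welch_t(case, control):
    """Welch's t statistic for two groups with possibly unequal variances."""
    se = math.sqrt(variance(case) / len(case) + variance(control) / len(control))
    return (mean(case) - mean(control)) / se


# Hypothetical log2-transformed intensities for one protein in one project.
severe = [12.1, 12.6, 12.9, 12.4]
non_severe = [11.0, 11.3, 10.8, 11.1]

lfc = log2_fold_change(severe, non_severe)
t_stat = welch_t(severe, non_severe)
print(f"log2 fold change = {lfc:.2f}, Welch t = {t_stat:.2f}")
```

In practice such a statistic would be converted to a p-value and corrected for multiple testing across all proteins; those steps are omitted here.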
We built and maintain a web server, COVIDpro (https://www.guomics.com/covidPro/), containing proteomics data generated by 41 original studies from 32 hospitals worldwide, with data from 3077 patients covering 19 types of clinical specimens, the majority being plasma and sera. In total, 53 protein expression matrices were collected, comprising 5434 samples and 14,403 unique proteins. Our analyses showed that lipopolysaccharide-binding protein, identified in the majority of the studies, was highly expressed in the blood samples of patients with severe disease. A panel of significantly dysregulated proteins was identified that separates patients with severe disease from those with non-severe disease. Classification of severe disease based on these proteomic signatures on five test sets reached a mean area under the curve (AUC) of 0.87 and an accuracy (ACC) of 0.80.
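For readers unfamiliar with the reported metrics, the sketch below shows one way a mean AUC and mean accuracy can be computed over several test sets. The labels, scores, and the 0.5 decision threshold are hypothetical toy values, not data or settings from the study, and the classifier itself is not shown.

```python
from statistics import mean


def auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs in which the
    positive sample has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Hypothetical test sets: (labels, predicted severity scores); 1 = severe.
test_sets = [
    ([1, 1, 0, 0], [0.9, 0.6, 0.4, 0.2]),
    ([1, 0, 1, 0], [0.8, 0.7, 0.3, 0.1]),
]

mean_auc = mean(auc(y, s) for y, s in test_sets)
mean_acc = mean(accuracy(y, s) for y, s in test_sets)
print(f"mean AUC = {mean_auc:.3f}, mean ACC = {mean_acc:.3f}")
```

The pairwise definition of AUC used here is quadratic in the number of samples, which is acceptable for a small illustration; a rank-based computation would be preferable for large test sets.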
COVIDpro is an online database with an integrated analysis toolkit. It is a unique and valuable resource for testing hypotheses and identifying proteins or pathways that could be targeted by new treatments for COVID-19 patients.
National Key R&D Program of China: Key PDPM technologies (2021YFA1301602, 2021YFA1301601, 2021YFA1301603), Zhejiang Provincial Natural Science Foundation for Distinguished Young Scholars (LR19C050001), Hangzhou Agriculture and Society Advancement Program (20190101A04), National Natural Science Foundation of China (81972492) and National Science Fund for Young Scholars (21904107), National Resource for Network Biology (NRNB) from the National Institute of General Medical Sciences (NIGMS-P41 GM103504).
Although an increasing number of therapies against COVID-19 are being developed, they remain insufficient, especially with the rise of new variants of concern. This is partly due to our incomplete understanding of the disease's mechanisms. As data have been collected worldwide, several questions are now worth addressing via meta-analyses. Most COVID-19 drugs function by targeting or affecting proteins, so the effectiveness of, and resistance to, therapeutics can be assessed via protein measurements. Empowered by mass spectrometry-based proteomics, protein expression has been characterized in a variety of patient specimens, including body fluids (e.g., serum, plasma, urine) and tissue (i.e., formalin-fixed and paraffin-embedded (FFPE) tissue). We expert-curated proteomic expression data from COVID-19 patients published before May 2022, drawn from the largest proteomic data repository, ProteomeXchange, as well as from literature search engines. Using this resource, a COVID-19 proteome meta-analysis could provide useful insights into the mechanisms of the disease and identify new potential drug targets. We integrated many published datasets from patients with COVID-19 from 11 nations, covering over 3000 patients and 5434 proteome measurements. We collected these datasets in an online database and built a toolbox to easily explore, analyze, and visualize the data. Next, we used the database and its associated toolbox to identify new proteins of diagnostic and therapeutic value for COVID-19. In particular, we identified a set of significantly dysregulated proteins for distinguishing severe from non-severe patients using serum samples. COVIDpro will support the navigation and analysis of patterns of dysregulated proteins in various COVID-19 clinical specimens for the identification and verification of protein biomarkers and potential therapeutic targets.