GlycoProteomics Laboratory, Department of Parasitology, ICB, University of São Paulo, São Paulo, Brazil.
Cellular and Molecular Biology Laboratory (LIM 15), Neurology Department, Faculdade de Medicina FMUSP, Universidade de São Paulo, São Paulo, Brazil.
Adv Protein Chem Struct Biol. 2022;131:311-339. doi: 10.1016/bs.apcsb.2022.04.002. Epub 2022 May 12.
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) was first identified in late 2019 in Wuhan, China, and has proven to be highly pathogenic, making it a global public health threat. The urgent need to understand the mechanisms and impact of the virus brought omics techniques to the fore, as they can offer a holistic and comprehensive view of thousands of molecules in a single experiment. Proficiency with bioinformatics tools to process, analyze, integrate, and interpret omics data is a valuable skill that enriches results. We present a robust, open-access computational pipeline for extracting information from public quantitative proteomics and transcriptomics data, covering every step from raw data to differentially expressed genes. We explore the processes and pathways associated with the mapped transcripts and proteins. We also present a pipeline to integrate and compare proteomics and transcriptomics data using packages available in Bioconductor, and we provide the code used. Cholesterol metabolism, immune system activity, extracellular matrix (ECM), and proteasomal degradation pathways were increased in infected patients. A leukocyte activation profile was overrepresented in both the proteomics and transcriptomics data. Finally, we found a panel of proteins and transcripts regulated in the same direction in the lung transcriptome and plasma proteome that distinguishes healthy from infected individuals. This panel of markers was confirmed in another cohort of patients, thus validating the robustness and functionality of the tools presented.
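The abstract describes selecting proteins and transcripts that are regulated in the same direction in both data layers. The following is a minimal sketch of that idea only, not the authors' code: their pipeline uses Bioconductor packages in R, whereas this illustration is plain Python. The function name `concordant_panel`, the `min_abs_lfc` threshold, and the gene identifiers and log2 fold-change values are all hypothetical, and a real analysis would also filter on statistical significance.

```python
from typing import Dict, List, Tuple


def concordant_panel(
    transcript_lfc: Dict[str, float],
    protein_lfc: Dict[str, float],
    min_abs_lfc: float = 1.0,  # hypothetical cutoff, not taken from the paper
) -> List[Tuple[str, str]]:
    """Return (gene, direction) pairs that change in the same direction
    in both the transcriptomics and the proteomics data."""
    panel = []
    # Only genes quantified in both layers can be compared.
    for gene in sorted(transcript_lfc.keys() & protein_lfc.keys()):
        t = transcript_lfc[gene]
        p = protein_lfc[gene]
        # Skip genes whose change is too small in either layer.
        if abs(t) < min_abs_lfc or abs(p) < min_abs_lfc:
            continue
        if t > 0 and p > 0:
            panel.append((gene, "up"))
        elif t < 0 and p < 0:
            panel.append((gene, "down"))
        # Opposite signs are discordant and are left out of the panel.
    return panel


# Made-up log2 fold changes (infected vs. healthy), for illustration only.
transcripts = {"GENE_A": 2.1, "GENE_B": -1.6, "GENE_C": 1.4, "GENE_D": 0.2}
proteins = {"GENE_A": 1.3, "GENE_B": -1.1, "GENE_C": -1.8, "GENE_E": 2.0}

print(concordant_panel(transcripts, proteins))
```

In this toy input, `GENE_C` is dropped because its sign differs between layers, and `GENE_D` and `GENE_E` are dropped because each was measured in only one layer.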