Barrero-Rodríguez Rafael, Rodriguez Jose Manuel, Tarifa Rocío, Vázquez Jesús, Mastrangelo Annalaura, Ferrarini Alessia
Cardiovascular Proteomics Laboratory, Centro Nacional de Investigaciones Cardiovasculares (CNIC), Madrid, Spain.
Immunobiology Laboratory, Centro Nacional de Investigaciones Cardiovasculares (CNIC), Madrid, Spain.
Front Mol Biosci. 2022 Sep 8;9:952149. doi: 10.3389/fmolb.2022.952149. eCollection 2022.
Untargeted metabolomics aims to measure the entire set of metabolites in a wide range of biological samples. However, because metabolites are chemically very diverse, ranging from small molecules to larger and more complex ones (e.g., amino acids/carbohydrates vs. phospholipids/gangliosides), the identification and characterization of the metabolome remain a major bottleneck. The first step of this process consists of searching the experimental monoisotopic mass against databases, which yields a highly redundant and complex list of candidates. Despite progress in this area, researchers are still forced to explore the resulting table manually in order to prioritize the most likely identifications for further biological interpretation or confirmation with standards. Here, we present TurboPutative (https://proteomics.cnic.es/TurboPutative/), a flexible and user-friendly web-based platform composed of four modules (Tagger, REname, RowMerger, and TPMetrics) that streamlines data handling, classification, and interpretability of untargeted LC-MS-based metabolomics data. Tagger classifies the different compounds and provides preliminary insights into the biological system studied. REname improves the handling and visualization of putative annotations, allowing the recognition of isomers and equivalent compounds and the removal of redundant data. RowMerger reduces the dataset size, facilitating manual comparison among annotations. Finally, TPMetrics combines different datasets with feature intensity and other information relevant to the researcher, and calculates a score based on adduct probability and feature correlations, facilitating further identification, assessment, and interpretation of the results.
The TurboPutative web application allows metabolomics researchers who are dealing with massive datasets containing multiple putative annotations to reduce the number of these entries by 80%-90%, thus facilitating the extraction of biological knowledge and improving metabolite prioritization for subsequent pathway analysis. TurboPutative provides a rapid, automated, and customizable workflow that can also be included in programmed bioinformatics pipelines through its RESTful API services. Users can explore the performance of each module through demo datasets supplied on the website. The platform will help the metabolomics community speed up the arduous manual data curation required in the first steps of metabolite identification, improving the generation of biological knowledge.
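To illustrate why the initial mass search described above produces redundant candidate lists, the following minimal Python sketch matches one observed m/z against a tiny, hypothetical database and then collapses the hits into a single row. It is not TurboPutative's implementation: the database contents, the function names (`search_mass`, `merge_candidates`), the restriction to the [M+H]+ adduct, the 5 ppm tolerance, and the ` // ` separator are all assumptions made for illustration.

```python
# Illustrative sketch only; not the TurboPutative code or API.
PROTON_MASS = 1.007276  # Da, mass added for an [M+H]+ adduct

# Hypothetical mini-database: (name, monoisotopic neutral mass in Da).
TOY_DATABASE = [
    ("L-Leucine", 131.094629),
    ("L-Isoleucine", 131.094629),
    ("L-Norleucine", 131.094629),
    ("Creatine", 131.069477),
    ("D-Glucose", 180.063388),
]


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def search_mass(observed_mz, database, tol_ppm=5.0):
    """Return names whose [M+H]+ m/z lies within tol_ppm of observed_mz."""
    hits = []
    for name, neutral_mass in database:
        theoretical_mz = neutral_mass + PROTON_MASS
        if abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm:
            hits.append(name)
    return hits


def merge_candidates(observed_mz, database, tol_ppm=5.0):
    """Collapse all candidates for one feature into a single table row."""
    hits = search_mass(observed_mz, database, tol_ppm)
    return {"mz": observed_mz, "n_candidates": len(hits),
            "names": " // ".join(hits)}


if __name__ == "__main__":
    # One feature, several isomeric candidates sharing the same mass.
    print(search_mass(132.1019, TOY_DATABASE))
    print(merge_candidates(132.1019, TOY_DATABASE))
```

Because isomers share a monoisotopic mass, a single feature returns several indistinguishable candidates. Real searches against large databases and multiple adducts multiply this effect, which is the redundancy the modules above are meant to reduce.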