Li Shuzhao, Zheng Shujian
Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA.
bioRxiv. 2023 Jan 4:2023.01.04.522722. doi: 10.1101/2023.01.04.522722.
In untargeted metabolomics, multiple ions are often measured for each original metabolite, including isotopic forms and in-source modifications such as adducts and fragments. Without prior knowledge of the chemical identity or formula, organizing and interpreting these ions computationally is challenging, and this is where previous software tools that perform the task with network algorithms fall short. We propose a generalized tree structure that annotates each ion by its relationship to the original compound and infers the neutral mass. We present an algorithm that converts mass distance networks to this tree structure with high fidelity. The method is useful for both regular untargeted metabolomics and stable isotope tracing experiments. It is implemented as a Python package (khipu) and provides a JSON format for easy data exchange and software interoperability. Through generalized pre-annotation, khipu makes it feasible to connect metabolomics data with common data science tools and supports flexible experimental designs.
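The idea of turning a mass distance network into a tree can be illustrated with a minimal sketch. This is not the khipu API or the authors' algorithm; it is a simplified stand-in under stated assumptions. The function names (`build_edges`, `grow_trees`), the two mass differences considered, the 5 ppm tolerance, the assumption that the lowest m/z ion in each group is the M+H+ species, and the example m/z values (computed for glucose, plus one unrelated ion) are all hypothetical choices made for illustration.

```python
import json
from itertools import combinations

PROTON = 1.007276  # mass of a proton (Da)
# Assumed mass differences for positive-mode data (illustrative subset only).
MASS_DIFFS = {"13C isotope": 1.003355, "Na/H adduct": 21.981942}

def build_edges(mzs, ppm=5.0):
    """Link pairs of ions whose m/z difference matches a known mass distance."""
    edges = []
    for i, j in combinations(range(len(mzs)), 2):
        lo, hi = sorted((i, j), key=lambda k: mzs[k])
        delta = mzs[hi] - mzs[lo]
        tol = mzs[hi] * ppm * 1e-6  # absolute tolerance derived from ppm
        for label, diff in MASS_DIFFS.items():
            if abs(delta - diff) <= tol:
                edges.append((lo, hi, label))
    return edges

def grow_trees(mzs, edges):
    """Convert each connected group of ions into a tree rooted at its lowest m/z ion."""
    adjacency = {}
    for lo, hi, label in edges:
        adjacency.setdefault(lo, []).append((hi, label))
        adjacency.setdefault(hi, []).append((lo, label))
    seen, trees = set(), []
    # Visiting nodes by ascending m/z makes each root the lightest ion of its group.
    for start in sorted(adjacency, key=lambda k: mzs[k]):
        if start in seen:
            continue
        seen.add(start)
        tree = {
            "root": start,
            # Assumption: the root ion is M+H+, so neutral mass = m/z - proton.
            "neutral_mass": round(mzs[start] - PROTON, 4),
            "branches": [],  # (parent index, child index, relationship)
        }
        stack = [start]
        while stack:
            node = stack.pop()
            for nxt, label in adjacency[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    tree["branches"].append((node, nxt, label))
                    stack.append(nxt)
        trees.append(tree)
    return trees

# Hypothetical input: M+H+, its 13C isotopologue, M+Na+, and an unrelated ion.
mzs = [181.070666, 182.074021, 203.052608, 300.1234]
edges = build_edges(mzs)
trees = grow_trees(mzs, edges)
print(json.dumps(trees, indent=2))
```

Because each ion is attached to the tree only once, cycles in the network are dropped and every ion ends up with a single annotated path back to the root. The resulting dictionaries serialize directly to JSON, which is the kind of exchange format the abstract describes.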