Cifuentes Paula, Zamora Ismael, Radchenko Tatiana, Fontaine Fabien, Garriga Albert, Morettoni Luca, Christensen Jesper Kammersgaard, Helleberg Hans, Becker Bridget A
Pompeu Fabra University, Barcelona, Spain.
Lead Molecular Design, S.L., Sant Cugat del Valles, Spain.
PLoS One. 2025 Aug 13;20(8):e0324668. doi: 10.1371/journal.pone.0324668. eCollection 2025.
A comprehensive understanding of drug metabolism is crucial for advancing drug development. Automation has improved many stages of this process, from compound procurement to data analysis, but significant challenges persist in the metabolite identification (MetID) of macromolecules because of their size, structural complexity, and associated computational demands. This study introduces new algorithms for automated Liquid Chromatography-High-Resolution Mass Spectrometry (LC-HRMS) data analysis applicable to macromolecules. A novel peak detection approach based on the most abundant mass (MaM) is presented and systematically compared with the monoisotopic mass (MiM) approach commonly used in small-molecule MetID. Additionally, three structure visualization strategies, each with a distinct fragmentation strategy, are evaluated for their impact on computational processing time and interpretability: expanded (atom-level), non-expanded (monomer-level), and a hybrid mode. The workflow was validated using six diverse datasets comprising linear and cyclic peptides and oligonucleotides with both natural and unnatural monomers, covering a molecular weight range of 700-7630 Da. A total of 970 metabolites were identified under various experimental and ionization conditions. The MaM algorithm yielded higher scores and a greater number of matches, increasing confidence in the accurate prediction of metabolite structures, while the non-expanded visualization significantly reduced processing times (ranging from minutes to under an hour for most peptides). Furthermore, the visualization algorithm, which integrates monomer-level and atom/bond notation, enables clear localization of metabolic biotransformations. Compared to previous studies, the proposed workflow demonstrated reduced processing time, consistent detection of degradation products, and enhanced visualization capabilities, advancing automated MetID for macromolecules.
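The difference between the monoisotopic mass (MiM) and the most abundant mass (MaM) can be illustrated with a minimal Python sketch. It is not the authors' peak detection algorithm. It only computes a nominal-mass-binned isotope pattern by convolving per-element isotope distributions, then reports the all-lightest-isotope mass (MiM) and the mean mass of the most intense isotope bin (MaM). The function names (`isotope_pattern`, `mim_and_mam`), the pruning threshold, and the example formulas are assumptions made for illustration, and the isotope masses and abundances are rounded standard values.

```python
from collections import defaultdict

# Isotope (mass in Da, natural abundance), lightest isotope first.
# Rounded standard values; enough for an illustration.
ISOTOPES = {
    "C": [(12.0, 0.9893), (13.0033548, 0.0107)],
    "H": [(1.00782503, 0.999885), (2.01410178, 0.000115)],
    "N": [(14.0030740, 0.99636), (15.0001089, 0.00364)],
    "O": [(15.9949146, 0.99757), (16.9991317, 0.00038), (17.9991596, 0.00205)],
    "S": [(31.9720707, 0.9499), (32.9714585, 0.0075),
          (33.9678669, 0.0425), (35.9670808, 0.0001)],
}


def element_dist(symbol):
    """Map nominal mass -> (probability, exact mass) for one atom."""
    return {round(mass): (prob, mass) for mass, prob in ISOTOPES[symbol]}


def convolve(a, b, prune=1e-15):
    """Combine two binned distributions; each bin keeps its mean exact mass."""
    acc = defaultdict(lambda: [0.0, 0.0])  # nominal -> [prob, prob * mass]
    for na, (pa, ma) in a.items():
        for nb, (pb, mb) in b.items():
            p = pa * pb
            if p < prune:  # drop negligible isotopologue combinations
                continue
            cell = acc[na + nb]
            cell[0] += p
            cell[1] += p * (ma + mb)
    return {n: (p, pm / p) for n, (p, pm) in acc.items()}


def isotope_pattern(formula):
    """Isotope pattern of a formula given as {element symbol: atom count}."""
    total = {0: (1.0, 0.0)}
    for symbol, count in formula.items():
        base = element_dist(symbol)
        while count:  # exponentiation by squaring over the atom count
            if count & 1:
                total = convolve(total, base)
            count >>= 1
            if count:
                base = convolve(base, base)
    return total


def mim_and_mam(formula):
    """Return (monoisotopic mass, most abundant mass) in Da."""
    pattern = isotope_pattern(formula)
    # MiM: every atom is its lightest isotope (first entry in ISOTOPES).
    mim = sum(n * ISOTOPES[s][0][0] for s, n in formula.items())
    # MaM: mean exact mass of the most probable nominal-mass bin.
    _, mam = max(pattern.values(), key=lambda v: v[0])
    return mim, mam


if __name__ == "__main__":
    examples = {
        "glycine": {"C": 2, "H": 5, "N": 1, "O": 2},
        # Elemental composition of human insulin, used only as a
        # peptide-sized illustration; it is not taken from the study.
        "insulin": {"C": 257, "H": 383, "N": 65, "O": 77, "S": 6},
    }
    for name, formula in examples.items():
        mim, mam = mim_and_mam(formula)
        print(f"{name}: MiM={mim:.4f} Da, MaM={mam:.4f} Da")
```

For the small molecule the two masses coincide, whereas for the peptide-sized formula the most intense isotope peak lies a few daltons above the monoisotopic peak, which is then only a minor component of the isotope envelope. This is the general motivation for anchoring peak detection on MaM as molecular weight grows.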