Xu Hao, Jiang Tianhang, Lin Yuxiang, Zhang Lei, Yang Huan, Huang Xiaoyun, Mao Ridong, Yang Zhu, Zeng Changchun, Zhao Shuang, Di Lijun, Zhang Wenbin, Zeng Jun, Cai Zongwei, Lin Shu-Hai
The First Affiliated Hospital of Xiamen University, State Key Laboratory of Cellular Stress Biology, School of Life Sciences, XMU-HBN Skin Biomedical Research Center, Xiamen University, Xiamen, Fujian, China.
School of Medicine, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian, China.
Nat Commun. 2025 May 16;16(1):4566. doi: 10.1038/s41467-025-59683-5.
Improving the accuracy, coverage, speed and depth of lipid profile annotation remains a significant challenge for traditional lipid annotation methods. We introduce LipidIN, an advanced framework designed for flash, platform-independent annotation. LipidIN features a hierarchical library of 168.5 million lipid fragmentation entries that encompasses all potential chain compositions and carbon-carbon double bond locations. The expeditious querying module achieves speeds exceeding one hundred billion queries per second across all mass spectral libraries. The lipid categories intelligence model is built on three relative retention time rules; it reduces false positive annotations and predicts unannotated lipids with an estimated false discovery rate of 5.7%, covering 8923 lipids across various species. More importantly, LipidIN integrates a Wide-spectrum Modeling Yield network that regenerates lipid fragment fingerprints, further improving accuracy and coverage with an estimated 20% boost in recall. We further demonstrate the utility of LipidIN in multiple lipid annotation and biomarker discovery tasks in clinical cohorts.