School of Chemistry, Chemical Engineering and Biotechnology, Nanyang Technological University, 21 Nanyang Link, Nanyang, 637371, Singapore.
School of Chemical and Material Engineering, Jiangnan University, Wuxi, 214122, People's Republic of China.
Nat Commun. 2024 Mar 22;15(1):2582. doi: 10.1038/s41467-024-46838-z.
Achieving untargeted chemical identification, isomeric differentiation, and quantification is critical to most scientific and technological problems but remains challenging. Here, we demonstrate an integrated SERS-based chemical taxonomy machine learning framework for untargeted structural elucidation of 11 epimeric cerebrosides, attaining >90% accuracy and robust single-epimer and multiplex quantification with <10% errors. First, we utilize 4-mercaptophenylboronic acid to selectively capture the epimers at molecular sites of isomerism to form epimer-specific SERS fingerprints. Corroborating with in-silico experiments, we establish five spectral features, each corresponding to a structural characteristic: (1) presence/absence of epimers, (2) monosaccharide/cerebroside, (3) saturated/unsaturated cerebroside, (4) glucosyl/galactosyl, and (5) carbon chain length of GlcCer or GalCer. Leveraging these insights, we create a fully generalizable framework to identify and quantify cerebrosides at concentrations between 10 to 10M and achieve multiplex quantification of binary mixtures containing the biomarkers GlcCer and GalCer using their untrained spectra in the models.