Sha Yuyang, Jiang Meiting, Luo Gang, Meng Weiyu, Zhai Xiaobing, Pan Hongxin, Li Junrong, Yan Yan, Qiao Yongkang, Yang Wenzhi, Li Kefeng
Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macau, China.
National Key Laboratory of Chinese Medicine Modernization, State Key Laboratory of Component-based Chinese Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin, China.
Phytochem Anal. 2025 Jan;36(1):261-272. doi: 10.1002/pca.3437. Epub 2024 Aug 21.
Chinese herbal medicines have been utilized for thousands of years to prevent and treat diseases. Accurate identification is crucial since their medicinal effects vary between species and varieties. Metabolomics is a promising approach to distinguish herbs. However, current metabolomics data analysis and modeling in Chinese herbal medicines are limited by small sample sizes, high dimensionality, and overfitting.
This study aims to use metabolomics data to develop HerbMet, a high-performance artificial intelligence system for accurately identifying Chinese herbal medicines, particularly those from different species of the same genus.
We propose HerbMet, an AI-based system for accurately identifying Chinese herbal medicines. HerbMet employs a 1D-ResNet architecture to extract discriminative features from input samples and uses a multilayer perceptron for classification. Additionally, we design a double dropout regularization module to alleviate overfitting and improve the model's performance.
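The pipeline described above (1D residual feature extraction, an MLP classifier, and dropout in two places) can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the single residual block, the kernel size of 3, the layer widths, and in particular the placement of the two dropout layers (one after the feature extractor, one inside the classifier) are assumptions made for illustration, and all function names are hypothetical.

```python
import random

def conv1d_same(x, kernel):
    """1D cross-correlation with zero padding; output has the same length as x."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(k * padded[i + j] for j, k in enumerate(kernel))
            for i in range(len(x))]

def relu(x):
    return [max(0.0, v) for v in x]

def dropout(x, p, rng, training):
    """Inverted dropout: zero each value with probability p, rescale the rest."""
    if not training or p == 0.0:
        return list(x)
    return [0.0 if rng.random() < p else v / (1.0 - p) for v in x]

def residual_block(x, k1, k2):
    """y = ReLU(x + conv(ReLU(conv(x)))); the skip connection adds x back."""
    h = relu(conv1d_same(x, k1))
    h = conv1d_same(h, k2)
    return relu([a + b for a, b in zip(x, h)])

def linear(x, weights, bias):
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def forward(x, params, rng, training=False, p_feat=0.5, p_cls=0.5):
    """Feature extractor -> dropout -> MLP hidden layer -> dropout -> logits."""
    feats = residual_block(x, params["k1"], params["k2"])
    feats = dropout(feats, p_feat, rng, training)    # first dropout (assumed placement)
    hidden = relu(linear(feats, params["w1"], params["b1"]))
    hidden = dropout(hidden, p_cls, rng, training)   # second dropout (assumed placement)
    return linear(hidden, params["w2"], params["b2"])

def init_params(n_features, n_hidden, n_classes, rng):
    """Small random weights; sizes are illustrative, not taken from the paper."""
    u = lambda: rng.uniform(-0.1, 0.1)
    return {
        "k1": [u() for _ in range(3)],
        "k2": [u() for _ in range(3)],
        "w1": [[u() for _ in range(n_features)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [[u() for _ in range(n_hidden)] for _ in range(n_classes)],
        "b2": [0.0] * n_classes,
    }

# Usage: one synthetic 12-feature "metabolite profile", 7 output classes.
rng = random.Random(0)
params = init_params(n_features=12, n_hidden=8, n_classes=7, rng=rng)
sample = [rng.random() for _ in range(12)]
logits = forward(sample, params, rng, training=False)
predicted_class = max(range(len(logits)), key=logits.__getitem__)
```

Dropout is active only when `training=True`, so inference is deterministic. A practical model would stack several residual blocks with multiple channels and learn the weights by gradient descent, which this sketch omits.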
Compared with 10 commonly used machine learning and deep learning methods, HerbMet achieves superior accuracy and robustness, with an accuracy of 0.9571 and an F1-score of 0.9542 for distinguishing seven similar Panax ginseng species. After feature selection using 25 different feature ranking techniques in combination with prior knowledge, both the accuracy and the F1-score reached 100% for discriminating P. ginseng species. Furthermore, HerbMet exhibits acceptable inference speed and computational costs compared to existing approaches on both CPU and GPU.
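For readers unfamiliar with the two reported metrics, the following sketch shows how accuracy and a multi-class F1-score can be computed from predicted labels. The abstract does not state which F1 averaging was used; macro averaging (the unweighted mean of per-class F1) is assumed here, and the labels in the usage lines are made-up toy data, not results from the study.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Toy example with three classes (illustrative labels only).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]
acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
```

Macro averaging weights every class equally regardless of how many samples it has, so it can differ noticeably from accuracy when classes are imbalanced.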
HerbMet surpasses existing solutions for identifying Chinese herbal medicine species. It is simple to use in real-world scenarios because it eliminates the feature ranking and selection steps required by classical machine learning-based methods.