Sun Wen, Bai Minghua, Wang Ji, Wang Bei, Liu Yixing, Wang Qi, Han Dongran
School of Management, Beijing University of Chinese Medicine, Beijing, 100029, China.
School of Traditional Chinese Medicine/National Institute of TCM Constitution and Preventive Medicine, Beijing University of Chinese Medicine, Beijing, 100029, China.
Chin Med. 2024 Sep 15;19(1):127. doi: 10.1186/s13020-024-00992-0.
The aim of this study was to develop a machine learning-assisted method for the rapid determination of traditional Chinese medicine constitution. The work is based on the Constitution in Chinese Medicine Questionnaire (CCMQ), the most widely used diagnostic instrument for assessing an individual's constitution. To select the items that best predict body constitution (BC) classifications or BC scores, we applied an automated supervised machine learning tool (the Tree-based Pipeline Optimization Tool; TPOT) to all possible item combinations within each subscale, and an unsupervised machine learning algorithm (variable clustering; varclus) to the whole scale. Using the item subsets selected with TPOT and their corresponding machine learning algorithms, the accuracy of BC classification prediction ranged from 0.819 to 0.936, and the root mean square error of BC score prediction ranged from 6.241 to 9.877. Overall, the results suggested that the automated machine learning algorithms performed better than the varclus algorithm for item selection. In addition, based on the automated machine learning item selection procedure, we provide the three top-ranked item combinations for each possible subscale length, along with the corresponding algorithms for predicting BC classification and severity. This approach could accommodate the needs of different traditional Chinese medicine practitioners for rapid constitution determination.
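The exhaustive search over item combinations described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the study fits TPOT-selected models, whereas this sketch substitutes a fixed, parameter-free predictor (the rescaled sum of the retained items) so that it runs with the Python standard library alone. The 1-5 response range, the 0-100 score transform, the simulated data, and every function name are assumptions made for illustration.

```python
import itertools
import math
import random


def transformed_score(item_responses):
    """Rescale 1-5 Likert responses to a 0-100 score (assumed transform)."""
    n = len(item_responses)
    return (sum(item_responses) - n) / (n * 4) * 100


def rmse(predicted, actual):
    """Root mean square error between two equal-length sequences."""
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


def rank_item_subsets(responses, subset_size, top=3):
    """Score every item combination of a given size and keep the best few.

    Each subset predicts the full-subscale score by its own rescaled sum,
    a simple stand-in for a fitted model. Returns (rmse, item_indices)
    pairs sorted from lowest to highest error.
    """
    n_items = len(responses[0])
    full_scores = [transformed_score(r) for r in responses]
    ranked = []
    for items in itertools.combinations(range(n_items), subset_size):
        preds = [transformed_score([r[i] for i in items]) for r in responses]
        ranked.append((rmse(preds, full_scores), items))
    ranked.sort()
    return ranked[:top]


def simulate_responses(n_people=200, n_items=8, seed=0):
    """Generate synthetic 1-5 responses driven by one latent trait."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_people):
        trait = rng.uniform(1, 5)
        row = [min(5, max(1, round(trait + rng.gauss(0, 0.8))))
               for _ in range(n_items)]
        data.append(row)
    return data


if __name__ == "__main__":
    data = simulate_responses()
    for size in (2, 4, 6):
        for error, items in rank_item_subsets(data, size):
            print(size, items, round(error, 3))
```

Because the stand-in predictor has no fitted parameters, no train/test split is needed here. With a learned model, as in the study, each combination would instead be evaluated on held-out data.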