Laboratory of Information Access and Synthesis of TCM Four Diagnosis, Basic Medical College, Shanghai University of Traditional Chinese Medicine, Shanghai 201203, China.
BMC Complement Altern Med. 2010 Jul 20;10:37. doi: 10.1186/1472-6882-10-37.
Coronary heart disease (CHD) is a common cardiovascular disease that poses a serious threat to human health. Traditional Chinese Medicine (TCM) has a long history of, and ample experience in, diagnosing and treating CHD. However, non-standardized inquiry information affects diagnosis and treatment in TCM to a certain extent. In this paper, we study the standardization of inquiry information in the diagnosis of CHD and design a diagnostic model to provide a methodological reference for constructing quantitative diagnosis of CHD syndromes. In the TCM diagnosis of CHD, one patient may present several syndrome patterns at once, whereas conventional single-label data mining techniques can build only one model at a time. Here, a novel multi-label learning (MLL) technique is explored to solve this problem.
A standardized scale for inquiry diagnosis of CHD in TCM is designed, and the inquiry diagnostic model is constructed from the collected data using MLL techniques. In this study, one popular MLL algorithm, ML-kNN, is compared with two other MLL algorithms, RankSVM and BPMLL, as well as one commonly used single-label learning algorithm, the k-nearest neighbour (kNN) algorithm. Furthermore, the influence of symptom selection on the diagnostic model is investigated: symptoms are removed in order of frequency, from low to high, and the diagnostic models are constructed on the remaining symptom subsets.
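To make the multi-label setting concrete, the following is a minimal sketch of the general ML-kNN idea: for each syndrome label, count how many of a case's k nearest neighbours carry that label, then apply a maximum a posteriori rule with smoothed priors and likelihoods. It is an illustration under stated assumptions, not the implementation used in the study: symptoms are assumed to be encoded as numeric vectors, Euclidean distance is assumed, and the function names, `k` and the smoothing constant `s` are hypothetical choices.

```python
from math import dist  # Euclidean distance, Python 3.8+


def neighbour_label_counts(x, X_train, Y_train, k, skip=None):
    """For each label, count how many of the k nearest training cases carry it."""
    nearest = sorted(
        (i for i in range(len(X_train)) if i != skip),
        key=lambda i: dist(x, X_train[i]),
    )[:k]
    n_labels = len(Y_train[0])
    return [sum(Y_train[i][l] for i in nearest) for l in range(n_labels)]


def fit_mlknn(X, Y, k=3, s=1.0):
    """Estimate per-label priors and likelihoods (s is the smoothing constant)."""
    m, n_labels = len(X), len(Y[0])
    prior = [(s + sum(row[l] for row in Y)) / (2 * s + m) for l in range(n_labels)]
    # hit[l][j]:  cases WITH label l whose neighbours include exactly j cases with label l
    # miss[l][j]: cases WITHOUT label l whose neighbours include exactly j such cases
    hit = [[0] * (k + 1) for _ in range(n_labels)]
    miss = [[0] * (k + 1) for _ in range(n_labels)]
    for i in range(m):
        counts = neighbour_label_counts(X[i], X, Y, k, skip=i)
        for l in range(n_labels):
            table = hit if Y[i][l] else miss
            table[l][counts[l]] += 1
    like1 = [[(s + c) / (s * (k + 1) + sum(row)) for c in row] for row in hit]
    like0 = [[(s + c) / (s * (k + 1) + sum(row)) for c in row] for row in miss]
    return {"X": X, "Y": Y, "k": k, "prior": prior, "like1": like1, "like0": like0}


def predict_mlknn(model, x):
    """Return a 0/1 vector with one entry per syndrome label."""
    counts = neighbour_label_counts(x, model["X"], model["Y"], model["k"])
    labels = []
    for l, c in enumerate(counts):
        p_present = model["prior"][l] * model["like1"][l][c]
        p_absent = (1 - model["prior"][l]) * model["like0"][l][c]
        labels.append(1 if p_present > p_absent else 0)
    return labels
```

Because every label is decided from the same neighbour set, a single call to `predict_mlknn` can assign several syndromes to one patient, which is the property that separates this setting from training one single-label kNN model per syndrome.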
A total of 555 cases are collected for modelling the inquiry diagnosis of CHD. The patients are diagnosed clinically by combining inspection, pulse feeling, palpation and the standardized inquiry information. Models of six syndromes are constructed by ML-kNN, RankSVM, BPMLL and kNN, with mean diagnostic accuracies of 77%, 71%, 75% and 74%, respectively. After low-frequency symptoms are removed, the mean accuracies of ML-kNN, RankSVM, BPMLL and kNN reach 78%, 73%, 75% and 76%, respectively, when 52 symptoms remain.
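The frequency-based symptom removal can be sketched as follows. This is a hedged illustration only: it assumes each case is a 0/1 vector with one entry per symptom, and the function name `keep_most_frequent` and its tie-breaking rule (earlier columns win) are hypothetical rather than taken from the study.

```python
def keep_most_frequent(X, n_keep):
    """Keep the n_keep most frequent symptom columns of a 0/1 case matrix.

    Returns (kept column indices in original order, reduced matrix).
    """
    n_symptoms = len(X[0])
    # Frequency of a symptom = number of cases in which it is present
    freq = [sum(row[j] for row in X) for j in range(n_symptoms)]
    # Rank by descending frequency; sorted() is stable, so ties keep column order
    ranked = sorted(range(n_symptoms), key=lambda j: -freq[j])
    kept = sorted(ranked[:n_keep])
    reduced = [[row[j] for j in kept] for row in X]
    return kept, reduced
```

Calling this with decreasing values of `n_keep` and refitting a model on each reduced matrix would reproduce the general shape of the experiment, in which the least frequent symptoms are dropped first.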
The novel MLL techniques facilitate building standardized inquiry models for CHD diagnosis and offer a practical approach to labelling multiple syndromes simultaneously.