Zhao Yufeng, He Liyun, Xie Qi, Li Guozheng, Liu Baoyan, Wang Jian, Zhang Xiaoping, Zhang Xiang, Luo Lin, Li Kun, Jing Xianghong
Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China ; Key Laboratory of Advanced Information Science and Network Technology of Beijing, Beijing Jiaotong University, Beijing 100044, China.
Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China.
Evid Based Complement Alternat Med. 2015;2015:936290. doi: 10.1155/2015/936290. Epub 2015 Jun 9.
We analyze an AIDS dataset in which each patient is characterized by a list of symptoms and labeled with one or more traditional Chinese medicine (TCM) syndromes. The task is to build a classifier that maps symptoms to TCM syndromes. We use the minimum reference set-based multiple instance learning (MRS-MIL) method, which identifies a list of representative symptoms for each syndrome and builds a Gaussian mixture model from them. The models for all syndromes are then combined for classification via Bayes' rule. Because it relies on a subset of key symptoms, MRS-MIL can produce reliable, high-quality classification rules even on datasets with small sample sizes. On the AIDS dataset, it achieves an average precision of 0.7736 and an average recall of 0.7111, both higher than the results achieved by alternative methods.
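The classification step described above (one Gaussian mixture per syndrome, combined via Bayes' rule) can be illustrated with a minimal sketch. It is not the authors' implementation: the syndrome names, priors, key-symptom indices, and mixture parameters are hypothetical values chosen for illustration, the mixtures are assumed to have diagonal covariance, and the selection of representative symptoms (the minimum reference set) is assumed to have been done already. Normalizing likelihoods that are evaluated on different symptom subsets is also a simplifying assumption of this sketch.

```python
import math


def gmm_log_likelihood(x, weights, means, variances):
    """Log density of x under a diagonal-covariance Gaussian mixture."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            log_p += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
        component_logs.append(log_p)
    # Log-sum-exp over components for numerical stability.
    top = max(component_logs)
    return top + math.log(sum(math.exp(c - top) for c in component_logs))


def syndrome_posteriors(symptoms, models):
    """Bayes' rule: P(syndrome | symptoms) is proportional to prior * likelihood."""
    log_joint = {}
    for name, m in models.items():
        # Each syndrome is scored only on its own key symptoms.
        x = [symptoms[i] for i in m["key_symptoms"]]
        log_joint[name] = math.log(m["prior"]) + gmm_log_likelihood(
            x, m["weights"], m["means"], m["variances"]
        )
    top = max(log_joint.values())
    total = sum(math.exp(v - top) for v in log_joint.values())
    return {name: math.exp(v - top) / total for name, v in log_joint.items()}


# Hypothetical models: names and all parameters are illustrative only.
models = {
    "syndrome_A": {
        "prior": 0.5,
        "key_symptoms": [0, 1],
        "weights": [0.6, 0.4],
        "means": [[1.0, 1.0], [0.8, 0.2]],
        "variances": [[0.1, 0.1], [0.1, 0.1]],
    },
    "syndrome_B": {
        "prior": 0.5,
        "key_symptoms": [2, 3],
        "weights": [1.0],
        "means": [[1.0, 1.0]],
        "variances": [[0.1, 0.1]],
    },
}

# A patient encoded as four symptom scores (hypothetical encoding).
patient = [1.0, 1.0, 0.0, 0.0]
posteriors = syndrome_posteriors(patient, models)
predicted = max(posteriors, key=posteriors.get)
```

Here the patient's scores match the key symptoms of the hypothetical syndrome_A, so that syndrome receives nearly all of the posterior mass. Since patients may carry more than one syndrome, a multi-label variant could instead assign every syndrome whose posterior exceeds a chosen threshold; that threshold would be a further assumption.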