Jang Jun-Su, Ku Boncho, Kim Young-Su, Nam Jiho, Kim Keun Ho, Kim Jong Yeol
Medical Engineering R&D Group, Medical Research Division, Korea Institute of Oriental Medicine, 1672 Yuseongdae-ro, Yuseong-gu, Daejeon 305-811, Republic of Korea.
BMC Complement Altern Med. 2013 Nov 7;13:307. doi: 10.1186/1472-6882-13-307.
Sasang constitutional medicine (SCM) is a form of tailored medicine that classifies people into four Sasang constitutional (SC) types. Accurate diagnosis of the SC type is crucial to proper treatment in SCM, and voice characteristics have long served as an essential diagnostic clue. Many earlier studies tried to extract quantitative vocal features to build diagnosis models; however, they were limited by data collected from only one or a few sites, long recording times, and low accuracy. We propose a practical diagnosis model that uses only a few variables, which reduces model complexity and, in turn, makes the model suitable for clinical applications.
Voice recordings from a total of 2,341 participants were used to build an SC classification model and to test its generalization ability. Although the voice data comprised five vowels and two repeated sentences per participant, we used only the sentence recordings in this study. A total of 21 features were extracted, and a feature selection method, the least absolute shrinkage and selection operator (LASSO), was applied to reduce the number of variables used for classifier learning. The SC classification model was developed using multinomial logistic regression with LASSO.
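The combination of multinomial logistic regression and LASSO described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which solver or software was used, so the proximal gradient optimizer, the function names, the penalty value `lam`, and the synthetic six-feature data set (two informative features, four noise features, four classes) are all assumptions chosen for illustration. The point it shows is how the L1 penalty drives some coefficients to exactly zero, which is what reduces the number of variables in the model.

```python
import math
import random


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink toward zero, clip at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(weights, biases, x):
    scores = [b + sum(w * xi for w, xi in zip(row, x))
              for row, b in zip(weights, biases)]
    return softmax(scores)


def fit_lasso_multinomial(X, y, n_classes, lam, step=0.1, n_iter=200):
    """L1-penalized multinomial logistic regression via proximal gradient descent.

    Assumption: this solver is for illustration only; biases are not penalized.
    """
    n, d = len(X), len(X[0])
    weights = [[0.0] * d for _ in range(n_classes)]
    biases = [0.0] * n_classes
    for _ in range(n_iter):
        grad_w = [[0.0] * d for _ in range(n_classes)]
        grad_b = [0.0] * n_classes
        for x, label in zip(X, y):
            probs = predict_proba(weights, biases, x)
            for k in range(n_classes):
                err = probs[k] - (1.0 if k == label else 0.0)
                grad_b[k] += err / n
                for j in range(d):
                    grad_w[k][j] += err * x[j] / n
        for k in range(n_classes):
            biases[k] -= step * grad_b[k]
            for j in range(d):
                # Gradient step on the log-loss, then L1 shrinkage.
                weights[k][j] = soft_threshold(
                    weights[k][j] - step * grad_w[k][j], step * lam)
    return weights, biases


def accuracy(weights, biases, X, y):
    hits = 0
    for x, label in zip(X, y):
        probs = predict_proba(weights, biases, x)
        hits += probs.index(max(probs)) == label
    return hits / len(X)


def make_synthetic_data(per_class=30, n_noise=4, seed=0):
    """Hypothetical data: 2 informative features plus noise, 4 classes."""
    rng = random.Random(seed)
    centers = [(2.0, 0.0), (-2.0, 0.0), (0.0, 2.0), (0.0, -2.0)]
    X, y = [], []
    for label, (cx, cy) in enumerate(centers):
        for _ in range(per_class):
            informative = [cx + rng.gauss(0, 1), cy + rng.gauss(0, 1)]
            noise = [rng.gauss(0, 1) for _ in range(n_noise)]
            X.append(informative + noise)
            y.append(label)
    return X, y


X, y = make_synthetic_data()
weights, biases = fit_lasso_multinomial(X, y, n_classes=4, lam=0.05)
nonzero = sum(1 for row in weights for w in row if w != 0.0)
print("non-zero coefficients:", nonzero, "of", 4 * len(X[0]))
print("training accuracy:", accuracy(weights, biases, X, y))
```

Raising `lam` removes more coefficients at the cost of fit, and a sufficiently large value zeroes all of them. In practice the penalty would normally be chosen by cross-validation, and accuracy would be reported on a held-out test set rather than on the training data as in this sketch.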
We compared the proposed classification model with that of a previous study, which used both the sentences and the five vowels from the same patient group. The classification accuracies on the test set were 47.9% for male participants and 40.4% for female participants. The proposed method was superior to the previous one in that it required shorter voice recordings, was more applicable to practical use, and generalized better.
We proposed a practical SC classification method and showed that our model, which has fewer variables, outperformed a model with many variables in the generalization test. We reduced the number of variables in two ways: 1) we decreased the initial number of candidate features by using shorter voice recordings, and 2) we introduced LASSO to reduce model complexity. The proposed method is suitable for an actual clinical environment. Moreover, we expect it to yield more stable results because of the model's simplicity.