Medical Informatics Research Center, Ben Gurion University, Beer Sheva, Israel.
J Eval Clin Pract. 2012 Apr;18(2):378-88. doi: 10.1111/j.1365-2753.2010.01592.x. Epub 2010 Dec 19.
To test the feasibility of classifying emergency department patients into severity grades using data mining methods.
Emergency department records of 402 patients were classified into five severity grades by two expert physicians. The Naïve Bayes and C4.5 algorithms were applied to produce classifiers that map patient data to severity grades. The classifiers' results over several subsets of the data were compared with the physicians' assessments, with a random classifier, and with a classifier that always selects the maximal-prevalence class.
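The two reference classifiers mentioned above can be illustrated with a minimal sketch. The abstract does not say how the random classifier draws its labels; the version below assumes it samples grades according to their prevalence in the data, which is only one possible reading. The function names and the sample grades are hypothetical and are not taken from the study.

```python
from collections import Counter


def class_prevalence(labels):
    """Return the fraction of records that fall in each severity grade."""
    counts = Counter(labels)
    total = len(labels)
    return {grade: n / total for grade, n in counts.items()}


def majority_baseline_accuracy(labels):
    """Accuracy of a classifier that always predicts the most prevalent grade."""
    return max(class_prevalence(labels).values())


def random_baseline_expected_accuracy(labels):
    """Expected accuracy of a classifier that draws each prediction at random
    with probability equal to the grade's prevalence (an assumption; the
    source does not define its random classifier)."""
    return sum(p * p for p in class_prevalence(labels).values())


# Made-up grades for illustration only, not data from the study.
example_grades = [1, 2, 2, 3, 3, 3, 3, 4, 4, 5]
majority_acc = majority_baseline_accuracy(example_grades)
random_acc = random_baseline_expected_accuracy(example_grades)
```

A learned classifier is only informative if it beats both numbers; on unbalanced data the maximal-prevalence baseline is usually the harder one to beat.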
The outcome measures were positive predictive value, multiple-class extensions of sensitivity and specificity combinations, and entropy change.
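The abstract does not define how entropy change was computed. One common reading, sketched below as an assumption, compares the Shannon entropy of the physicians' grades before classification with the prevalence-weighted entropy of those grades within each predicted class. All function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def shannon_entropy(labels):
    """Shannon entropy, in bits, of the empirical label distribution."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def conditional_entropy(true_labels, predicted_labels):
    """Weighted mean entropy of the true labels within each predicted class."""
    groups = defaultdict(list)
    for true, pred in zip(true_labels, predicted_labels):
        groups[pred].append(true)
    total = len(true_labels)
    return sum(len(g) / total * shannon_entropy(g) for g in groups.values())


def relative_entropy_reduction(true_labels, predicted_labels):
    """Fraction of the input entropy removed by the classification.

    Returns 0.0 when the input has no entropy to remove.
    """
    before = shannon_entropy(true_labels)
    if before == 0:
        return 0.0
    after = conditional_entropy(true_labels, predicted_labels)
    return (before - after) / before
```

Under this definition a perfect classifier removes all of the entropy, and a classifier that assigns every patient to the same grade removes none.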
The mean accuracy of the data mining classifiers was 52.94 ± 5.89%, significantly better (P < 0.05) than the mean accuracy of a random classifier (34.60 ± 2.40%). Classification reduced the entropy of the input data sets by a mean of 10.1%. Allowing for classification deviations of one severity grade led to a mean accuracy of 85.42 ± 1.42%. The classifiers' accuracy in that case was similar to the physicians' consensus rate. Learning from consensus records led to better performance. Reducing the number of severity grades improved results in certain cases. The performance of the Naïve Bayes and C4.5 algorithms was similar; in unbalanced data sets, Naïve Bayes performed better.
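Accuracy that tolerates a deviation of one severity grade can be computed as in the following sketch. It assumes grades are coded as consecutive integers; the function name and the sample values are hypothetical and are not drawn from the study.

```python
def accuracy_within(true_grades, predicted_grades, tolerance=0):
    """Fraction of predictions within `tolerance` grades of the true grade.

    tolerance=0 gives exact accuracy; tolerance=1 also counts predictions
    that are off by one severity grade.
    """
    if len(true_grades) != len(predicted_grades):
        raise ValueError("grade lists must have the same length")
    if not true_grades:
        raise ValueError("grade lists must not be empty")
    hits = sum(abs(t - p) <= tolerance
               for t, p in zip(true_grades, predicted_grades))
    return hits / len(true_grades)


# Made-up grades for illustration only.
true_example = [1, 2, 3, 4, 5]
pred_example = [1, 3, 5, 4, 4]
exact_acc = accuracy_within(true_example, pred_example)
near_acc = accuracy_within(true_example, pred_example, tolerance=1)
```

In this toy example exact accuracy is 0.4 and accuracy within one grade is 0.8, which shows how much of the error can come from adjacent-grade confusion.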
It is possible to produce a computerized classification model for the severity grade of triage patients using data mining methods. Learning from patient records on which several physicians agree is preferable to learning from each physician's own patients. Either Naïve Bayes or C4.5 can be used; Naïve Bayes is preferable for unbalanced data sets. An ambiguity in the intermediate severity grades seems to hamper both the physicians' agreement and the classifiers' accuracy.