Professor, School of Management, Kyung Hee University, Seoul, Republic of Korea.
Assistant Professor, Graduate School of Global Development & Entrepreneurship, Handong Global University, Pohang, Republic of Korea.
J Healthc Eng. 2018 Oct 3;2018:7391793. doi: 10.1155/2018/7391793. eCollection 2018.
Maintaining a healthy environment is one of the significant issues in a smart city. To improve the environment, huge amounts of data are gathered, manipulated, analyzed, and utilized, and these data may contain noise, uncertainty, or unexpected mishandling. In some datasets, class imbalance, in which one class makes up a higher proportion of the data than another, skews the learning performance of classification algorithms. In this paper, we propose a hybrid case-based reasoning method that combines crowd knowledge extracted from open source data (e.g., a Google search or another publicly accessible source) with collective knowledge (i.e., case-based reasoning). The method mitigates class imbalance in datasets used to diagnose wellness levels in patients suffering from stress or depression. The proposed hybrid reasoning method performs better than traditional methods (e.g., SMO, BayesNet, IBk, Logistic, C4.5, and crowd reasoning). We also demonstrate that using open source and big data in addition to conventional classification algorithms improves classification performance.
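The abstract does not spell out how the case-based and crowd-derived evidence are combined, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that case retrieval is a k-nearest-neighbor lookup, that class imbalance is countered by weighting each retrieved case's vote by the inverse frequency of its class, and that crowd knowledge arrives as a per-class score that is blended linearly with the case-based score. The function names, the `crowd_scores` dictionary, the blending weight `alpha`, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def retrieve_neighbors(cases, query, k):
    """Return the k stored cases closest to the query (Euclidean distance)."""
    return sorted(cases, key=lambda case: math.dist(case[0], query))[:k]


def case_based_scores(cases, query, k=5):
    """Score each class from the retrieved cases.

    Each retrieved case votes with weight 1 / (size of its class), an assumed
    imbalance correction so that a rare class is not simply outvoted.
    """
    class_counts = Counter(label for _, label in cases)
    scores = {label: 0.0 for label in class_counts}
    for _, label in retrieve_neighbors(cases, query, k):
        scores[label] += 1.0 / class_counts[label]
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}


def hybrid_predict(cases, query, crowd_scores, k=5, alpha=0.5):
    """Blend case-based scores with externally supplied crowd scores.

    alpha is a hypothetical mixing weight: 1.0 uses only the case base,
    0.0 uses only the crowd scores.
    """
    cbr = case_based_scores(cases, query, k)
    blended = {
        label: alpha * cbr[label] + (1 - alpha) * crowd_scores.get(label, 0.0)
        for label in cbr
    }
    return max(blended, key=blended.get), blended


# Toy case base (invented for illustration): six "normal" cases, two "stressed".
cases = [
    ((0.1, 0.2), "normal"), ((0.2, 0.1), "normal"), ((0.3, 0.3), "normal"),
    ((0.2, 0.4), "normal"), ((0.4, 0.2), "normal"), ((0.5, 0.5), "normal"),
    ((0.8, 0.9), "stressed"), ((0.9, 0.8), "stressed"),
]
query = (0.7, 0.7)

# Hypothetical crowd-derived class scores, e.g. normalized from search results.
crowd_scores = {"normal": 0.4, "stressed": 0.6}

cbr = case_based_scores(cases, query, k=5)
label, blended = hybrid_predict(cases, query, crowd_scores, k=5, alpha=0.5)
print(label, blended)
```

In this toy example the five nearest cases contain three "normal" and two "stressed" cases, so an unweighted majority vote would pick the majority class. The inverse-frequency weighting is what lets the minority class win the case-based vote before the crowd scores are blended in.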