Department of Industrial Engineering, University of Torbat Heydarieh, Torbat Heydarieh, Iran.
Sci Rep. 2024 Aug 2;14(1):17956. doi: 10.1038/s41598-024-69029-8.
Disease symptoms vary among individuals and may go undetected in the early stages. Detecting them early is crucial for effectively managing and treating cases of varying severity. Machine learning has advanced considerably in recent years and has proved effective in a range of healthcare applications. This study aims to identify symptom patterns and general rules relating symptoms among patients using supervised and unsupervised machine learning. A rule-based machine learning technique is integrated with classification methods to extend a prediction model. The study analyzes publicly available patient data from the Kaggle repository. After the data were preprocessed and descriptive statistics explored, the Apriori algorithm was applied to identify frequent symptoms and patterns in the discovered rules. Several predictive machine learning models were then applied to the dataset to predict diseases: stepwise regression, support vector machine, bootstrap forest, boosted trees, and neural-boosted methods. Cross-validation of each model against established criteria showed that the stepwise fitting method outperformed all other models in this study. Moreover, numerous significant decision rules were extracted, which can streamline clinical applications without the need for additional expertise. These rules enable the prediction of relationships between symptoms and diseases, as well as between different diseases. The results therefore have the potential to improve the performance of prediction models, and they show that disease symptoms and general rules can be discovered from the dataset using supervised and unsupervised machine learning.
Overall, the proposed algorithm can support not only healthcare professionals but also patients who face cost and time constraints in diagnosing and treating these diseases.
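The rule-mining step described above can be illustrated with a minimal sketch of the Apriori algorithm in plain Python. This is not the authors' implementation: the function names (`apriori`, `association_rules`), the toy symptom records, and the support and confidence thresholds are all hypothetical and chosen only to show how frequent symptom sets and "if symptom A then symptom B" rules are derived.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of records that contain every item in the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    current = {s for s in current if support(s) >= min_support}

    frequent = {}
    k = 1
    while current:
        for s in current:
            frequent[s] = support(s)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, supp in frequent.items():
        if len(itemset) < 2:
            continue
        for r in range(1, len(itemset)):
            for antecedent in combinations(itemset, r):
                antecedent = frozenset(antecedent)
                # confidence(A -> B) = support(A and B) / support(A)
                confidence = supp / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, supp, confidence))
    return rules


# Hypothetical toy records; each set lists the symptoms of one patient.
records = [
    {"fever", "cough", "fatigue"},
    {"fever", "cough"},
    {"fever", "headache"},
    {"cough", "fatigue"},
    {"fever", "cough", "headache"},
]

frequent = apriori(records, min_support=0.4)          # assumed threshold
rules = association_rules(frequent, min_confidence=0.9)  # assumed threshold

for antecedent, consequent, supp, conf in rules:
    print(sorted(antecedent), "->", sorted(consequent),
          f"support={supp:.2f} confidence={conf:.2f}")
```

On real data the thresholds determine how many rules survive, so they would need to be tuned to the dataset; a maintained library implementation would normally be preferred over this sketch for anything beyond illustration.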