Health Information Management Research Center, Kashan University of Medical Sciences, Kashan, Iran.
Department of Health Information Management and Technology, Allied Medical Sciences Faculty, Kashan University of Medical Sciences, Kashan, Iran.
BMC Med Res Methodol. 2024 Feb 16;24(1):40. doi: 10.1186/s12874-024-02154-0.
Data mining has been used to help discover frequent patterns in health data. It is widely used to diagnose and prevent various diseases and to identify the causes and factors affecting them. The aim of the present study was therefore to discover frequent patterns in Kashan Trauma Registry data using a new method.
We used real data from the Kashan Trauma Registry. After pre-processing, frequent patterns and association rules were extracted with both the classical Apriori algorithm and the new method. The new method, implemented in Python, calculates the minimum support automatically from the weights of the variables and the harmonic mean.
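The abstract does not state the exact formula, so the following is only a minimal sketch of how a minimum support could be derived automatically. It assumes the threshold is the harmonic mean of the weighted supports of the single items. The function names, the `weights` dictionary, and the toy transactions are hypothetical and are not taken from the study.

```python
from collections import Counter
from statistics import harmonic_mean


def item_supports(transactions):
    """Return the relative support of every single item."""
    n = len(transactions)
    counts = Counter(item for t in transactions for item in set(t))
    return {item: count / n for item, count in counts.items()}


def auto_min_support(supports, weights):
    """Harmonic mean of weighted supports.

    Assumed formula for illustration only; items without an
    explicit weight default to 1.0. All values must be positive.
    """
    weighted = [supports[item] * weights.get(item, 1.0) for item in supports]
    return harmonic_mean(weighted)


# Toy transactions (hypothetical, not registry data).
transactions = [
    {"elderly", "fall", "home"},
    {"motorcycle", "head_injury", "street"},
    {"elderly", "fall", "street"},
    {"motorcycle", "street"},
]

supports = item_supports(transactions)
min_support = auto_min_support(supports, weights={})
print(f"automatic minimum support: {min_support:.4f}")
```

Because the harmonic mean is pulled toward the smaller values, a threshold computed this way tends to sit below the arithmetic mean of the supports, so rare items do not have to be excluded by a manually guessed cut-off.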
The results showed that the new method generates the minimum support dynamically, level by level, from the feature weights, whereas in the classical Apriori algorithm the user manually sets a single minimum support value. The new method also outperformed the classical Apriori algorithm in memory consumption, execution time, the number of frequent patterns found, and the rules generated.
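The contrast between a single user-supplied threshold and a threshold recomputed at each level can be sketched as follows. This is an illustrative level-wise Apriori loop, not the authors' implementation: `apriori_levelwise` and `threshold_fn` are hypothetical names, the per-level harmonic mean is an assumed rule, variable weights are omitted for brevity, and the transactions are toy data.

```python
from itertools import combinations
from statistics import harmonic_mean


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def apriori_levelwise(transactions, threshold_fn):
    """Level-wise Apriori where threshold_fn picks the minimum support per level.

    threshold_fn receives the list of positive candidate supports at the
    current level and returns the minimum support for that level.
    """
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while candidates:
        sups = {c: support(c, transactions) for c in candidates}
        # Drop zero-support candidates: harmonic_mean is 0 if any value is 0.
        sups = {c: s for c, s in sups.items() if s > 0}
        if not sups:
            break
        min_sup = threshold_fn(list(sups.values()))
        level = {c: s for c, s in sups.items() if s >= min_sup}
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        joined = {a | b for a, b in combinations(level, 2) if len(a | b) == k}
        # Prune step: every (k-1)-subset must be frequent at the previous level.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent


# Toy transactions (hypothetical, not registry data).
transactions = [
    {"elderly", "fall", "home"},
    {"motorcycle", "head_injury", "street"},
    {"elderly", "fall", "street"},
    {"motorcycle", "street"},
]

# Classical setting: one fixed value chosen by the user.
fixed = apriori_levelwise(transactions, lambda sups: 0.5)
# Dynamic setting: threshold recomputed from the candidates at each level.
dynamic = apriori_levelwise(transactions, harmonic_mean)
```

In the fixed setting the user has to guess a workable value in advance. In the dynamic setting the threshold adapts to the support distribution at each level, which is the behaviour the abstract attributes to the new method.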
This study found that manually determining the minimum support increases execution time and memory usage, which is not cost-effective, especially when the user does not know the dataset's content. In trauma registries and massive healthcare datasets, the new method's ability to uncover frequent itemsets and association rules provides valuable insights. Based on the patterns found in the trauma data, the following are also recommended: family care for the elderly, public education on how to assist accident victims and transport them to the hospital, and education for motorcyclists on observing safety measures when riding.