Pontifícia Universidade Católica do Paraná, Curitiba, PR, Brasil.
Rev Saude Publica. 2010 Apr;44(2):292-300. doi: 10.1590/s0034-89102010000200009.
To identify, using computational techniques, rules based on physical environment conditions for classifying risk micro-areas.
Exploratory research was carried out in Curitiba, Southern Brazil, in 2007. It was divided into three phases: identification of the attributes used to classify a micro-area; construction of a database; and knowledge discovery in the database through data mining. The set of attributes covered infrastructure conditions; hydrography; soil; recreation areas; community characteristics; and the presence of vectors. The database was built from data collected by community health workers in interviews, using questionnaires with closed-ended questions based on the essential attributes selected by specialists.
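The abstract does not name the data mining algorithm, so the following is only a minimal sketch of how if-then rules could be derived from closed-ended questionnaire answers. It assumes a simple single-condition, majority-class approach in the style of OneR as a stand-in, not the authors' method. The attribute names (`sewage`, `flood_prone`, `vectors`), the answers, and the `risk` labels are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical questionnaire records: attribute names, answers and risk
# labels are invented for illustration and do not come from the study.
records = [
    {"sewage": "no", "flood_prone": "yes", "vectors": "yes", "risk": "high"},
    {"sewage": "no", "flood_prone": "yes", "vectors": "no", "risk": "high"},
    {"sewage": "yes", "flood_prone": "no", "vectors": "no", "risk": "low"},
    {"sewage": "yes", "flood_prone": "no", "vectors": "yes", "risk": "low"},
    {"sewage": "yes", "flood_prone": "yes", "vectors": "yes", "risk": "high"},
]


def single_condition_rules(rows, target="risk"):
    """Build one rule per (attribute, answer) pair, predicting its majority class.

    Returns tuples (attribute, answer, predicted_class, hits, covered), where
    `covered` counts the records matching the condition and `hits` counts
    those that also carry the predicted class.
    """
    counts = defaultdict(Counter)
    for row in rows:
        for attr, value in row.items():
            if attr != target:
                counts[(attr, value)][row[target]] += 1

    rules = []
    for (attr, value), class_counts in sorted(counts.items()):
        label, hits = class_counts.most_common(1)[0]
        rules.append((attr, value, label, hits, sum(class_counts.values())))
    return rules


rules = single_condition_rules(records)
for attr, value, label, hits, covered in rules:
    print(f"IF {attr} = {value} THEN risk = {label} ({hits}/{covered})")
```

Each printed line has the if-then form that a health team could read directly. A real study would use many more records and would typically allow rules with several conditions.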
A total of 49 attributes were identified, of which 41 were considered essential and eight irrelevant. Data mining yielded 68 rules, which were analyzed in terms of performance and quality and divided into two sets: inconsistent rules and rules that confirm expert knowledge. Comparison of the two sets showed that the rules confirming expert knowledge, despite their lower computational performance, were considered more interesting.
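The abstract does not state which performance measures were applied to the rules. As an illustration only, the sketch below computes support and confidence, two commonly used measures for if-then rules, on a small invented dataset. The function name `rule_metrics`, the attribute names, and the records are assumptions made for this example.

```python
# Hypothetical records; attributes, answers and risk labels are invented.
records = [
    {"sewage": "no", "flood_prone": "yes", "vectors": "yes", "risk": "high"},
    {"sewage": "no", "flood_prone": "yes", "vectors": "no", "risk": "high"},
    {"sewage": "yes", "flood_prone": "no", "vectors": "no", "risk": "low"},
    {"sewage": "yes", "flood_prone": "no", "vectors": "yes", "risk": "low"},
    {"sewage": "yes", "flood_prone": "yes", "vectors": "yes", "risk": "high"},
]


def rule_metrics(rows, conditions, predicted, target="risk"):
    """Return (support, confidence) for the rule IF conditions THEN target = predicted.

    support    = records matching conditions AND prediction / all records
    confidence = records matching conditions AND prediction / records matching conditions
    """
    covered = [r for r in rows if all(r.get(a) == v for a, v in conditions.items())]
    correct = [r for r in covered if r[target] == predicted]
    support = len(correct) / len(rows) if rows else 0.0
    confidence = len(correct) / len(covered) if covered else 0.0
    return support, confidence


flood_rule = rule_metrics(records, {"flood_prone": "yes"}, "high")
vector_rule = rule_metrics(records, {"vectors": "yes"}, "high")
print("flood_prone=yes -> high:", flood_rule)
print("vectors=yes -> high:", vector_rule)
```

Measures like these capture only the computational side of rule evaluation. The judgment of whether a rule is interesting or consistent with expert knowledge, as described above, requires review by specialists.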
Data mining provided a set of useful and understandable rules capable of characterizing risk areas based on the physical environment. Using the proposed rules allows faster and less subjective classification of areas, keeps the health teams to a common standard, and reduces the influence of each team member's individual perception.