A graph-theory method for pattern identification in geographical epidemiology--a preliminary application to deprivation and mortality.

Authors

Maheswaran Ravi, Craigs Cheryl, Read Simon, Bath Peter A, Willett Peter

Affiliation

Public Health GIS Unit, School of Health and Related Research, University of Sheffield, Regent Court, 30 Regent Street, Sheffield S1 4DA, UK.

Publication

Int J Health Geogr. 2009 May 13;8:28. doi: 10.1186/1476-072X-8-28.

Abstract

BACKGROUND

Graph theoretical methods are extensively used in the field of computational chemistry to search datasets of compounds to see if they contain particular molecular sub-structures or patterns. We describe a preliminary application of a graph theoretical method, developed in computational chemistry, to geographical epidemiology in relation to testing a prior hypothesis. We tested the methodology on the hypothesis that if a socioeconomically deprived neighbourhood is situated in a wider deprived area, then that neighbourhood would experience greater adverse effects on mortality compared with a similarly deprived neighbourhood which is situated in a wider area with generally less deprivation.

METHODS

We used the Trent Region Health Authority area for this study, which contained 10,665 census enumeration districts (CEDs). Graphs are mathematical representations of objects and their relationships. In this study, nodes represented CEDs, and an edge joined two CEDs if they were neighbours (shared a common boundary). The overall area was represented by one large graph comprising all CEDs in the region, along with their adjacency information. We used mortality data from 1988-1998, CED level population estimates and the Townsend Material Deprivation Index as an indicator of neighbourhood level deprivation. We defined deprived CEDs as those in the top 20% most deprived in the Region. We then set out to classify these deprived CEDs into seven groups defined by increasing deprivation levels in the neighbouring CEDs. Of the deprived CEDs, 506 (24.2%) had five adjacent CEDs, and we limited pattern development and searching to these CEDs. We developed seven query patterns and used the RASCAL (Rapid Similarity Calculator) program to carry out the search for each of the query patterns. This program used a maximum common subgraph isomorphism method which was modified to handle geographical data.

RESULTS

Of the 506 deprived CEDs, 10 were not identified as belonging to any of the seven groups because they were adjacent to a CED with a missing deprivation category quintile, and none fell within query Group 1 (a deprived CED for which all five adjacent CEDs were affluent). Only four CEDs fell within Group 2, which was defined as having four affluent adjacent CEDs and one non-affluent adjacent CED. The numbers of CEDs in Groups 3-7 were 17, 214, 95, 81 and 85 respectively. Age- and sex-adjusted mortality rate ratios showed a non-significant trend towards increasing mortality risk across Groups (Chi-square = 3.26, df = 1, p = 0.07).
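The reported figures can be checked with a few lines of Python. The sketch below confirms that the group counts plus the unclassified CEDs add up to 506, and recomputes the p-value for the trend statistic from the chi-square value with one degree of freedom. The variable and function names are illustrative; the identity used for one degree of freedom is the standard relation between the chi-square and normal distributions, not a procedure described in the abstract.

```python
import math

# Counts reported in the Results: Group 1 had no CEDs, Group 2 had four,
# Groups 3-7 had 17, 214, 95, 81 and 85; ten CEDs were unclassified.
group_counts = {1: 0, 2: 4, 3: 17, 4: 214, 5: 95, 6: 81, 7: 85}
unclassified = 10
total = sum(group_counts.values()) + unclassified


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom.

    With df = 1, P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))


p_value = chi2_sf_df1(3.26)
print(total, round(p_value, 2))
```

The total matches the 506 deprived CEDs with five neighbours, and the recomputed p-value rounds to the reported 0.07.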

CONCLUSION

Graph theoretical methods developed in computational chemistry may be a useful addition to the current GIS-based methods available for geographical epidemiology, but further developmental work is required. An important requirement will be the development of methods for specifying multiple complex search patterns. Further work is also required to examine the utility of using distance, as opposed to adjacency, to describe edges in graphs, and to examine methods for pattern specification when the nodes have multiple attributes attached to them.
