Gittelsohn A M
Am J Public Health. 1982 Feb;72(2):133-40. doi: 10.2105/ajph.72.2.133.
The feasibility of applying surveillance techniques to large health data sets is being explored through study of a national mortality data base encompassing 21 million United States death records for the period 1968-1978. Through the development of efficient file structures and information recovery techniques, it is possible to pose a series of questions and follow-up questions of the entire data set within budgetary constraints. Initial screening of the mortality data base reveals that major changes have occurred over the 11 years, with marked declines for diseases of the cardiovascular, respiratory, digestive, and renal systems, and for maternal and perinatal mortality. There is a tendency toward increased usage of non-specific terminology. The occurrence of unlikely and unusual causes in the data set is documented, and reasons for their inclusion are discussed in terms of underlying cause of death logic. Problems in the study of geographic distributions of cause-specific mortality are outlined, with illustrations of the dispersion of standardized mortality ratios for major causes of death over areas of the country. Clusters of high mortality areas require interpretation in terms of underlying dispersion and possible reporting artifacts arising out of geographic differentials in diagnostic labeling practice.