Rizzardi M, Mohr M S, Merrill D W, Selvin S
Information and Computing Sciences Division, Lawrence Berkeley Laboratory, CA 94720.
Stat Med. 1993 Oct;12(19-20):1953-64. doi: 10.1002/sim.4780121919.
In 1990, the United States Bureau of the Census released detailed geographic map files known as TIGER/Line (Topologically Integrated Geographic Encoding and Referencing). The TIGER files, available for purchase or through federal repository libraries, contain 24 billion characters of data describing geographic features such as coastlines, hydrography, transportation networks, and political boundaries for the entire United States. Many of these physical features are of potential interest in epidemiological case studies. Unfortunately, the TIGER database provides only raw alphanumeric data; no utility software, graphical or otherwise, is included. Recently, the S statistical software package has been extended to include a map display function. The map function augments S's high-level approach to statistical analysis and graphical display of data. Coupling this statistical software with the map database developed for U.S. Census data collection will facilitate epidemiological research. We discuss the technical background necessary to use the TIGER database for mapping with S. Two types of S maps, segment-based and polygon-based, are discussed along with methods to construct them from TIGER data. Polygon-based maps are useful for displaying regional statistical data, such as disease rates or incidence at the census tract level. Segment-based maps are easier to assemble and are appropriate when the data are not regionalized. Census tract data on AIDS incidence in San Francisco and lung cancer case locations relative to petrochemical refinery sites in Contra Costa County are used to illustrate the methods and potential uses of interfacing the TIGER database with S.
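The distinction between segment-based and polygon-based maps can be illustrated with a minimal Python sketch. It is not the authors' S code, and it does not read real TIGER records: the `SEGMENTS` list, its `(start, end, left_region, right_region)` layout, and the function names `region_boundary` and `chain_polygon` are all hypothetical, chosen only to show why polygons take more work to assemble. A segment-based map can draw each segment as it stands, whereas a polygon-based map must first orient and chain the segments that bound each region.

```python
# Hypothetical, simplified segment records (NOT the real TIGER layout):
# (start_point, end_point, region_on_left, region_on_right)
SEGMENTS = [
    ((0, 0), (1, 0), "A", None),
    ((1, 0), (1, 1), "A", "B"),   # shared boundary between regions A and B
    ((1, 1), (0, 1), "A", None),
    ((0, 1), (0, 0), "A", None),
    ((1, 0), (2, 0), "B", None),
    ((2, 0), (2, 1), "B", None),
    ((2, 1), (1, 1), "B", None),
]


def region_boundary(segments, region):
    """Return directed edges oriented so that `region` lies on their left."""
    edges = []
    for start, end, left, right in segments:
        if left == region:
            edges.append((start, end))
        elif right == region:
            # Reverse the segment so the region is on the left.
            edges.append((end, start))
    return edges


def chain_polygon(edges):
    """Chain directed edges into one closed ring of vertices.

    Assumes the edges form a single simple ring (no holes, no islands).
    """
    if not edges:
        return []
    next_vertex = {start: end for start, end in edges}
    first = edges[0][0]
    ring = [first]
    current = next_vertex[first]
    while current != first:
        if current not in next_vertex or len(ring) > len(edges):
            raise ValueError("edges do not form a single closed ring")
        ring.append(current)
        current = next_vertex[current]
    ring.append(first)  # repeat the first vertex to close the ring
    return ring


# Segment-based map: every segment can be drawn independently.
segment_map = [(start, end) for start, end, _, _ in SEGMENTS]

# Polygon-based map: one closed ring per region, ready to be shaded by a rate.
polygon_map = {
    region: chain_polygon(region_boundary(SEGMENTS, region))
    for region in ("A", "B")
}
```

In this sketch the shared segment between regions A and B is used twice, once in each direction, which is why each segment carries both a left and a right region label. Real boundaries with holes, islands, or unclosed chains would need more handling than this example provides.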