Leyk Stefan, Uhl Johannes H, Balk Deborah, Jones Bryan
University of Colorado Boulder, Department of Geography, Boulder, CO 80309, U.S.A.
City University of New York, Institute for Demographic Research and Baruch College, New York, NY 10010, U.S.A.
Remote Sens Environ. 2018 Jan;204:898-917. doi: 10.1016/j.rse.2017.08.035. Epub 2017 Oct 7.
Global data on settlements, built-up land and population distributions are becoming increasingly available and are important inputs to a better understanding of key demographic processes, such as urbanization, and of interactions between human and natural systems over time. One persistent drawback that prevents user communities from using these data products more broadly, effectively and objectively is the absence of thorough and transparent validation studies. This study develops a validation framework for the accuracy assessment of multi-temporal built-up land layers, using integrated public parcel and building records as validation data. The framework is based on measures derived from confusion matrices. It incorporates a sensitivity analysis for potential spatial offsets between validation and test data, and it tests how varying the criteria that define the abstract term "built-up land" affects the accuracy measures. Furthermore, the framework allows for accuracy assessments by strata of built-up density. This provides important insights into the relationship between classification accuracy and development intensity, and helps inform user communities about quality aspects that may be relevant to different purposes. To demonstrate the framework, we use data for the United States from the newly released Global Human Settlement Layer (GHSL), covering four epochs since 1975 at fine spatial resolution (38 m). The results show very encouraging accuracy measures that vary across study areas and generally improve over time, but that show very distinct patterns along rural-urban trajectories. Areas of higher development intensity are classified very accurately and are highly reliable. Rural areas show low accuracy, which could be affected by misalignment between the reference data and the data under test in areas where built-up land is scattered and rare. However, a regression analysis, which examines how well GHSL can estimate built-up land in spatially aggregated analytical units, indicates that the classification error is mainly thematic in nature. Thus, caution should be taken when using the data product in rural regions. The results can be useful for further improving the classification procedures used to create measures of the built environment. The validation framework can be extended to data-poor regions of the world using map data and Volunteered Geographic Information.
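As an illustration of the kind of assessment described above, the following minimal sketch derives accuracy measures from a confusion matrix and repeats the calculation per built-up density stratum. It is not the authors' implementation. The abstract does not name the specific measures, so precision, recall, F1 and overall accuracy are used here as common examples. The function names, the 0/1 encoding (1 = built-up) and the stratum labels are all assumptions made for this example.

```python
from collections import defaultdict


def confusion_counts(reference, test):
    """Count (TP, FP, FN, TN) for two equal-length sequences of 0/1 labels.

    `reference` holds the validation labels and `test` the labels of the
    layer being assessed; 1 means built-up (an assumed encoding).
    """
    if len(reference) != len(test):
        raise ValueError("reference and test must have the same length")
    tp = fp = fn = tn = 0
    for ref, pred in zip(reference, test):
        if pred and ref:
            tp += 1
        elif pred and not ref:
            fp += 1
        elif ref:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy_measures(tp, fp, fn, tn):
    """Derive common accuracy measures from confusion-matrix counts."""

    def ratio(num, den):
        # Return NaN instead of raising when a measure is undefined.
        return num / den if den else float("nan")

    return {
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
        "overall_accuracy": ratio(tp + tn, tp + fp + fn + tn),
    }


def measures_by_stratum(reference, test, strata):
    """Compute the accuracy measures separately for each stratum label."""
    groups = defaultdict(lambda: ([], []))
    for ref, pred, stratum in zip(reference, test, strata):
        groups[stratum][0].append(ref)
        groups[stratum][1].append(pred)
    return {
        stratum: accuracy_measures(*confusion_counts(refs, preds))
        for stratum, (refs, preds) in groups.items()
    }


if __name__ == "__main__":
    # Toy labels for eight cells; values are illustrative only.
    reference = [1, 1, 1, 0, 0, 0, 1, 0]
    test = [1, 1, 0, 0, 0, 1, 0, 0]
    strata = ["urban"] * 4 + ["rural"] * 4  # hypothetical density strata
    print(accuracy_measures(*confusion_counts(reference, test)))
    print(measures_by_stratum(reference, test, strata))
```

Reporting the measures per stratum rather than only in aggregate is what would let a pattern such as "high accuracy in dense areas, low accuracy in rural areas" become visible. With rare built-up cells, overall accuracy alone can look high even when recall is poor.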