ARCHI: A New R Package for Automated Imputation of Regionally Correlated Hydrologic Records.

Author information

Levy Zeno F, Glas Robin L, Stagnitta Timothy J, Terry Neil

Affiliations

U.S. Geological Survey, California Water Science Center, 6000 J Street, Placer Hall, Sacramento, CA, 95819, USA.

U.S. Geological Survey, New York Water Science Center, 425 Jordan Road, Troy, NY, 12180, USA.

Publication information

Ground Water. 2025 Jul-Aug;63(4):595-610. doi: 10.1111/gwat.13474. Epub 2025 Feb 28.

Abstract

Missing data in hydrological records can limit resource assessment, process understanding, and predictive modeling. Here, we present ARCHI (Automated Regional Correlation Analysis for Hydrologic Record Imputation), a new, open-source software package in R designed to aggregate, impute, cluster, and visualize regionally correlated hydrologic records. ARCHI imputes missing data in "target" records by linear regression using more complete "reference" records as predictors. Automated imputation is implemented using a novel, iterative algorithm that allows each site to be considered a target or reference for regression, growing the pool of complete references with each imputed record until viable gap-filling ceases. Users can limit artifacts from spurious correlations by specifying model-acceptance criteria and applying geospatial, correlation, and group-based filters to control reference selection. ARCHI provides additional functions for visualizing results, clustering records with similar correlation structures, evaluating holdout data, and interactive parameterization with an accessible and intuitive graphical user interface (GUI). This methods brief provides an overview of the ARCHI package, modeling guidelines, and benchmarking on two regional groundwater-level datasets from the Central Valley, CA and Long Island, NY. We evaluate ARCHI alongside widely used multivariate imputation software to highlight and contextualize its computational efficiency, imputation accuracy, and model transparency when applied to large, groundwater-level datasets.
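The iterative target/reference idea described in the abstract can be illustrated with a minimal sketch. This is not ARCHI's code or API: it is written in Python rather than R, and the function name `iterative_impute`, the parameters `min_overlap` and `min_r2`, and the choice of a single-predictor regression accepted on an R² threshold are all assumptions made for illustration. It only shows the general pattern of filling a target from a complete reference, then letting the filled record join the reference pool until no further gap-filling is accepted.

```python
import numpy as np

def iterative_impute(records, min_overlap=5, min_r2=0.8):
    """Hypothetical sketch: fill gaps (NaN) in each record by simple linear
    regression on one complete reference record, repeating until no fill
    is accepted. `records` maps site name -> equal-length 1-D array."""
    data = {k: np.asarray(v, dtype=float).copy() for k, v in records.items()}
    used = {}  # target site -> reference site used to fill it
    progress = True
    while progress:
        progress = False
        for target, y in data.items():
            obs = ~np.isnan(y)
            if obs.all() or obs.sum() < min_overlap:
                continue  # already complete, or too few observations to fit
            # The reference pool is every currently complete record,
            # including records completed earlier in this run.
            refs = [s for s, v in data.items()
                    if s != target and not np.isnan(v).any()]
            best = None
            for ref in refs:
                x = data[ref]
                if np.ptp(x[obs]) == 0 or np.ptp(y[obs]) == 0:
                    continue  # constant series: correlation undefined
                slope, intercept = np.polyfit(x[obs], y[obs], 1)
                r2 = np.corrcoef(x[obs], y[obs])[0, 1] ** 2
                # Assumed acceptance criterion: R^2 at or above a threshold.
                if r2 >= min_r2 and (best is None or r2 > best[0]):
                    best = (r2, ref, slope, intercept)
            if best is None:
                continue  # no acceptable model; leave the gaps in place
            _, ref, slope, intercept = best
            y[~obs] = slope * data[ref][~obs] + intercept
            used[target] = ref
            progress = True
    return data, used

# Illustrative synthetic records (not from the paper's datasets).
a = np.arange(1.0, 9.0)                # complete reference
b = 2.0 * a + 1.0
b[6:] = np.nan                         # target, linearly related to `a`
c = np.array([5, 1, 4, 2, 5, 1, np.nan, 4.0])  # weakly correlated target
filled, used = iterative_impute({"A": a, "B": b, "C": c})
```

In this sketch, a record that fails the acceptance threshold against every available reference keeps its gaps, which mirrors the abstract's point that model-acceptance criteria limit artifacts from spurious correlations.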
