Multiple imputation for multivariate data with missing and below-threshold measurements: time-series concentrations of pollutants in the Arctic.

Authors

Hopke P K, Liu C, Rubin D B

Affiliation

Department of Chemistry, Clarkson University, Potsdam, New York 13699, USA.

Publication

Biometrics. 2001 Mar;57(1):22-33. doi: 10.1111/j.0006-341x.2001.00022.x.

Abstract

Many chemical and environmental data sets are complicated by the existence of fully missing values or censored values known to lie below detection thresholds. For example, week-long samples of airborne particulate matter were obtained at Alert, NWT, Canada, between 1980 and 1991, where some of the concentrations of 24 particulate constituents were coarsened in the sense of being either fully missing or below detection limits. To facilitate scientific analysis, it is appealing to create complete data by filling in missing values so that standard complete-data methods can be applied. We briefly review commonly used strategies for handling missing values and focus on the multiple-imputation approach, which generally leads to valid inferences when faced with missing data. Three statistical models are developed for multiply imputing the missing values of airborne particulate matter. We expect that these models are useful for creating multiple imputations in a variety of incomplete multivariate time series data sets.
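The general idea of multiple imputation with below-threshold values can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not any of the three models developed in the paper: concentrations are assumed lognormal, the parameters are fitted to the observed values alone (which ignores the information in the censored ones), parameter uncertainty is approximated by bootstrapping, and all numbers are made up. The function names `impute_once` and `combine` are hypothetical. The combining step follows Rubin's rules, where the total variance is the average within-imputation variance plus (1 + 1/M) times the between-imputation variance.

```python
import math
import random
from statistics import NormalDist, mean, stdev, variance


def impute_once(observed, limits, n_missing, rng):
    """Return one completed data set: observed values plus imputed ones."""
    # Bootstrap the observed values so that each imputation reflects some
    # uncertainty in the fitted parameters (a rough approximation).
    boot = [math.log(rng.choice(observed)) for _ in observed]
    dist = NormalDist(mean(boot), max(stdev(boot), 1e-6))
    completed = list(observed)
    for limit in limits:
        # Inverse-CDF draw restricted to values below the detection limit.
        u = rng.random() * dist.cdf(math.log(limit))
        u = min(max(u, 1e-12), 1 - 1e-12)
        completed.append(math.exp(dist.inv_cdf(u)))
    for _ in range(n_missing):
        # Fully missing values come from the untruncated distribution.
        u = min(max(rng.random(), 1e-12), 1 - 1e-12)
        completed.append(math.exp(dist.inv_cdf(u)))
    return completed


def combine(estimates, variances):
    """Rubin's rules: pooled estimate and total variance."""
    m = len(estimates)
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance
    return mean(estimates), within + (1 + 1 / m) * between


rng = random.Random(0)
observed = [0.8, 1.2, 2.5, 0.6, 1.9, 3.1, 0.9, 1.4]  # made-up concentrations
limits = [0.5, 0.5, 0.4]   # detection limits of the censored samples
n_missing = 2              # number of fully missing samples

estimates, variances = [], []
for _ in range(5):  # M = 5 imputations
    logs = [math.log(x) for x in impute_once(observed, limits, n_missing, rng)]
    estimates.append(mean(logs))                 # complete-data estimate
    variances.append(variance(logs) / len(logs))  # its sampling variance
pooled, total_var = combine(estimates, variances)
```

Each completed data set is analyzed with an ordinary complete-data method (here, the mean of the log concentrations), and only the combining step accounts for the fact that some values were imputed.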
