
A new method based on physical patterns to impute aerobiological datasets.

Affiliations

Unit of Epidemiology and Medical Statistics, Department of Diagnostics and Public Health, University of Verona, Verona, Italy.

School of Aerospace Engineering, Universidad Politécnica de Madrid, Madrid, Spain.

Publication information

PLoS One. 2024 Nov 19;19(11):e0314005. doi: 10.1371/journal.pone.0314005. eCollection 2024.

DOI: 10.1371/journal.pone.0314005
PMID: 39561200
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11575811/
Abstract

Limited research has assessed the accuracy of imputation methods in aerobiological datasets. We conducted a simulation study to evaluate, for the first time, the effectiveness of Gappy Singular Value Decomposition (GSVD), a data-driven approach, comparing it with the moving mean interpolation, a statistical approach. Utilizing complete pollen data from two monitoring stations in northeastern Italy for 2022, we randomly generated missing data considering the combination of various proportions (5%, 10%, 25%) and gap lengths (3, 5, 7, 10 days). We imputed 4800 time series using the GSVD algorithm, specifically implemented for this study, and the moving mean algorithm of the "AeRobiology" R package. We assessed imputation accuracy by calculating the Root Mean Square Error and employed multiple linear regression models to identify factors independently affecting the error (e.g. pollen variability, simulation settings). The results showed that the GSVD was as good as the well-established moving mean method and demonstrated its strong generalization capabilities across different data types. However, the imputation error was primarily influenced by pollen characteristics and location, regardless of the imputation method used. High variability in pollen concentrations and the distribution of missing data negatively affected imputation accuracy. In conclusion, we introduced and tested a novel imputation method, demonstrating comparable performance to the statistical approach in aerobiological data reconstruction. These findings contribute to advancing aerobiological data analysis, highlighting the need for improving imputation methods.
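The simulation design described in the abstract (random gaps of a fixed length, imputation by two methods, comparison by Root Mean Square Error) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`make_gaps`, `moving_mean_impute`, `gappy_svd_impute`, `rmse`), the window size, the rank, and the synthetic data are all hypothetical. The gappy SVD routine is a generic iterative low-rank fill, and the moving mean routine is a simple stand-in rather than the "AeRobiology" R package algorithm.

```python
import numpy as np

def make_gaps(n_days, proportion, gap_len, rng):
    """Boolean mask (True = missing) built from random gaps of gap_len days.
    Assumption: gaps may overlap, so the final share can slightly exceed the target."""
    mask = np.zeros(n_days, dtype=bool)
    target = int(round(proportion * n_days))
    while mask.sum() < target:
        start = rng.integers(0, n_days - gap_len + 1)
        mask[start:start + gap_len] = True
    return mask

def moving_mean_impute(x, mask, window=7):
    """Fill each missing day with the mean of observed values in a centered window.
    The window is widened when it contains no observed value (illustrative rule)."""
    filled = x.astype(float).copy()
    filled[mask] = np.nan
    out = filled.copy()
    half = window // 2
    for i in np.flatnonzero(mask):
        w = half
        while w <= len(x):
            seg = filled[max(0, i - w): i + w + 1]
            if not np.all(np.isnan(seg)):
                out[i] = np.nanmean(seg)
                break
            w += 1
    return out

def gappy_svd_impute(X, mask, rank=2, n_iter=200, tol=1e-8):
    """Iterative low-rank fill: start from column means of the observed values,
    then repeatedly replace missing entries with a rank-`rank` SVD reconstruction."""
    X = X.astype(float).copy()
    col_means = np.nanmean(np.where(mask, np.nan, X), axis=0)
    X[mask] = col_means[np.nonzero(mask)[1]]
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        approx = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        change = np.linalg.norm(approx[mask] - X[mask])
        X[mask] = approx[mask]  # observed entries are never modified
        if change < tol:
            break
    return X

def rmse(truth, imputed, mask):
    """Root Mean Square Error over the artificially removed entries only."""
    return float(np.sqrt(np.mean((truth[mask] - imputed[mask]) ** 2)))

# Made-up data: 6 series that mix two seasonal peaks over 120 days.
rng = np.random.default_rng(0)
t = np.arange(120)
peaks = np.stack([np.exp(-((t - 40) / 10) ** 2),
                  np.exp(-((t - 80) / 15) ** 2)], axis=1)
truth = peaks @ rng.uniform(10, 100, size=(2, 6))

# One simulation setting: 10% missing data in gaps of 5 days, drawn per series.
mask = np.stack([make_gaps(len(t), 0.10, 5, rng) for _ in range(6)], axis=1)

svd_filled = gappy_svd_impute(truth, mask, rank=2)
mm_filled = np.stack([moving_mean_impute(truth[:, j], mask[:, j])
                      for j in range(6)], axis=1)

print("RMSE gappy SVD:  ", rmse(truth, svd_filled, mask))
print("RMSE moving mean:", rmse(truth, mm_filled, mask))
```

Repeating the last block over every combination of proportion (5%, 10%, 25%) and gap length (3, 5, 7, 10 days) would reproduce the shape of the study design, though not its data or results.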

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9c3f/11575811/2148538ab7a8/pone.0314005.g001.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9c3f/11575811/383cb769aab9/pone.0314005.g002.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9c3f/11575811/5db576352e5f/pone.0314005.g003.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9c3f/11575811/ba0dd38641ed/pone.0314005.g004.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9c3f/11575811/486613f32aae/pone.0314005.g005.jpg
