Alejo-Sanchez Lizette Elena, Márquez-Grajales Aldo, Salas-Martínez Fernando, Franco-Arcega Anilu, López-Morales Virgilio, Acevedo-Sandoval Otilio Arturo, González-Ramírez César Abelardo, Villegas-Vega Ramiro
Área Académica de Computación y Electrónica, Instituto de Ciencias Básicas e Ingeniería, Universidad Autónoma del Estado de Hidalgo, Carr. Pachuca-Tulancingo km. 4.5, Mineral de la Reforma, 42184 Hidalgo, Mexico.
Área Académica de Química, Instituto de Ciencias Básicas e Ingeniería, Universidad Autónoma del Estado de Hidalgo, Carr. Pachuca-Tulancingo km. 4.5, Mineral de la Reforma, 42184 Hidalgo, Mexico.
MethodsX. 2025 Jun 19;15:103455. doi: 10.1016/j.mex.2025.103455. eCollection 2025 Dec.
Missing data in climate time series is a significant problem because it complicates the monitoring and prediction of climatic phenomena. The primary objective of this review is to describe the most relevant methods for imputing missing climate data published over the last decade. The results reveal a higher concentration of studies on imputation methods for climate time series in Asia and Europe, with notable examples from Malaysia, China, and Italy. Brazil and Australia were the countries with the most studies in the Americas and Oceania, respectively. Temperature and precipitation were the most frequently used climate variables, and monitoring networks were the data source in almost all of the studies. Among conventional statistical techniques, mean-based methods, simple and multiple linear regression, interpolation, and Principal Component Analysis (PCA) were the ones used to impute missing data. Artificial neural networks demonstrated the ability to identify complex patterns in the data. Finally, Generative Adversarial Networks outperformed other deep learning methods in the imputation of missing climate data.
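As an illustration of two of the conventional techniques named above (mean imputation and linear interpolation), the following is a minimal sketch in plain Python. It is not taken from the reviewed studies: the function names `impute_mean` and `impute_linear`, the use of `None` to mark a gap, and the temperature values are all assumptions made for the example.

```python
from statistics import fmean
from typing import List, Optional

Series = List[Optional[float]]  # None marks a missing observation (assumed convention)


def impute_mean(values: Series) -> List[float]:
    """Replace every gap with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a series with no observations")
    mean = fmean(observed)
    return [mean if v is None else v for v in values]


def impute_linear(values: Series) -> List[float]:
    """Fill gaps by linear interpolation between neighbouring observations.

    Gaps before the first or after the last observation are filled with
    the nearest observed value, since there is nothing to interpolate to.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("cannot impute a series with no observations")
    filled = list(values)
    # Leading and trailing gaps: copy the nearest observation.
    for i in range(known[0]):
        filled[i] = values[known[0]]
    for i in range(known[-1] + 1, len(values)):
        filled[i] = values[known[-1]]
    # Interior gaps: straight line between the two surrounding observations.
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled


# Synthetic daily temperatures in degrees Celsius, invented for illustration only.
temps: Series = [14.0, None, 16.0, None, None, 19.0]
mean_filled = impute_mean(temps)
linear_filled = impute_linear(temps)
```

On this synthetic series, mean imputation assigns the same value to every gap, whereas linear interpolation follows the local trend between neighbouring observations. This difference is one reason interpolation is often preferred for short gaps in smoothly varying variables such as temperature.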