Al-Fakih Abdulrahman, Koeshidayatullah A, Mukerji Tapan, Al-Azani Sadam, Kaka SanLinn I
College of Petroleum Engineering and Geosciences, King Fahd University of Petroleum & Minerals, 31261, Dhahran, Saudi Arabia.
Departments of Energy Science & Engineering, Earth & Planetary Sciences, and Geophysics, Stanford University, Stanford, CA, 94305, USA.
Sci Rep. 2025 Mar 31;15(1):11000. doi: 10.1038/s41598-025-95709-0.
Well log analysis is essential to hydrocarbon exploration, providing detailed insight into subsurface geological formations. However, gaps and inaccuracies in well log data, often caused by equipment limitations, operational challenges, and harsh subsurface conditions, can introduce significant uncertainty into reservoir evaluation. Addressing these challenges requires effective methods for both synthetic data generation and precise imputation of missing data, so that the data are complete and reliable. This study introduces a novel framework that uses sequence-based generative adversarial networks (GANs) designed specifically for well log data generation and imputation. The framework integrates two distinct sequence-based GAN models: time series GAN (TSGAN) for generating synthetic well log data and sequence GAN (SeqGAN) for imputing missing data. Both models were tested on a dataset from the North Sea, Netherlands region. For the imputation task, the input comprises logs with missing values and the output is the corresponding imputed logs; for the synthetic data generation task, the input is complete real logs and the output is synthetic logs that mimic the statistical properties of the original data. All log measurements are normalized to a 0-1 range using min-max scaling, and error metrics are reported in these normalized units. Sections of 5, 10, and 50 data points were used. Experimental results demonstrate that this approach fills data gaps more accurately than other deep learning models for spatial series analysis. The imputation method yielded R² values of 0.92, 0.86, and 0.57, with corresponding mean absolute percentage error (MAPE) values of 8.320, 0.005, and 166.6, and mean absolute error (MAE) values of 0.012, 0.002, and 0.03, respectively. The synthetic generation yielded an R² of 0.92, an MAE of 0.35, and an MRLE of 0.01.
These results set a new benchmark for data integrity and utility in geosciences, particularly in well log data analysis.
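The evaluation setup described in the abstract (min-max scaling to the 0-1 range, then R², MAPE, and MAE computed on normalized values over sections of 5, 10, and 50 points) can be illustrated with a minimal Python sketch. This is not the authors' code: the synthetic sine-based log, the helper names, the gap start index, and the linear-interpolation imputer are all assumptions chosen for illustration, with the interpolation standing in for SeqGAN only so that the metrics have something to score.

```python
import math

def min_max_scale(values):
    """Scale a list of numbers to the 0-1 range (min-max scaling)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("a constant log cannot be min-max scaled")
    return [(v - lo) / (hi - lo) for v in values]

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mape(y_true, y_pred, eps=1e-8):
    """Mean absolute percentage error in percent; eps guards division by zero."""
    total = sum(abs(t - p) / max(abs(t), eps) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def linear_fill(log, start, length):
    """Hypothetical baseline imputer (NOT SeqGAN): a straight line between
    the two samples that border the gap log[start:start + length]."""
    left, right = log[start - 1], log[start + length]
    return [left + (right - left) * (i + 1) / (length + 1) for i in range(length)]

# Synthetic stand-in for a single well log curve (assumed, not real data).
raw_log = [60.0 + 25.0 * math.sin(0.15 * i) + 5.0 * math.cos(0.9 * i)
           for i in range(200)]
scaled = min_max_scale(raw_log)

results = {}
for gap in (5, 10, 50):            # section lengths named in the abstract
    start = 75                     # arbitrary gap position (assumption)
    truth = scaled[start:start + gap]
    filled = linear_fill(scaled, start, gap)
    results[gap] = (r2(truth, filled), mape(truth, filled), mae(truth, filled))
    print(f"gap={gap:2d}  R2={results[gap][0]:.3f}  "
          f"MAPE={results[gap][1]:.2f}%  MAE={results[gap][2]:.4f}")
```

One property worth keeping in mind when reading metrics on min-max-scaled data: MAPE divides by the true value, so it can become very large wherever the normalized truth is close to zero, even when MAE stays small.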