Singh Rahul, Roy Bholanath
Maulana Azad National Institute of Technology, Bhopal, India.
Sci Rep. 2025 Jun 2;15(1):19265. doi: 10.1038/s41598-025-00804-x.
Earthquake magnitude prediction is critical for disaster prevention and mitigation, because timely warnings can significantly reduce casualties and economic losses. This study introduces an approach that predicts earthquake magnitudes from spatio-temporal seismic records obtained from the Indian government seismology department, combined with weather data sourced via VisualCrossing. By integrating environmental and seismic variables, the study explores their interrelationships to enhance predictive capability. The proposed framework incorporates a machine learning operations (MLOps)-driven pipeline that uses MLflow for automated data ingestion, preprocessing, model versioning, tracking, and deployment. This integration keeps the pipeline adaptable to evolving datasets and supports dynamic model selection for optimal performance. Multiple machine learning algorithms, including Gradient Boosting, Light Gradient Boosting Machine (LightGBM), XGBoost, and Random Forest, are evaluated on subsets comprising 20%, 35%, 65%, and 100% of the dataset, using Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and R² as performance metrics. The results show that Gradient Boosting performs best on the smaller subsets, while LightGBM achieves superior accuracy on the larger ones, demonstrating the pipeline's flexibility and scalability. By combining diverse data sources with a dynamic, operational MLOps framework, this research presents a scalable and robust solution for earthquake magnitude prediction. The outcomes illustrate the potential of integrating advanced machine learning techniques with lifecycle management practices to improve prediction accuracy and applicability in real-world seismic scenarios.
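As a rough illustration of the evaluation and dynamic model selection described above, the following minimal sketch computes the four reported metrics and picks the best candidate by a chosen metric. It is not the authors' implementation: the function names `regression_metrics` and `select_best_model`, the labels `model_a` and `model_b`, and the magnitude values are all hypothetical and chosen only to keep the arithmetic easy to follow. Only the Python standard library is used. In the paper's pipeline the predictions would come from trained models tracked with MLflow.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R^2 for two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    # R^2 is undefined when the targets are constant.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


def select_best_model(y_true, predictions_by_model, metric="rmse"):
    """Score every candidate and return (best_name, all_scores).

    Lower is better for the error metrics; higher is better for R^2.
    """
    scores = {
        name: regression_metrics(y_true, preds)
        for name, preds in predictions_by_model.items()
    }
    if metric == "r2":
        best = max(scores, key=lambda name: scores[name]["r2"])
    else:
        best = min(scores, key=lambda name: scores[name][metric])
    return best, scores


# Illustrative values only (hypothetical, not data from the study).
observed = [4.0, 5.0, 6.0, 5.0]
candidates = {
    "model_a": [4.5, 5.0, 5.5, 5.0],
    "model_b": [4.0, 4.0, 6.0, 6.0],
}
best_name, all_scores = select_best_model(observed, candidates, metric="rmse")
print(best_name, all_scores[best_name])
```

Repeating this selection for each dataset fraction (20%, 35%, 65%, 100%) would give one winning model per fraction, which is the kind of size-dependent choice the abstract reports.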