Tipu Abdul Jabbar Saeed, Conbhuí Padraig Ó, Howley Enda
School of Computer Science, National University of Ireland Galway, Galway, Ireland.
Irish Centre for High-End Computing, Dublin, Ireland.
Cluster Comput. 2022;25(4):2661-2682. doi: 10.1007/s10586-021-03347-8. Epub 2021 Jul 2.
High-performance computing (HPC) clusters, or supercomputing clusters, are designed for executing computationally intensive operations that typically involve large-scale I/O operations. Such I/O is most commonly performed with a standard MPI library implemented in C/C++. MPI-I/O performance in HPC clusters tends to vary significantly over a range of configuration parameters that are generally not taken into account by the algorithm. It is commonly left to individual practitioners to optimise I/O on a case-by-case basis at code level, which can often lead to a range of unforeseen outcomes. The ExSeisDat utility is built on top of the native MPI-I/O library and comprises Parallel I/O and Workflow Libraries for processing seismic data encapsulated in the SEG-Y file format. The SEG-Y file data structure is complex in nature because trace headers and trace data alternate throughout the file. Its size scales to petabytes, and ExSeisDat further increases the chances of I/O performance degradation. This research paper presents a novel study of the changing I/O performance, in terms of bandwidth, using parallel plots against various MPI-I/O, Lustre (parallel) file system and SEG-Y file parameters. Another novel aspect of this research is the predictive modelling of MPI-I/O behaviour over SEG-Y file benchmarks using Artificial Neural Networks (ANNs). Across the set of trained ANN models, the accuracy ranges from 62.5% to 96.5%. The computed Mean Square Error (MSE), Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE) values further support the generalisation of the prediction models. This paper demonstrates that, by using our ANN prediction technique, the configurations can be tuned beforehand to avoid poor I/O performance.
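The three error measures named in the abstract have standard definitions, and a minimal sketch of them may help readers interpret the reported values. The function names and the bandwidth figures below are illustrative assumptions only; they are not taken from the paper, and the paper's own evaluation code may differ.

```python
from typing import Sequence


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Square Error: average of squared differences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Absolute Error: average of absolute differences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Absolute Percentage Error, in percent.

    Assumes no actual value is zero, since each error is divided by it.
    """
    return 100.0 * sum(
        abs((a - p) / a) for a, p in zip(actual, predicted)
    ) / len(actual)


# Hypothetical measured vs. predicted I/O bandwidths (made-up numbers).
measured = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

print(mse(measured, predicted))
print(mae(measured, predicted))
print(mape(measured, predicted))
```

MSE penalises large misses more heavily than MAE, while MAPE expresses the error relative to the measured value, which makes it easier to compare across configurations with very different bandwidths.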