HPCGCN: A Predictive Framework on High Performance Computing Cluster Log Data Using Graph Convolutional Networks.

Author information

Bose Avishek, Yang Huichen, Hsu William H, Andresen Daniel

Affiliation

Department of Computer Science, Kansas State University, Manhattan, Kansas, USA.

Publication information

Proc IEEE Int Conf Big Data. 2021 Dec;2021:4113-4118. doi: 10.1109/bigdata52589.2021.9671370. Epub 2022 Jan 13.

Abstract

This paper presents a novel use case of Graph Convolutional Network (GCN) representation learning for predictive data mining, specifically on user/task data in the domain of high-performance computing (HPC). It outlines an approach based on a coalesced data set: logs from the Slurm workload manager, joined with user experience survey data from computational cluster users. We introduce a new method of constructing a heterogeneous, unweighted HPC graph with nodes of multiple types, built after identifying the manifold relations between the nodes. The GCN structure used here supports two tasks: (i) determining whether a job will complete or fail and (ii) predicting memory and CPU requirements, by training a semi-supervised GCN classification model and GCN regression models on the generated graph. The graph is divided into partitions using graph clustering. We conducted classification and regression experiments using the proposed framework on our HPC log dataset and evaluated the predictions of our trained models against baselines using test_score, F1-score, precision, and recall for classification, and R1 score for regression, showing that our framework achieves significant improvements.
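
As a rough illustration of the idea in the abstract, the sketch below builds a small unweighted graph with typed nodes from job records and applies one graph-convolution step using the standard symmetric normalization. It is a minimal sketch under stated assumptions, not the authors' implementation: the record fields (`job`, `user`, `queue`), the choice of node types, the one-hot type features, and the fixed weights are all hypothetical, and no training or graph partitioning is performed.

```python
import math

# Hypothetical Slurm-style job records; field names are illustrative only.
jobs = [
    {"job": "j1", "user": "u1", "queue": "q1"},
    {"job": "j2", "user": "u1", "queue": "q2"},
    {"job": "j3", "user": "u2", "queue": "q1"},
]

def build_graph(records):
    """Return (nodes, node_types, adjacency) for an unweighted typed graph."""
    nodes, types = [], {}

    def add(name, kind):
        if name not in types:
            types[name] = kind
            nodes.append(name)

    edges = set()
    for r in records:
        add(r["job"], "job")
        add(r["user"], "user")
        add(r["queue"], "queue")
        edges.add((r["job"], r["user"]))   # job submitted by user
        edges.add((r["job"], r["queue"]))  # job ran in queue

    index = {name: i for i, name in enumerate(nodes)}
    n = len(nodes)
    adj = [[0.0] * n for _ in range(n)]
    for a, b in edges:
        adj[index[a]][index[b]] = 1.0
        adj[index[b]][index[a]] = 1.0  # undirected, unweighted
    return nodes, types, adj

def normalize(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2}."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(x, y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)]
            for row in x]

def gcn_layer(a_norm, h, w):
    """One propagation step: ReLU(A_norm @ H @ W)."""
    z = matmul(matmul(a_norm, h), w)
    return [[max(0.0, v) for v in row] for row in z]

nodes, node_types, adj = build_graph(jobs)
a_norm = normalize(adj)

# One-hot node-type features in the order [job, user, queue] (an assumption).
kinds = ["job", "user", "queue"]
features = [[1.0 if node_types[n] == k else 0.0 for k in kinds] for n in nodes]

# Fixed toy weights mapping 3 input features to 2 hidden units; not trained.
weights = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.6]]
hidden = gcn_layer(a_norm, features, weights)
```

In a semi-supervised setting, a second layer with a softmax over job outcomes would be trained on the labeled job nodes only, while unlabeled nodes still contribute through the normalized adjacency.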

