Real-time pneumonia prediction using pipelined spark and high-performance computing.

Authors

Ravikumar Aswathy, Sriraman Harini

Affiliation

School of Computer Science and Engineering, Vellore Institute of Technology, Chennai, Tamil Nadu, India.

Publication information

PeerJ Comput Sci. 2023 Mar 9;9:e1258. doi: 10.7717/peerj-cs.1258. eCollection 2023.

DOI:10.7717/peerj-cs.1258

PMID:37346542

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10280684/

Abstract

BACKGROUND

Pneumonia is a respiratory disease caused by bacteria; it affects many people, particularly in impoverished countries where pollution, unsanitary living conditions, overpopulation, and insufficient medical infrastructure are prevalent. To ensure curative therapy and improve survival chances, it is vital to detect pneumonia early. Chest X-ray imaging is the most common way of detecting pneumonia. However, analyzing chest X-rays is a complex process vulnerable to subjective variation. Moreover, the available data is growing exponentially, and training a model to predict pneumonia can take hours or even days. Timely prediction is important for better treatment. Existing approaches need more precision, and their computation time for predicting pneumonia is also long. Early prediction is therefore required: the system must learn continuously and without supervision from X-ray image samples to support early diagnosis.

METHODS

In this article, model training is accelerated using a distributed data-parallel approach and the computational power of high-performance computing devices. This research aims to diagnose pneumonia from X-ray images with more precision, greater speed, and fewer processing resources. Distributed deep learning techniques are gaining popularity owing to the rising need for computational resources for deep learning models with many parameters. In contrast to conventional training methods, data-parallel training enables several compute nodes to train massive deep-learning models concurrently, improving training efficiency. Deploying the model in Spark addresses scalability and acceleration. Spark's distributed processing capability reads data from multiple nodes, and the results demonstrate that these techniques can drastically reduce training time, which is essential when dealing with large datasets.
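The data-parallel idea described in the methods can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a toy linear model with a mean-squared-error loss instead of a CNN, simulates the workers inside one process with NumPy instead of Spark, and uses hypothetical names (`mse_gradient`, `data_parallel_step`). Each simulated worker computes a gradient on its own shard of the data, and a synchronous step averages those gradients (weighted by shard size) before applying a single update.

```python
import numpy as np


def mse_gradient(w, X, y):
    """Gradient of the mean squared error for a linear model y ~ X @ w."""
    residual = X @ w - y
    return 2.0 * (X.T @ residual) / len(y)


def data_parallel_step(w, X, y, num_workers, lr):
    """One synchronous data-parallel update, simulated in a single process.

    Assumes num_workers <= number of samples so that no shard is empty.
    """
    X_shards = np.array_split(X, num_workers)
    y_shards = np.array_split(y, num_workers)
    # Each simulated worker computes a gradient on its own shard only.
    grads = np.stack(
        [mse_gradient(w, Xs, ys) for Xs, ys in zip(X_shards, y_shards)]
    )
    # Synchronous aggregation: average the worker gradients, weighting by
    # shard size so the result matches the full-batch gradient.
    sizes = np.array([len(ys) for ys in y_shards], dtype=float)
    avg_grad = np.average(grads, axis=0, weights=sizes)
    return w - lr * avg_grad


# Hypothetical toy data: 64 samples, 3 features, noiseless linear target.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true

w = np.zeros(3)
for _ in range(200):
    w = data_parallel_step(w, X, y, num_workers=4, lr=0.1)
```

Because the shard gradients are size-weighted before averaging, one synchronous step here is numerically equivalent to a full-batch gradient step; the potential benefit in a real cluster is that the per-shard gradients are computed concurrently on separate nodes.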

RESULTS

The proposed model makes predictions 1.5 times faster than the traditional CNN model used for pneumonia prediction and achieves an accuracy of 98.72%. Speed-ups ranging from 1.2 to 1.5 were obtained with the synchronous and asynchronous parallel models. The speed-up is lower in the asynchronous parallel model because of straggler nodes.
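The abstract does not state how speed-up was computed. A minimal sketch, assuming the conventional definition (baseline time divided by parallel time), is given below; the function names and the timings in the usage lines are hypothetical and are not measurements from the paper.

```python
def speedup(baseline_seconds, parallel_seconds):
    """Conventional speed-up: baseline run time divided by parallel run time."""
    if baseline_seconds <= 0 or parallel_seconds <= 0:
        raise ValueError("run times must be positive")
    return baseline_seconds / parallel_seconds


def parallel_efficiency(baseline_seconds, parallel_seconds, num_workers):
    """Speed-up per worker; 1.0 would mean perfect linear scaling."""
    if num_workers <= 0:
        raise ValueError("num_workers must be positive")
    return speedup(baseline_seconds, parallel_seconds) / num_workers


# Hypothetical timings chosen only to illustrate a 1.5x speed-up.
example_speedup = speedup(150.0, 100.0)
example_efficiency = parallel_efficiency(150.0, 100.0, num_workers=2)
```

Under this definition, a speed-up of 1.5 means the parallel run takes two thirds of the baseline time.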

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1da5/10280684/68e00b60d9f6/peerj-cs-09-1258-g001.jpg
