Lviv Polytechnic National University, Lviv, 79013, Ukraine.
Sci Rep. 2024 Jun 5;14(1):12947. doi: 10.1038/s41598-024-61776-y.
Modern healthcare generates large volumes of tabular data for monitoring and diagnosing patients' conditions. In addition, modern data engineering methods can synthesize a large number of features from images or signals, and these features are also presented in tabular form. Processing such large volumes of medical data with high precision and at high speed requires artificial intelligence tools. A linear machine learning model cannot analyze such data accurately, and traditional bagging, boosting, or stacking ensembles typically require significant computing power and time. In this paper, the authors propose a method for analyzing large medical datasets, based on a newly designed linear ensemble with a non-iterative learning algorithm. The basic node of the new ensemble is an extended-input SGTM neural-like structure, which provides high-speed data processing at each level of the ensemble. Prediction accuracy is increased by dividing the large dataset into parts, each analyzed in a node of the ensemble structure, and by passing the output signal of the previous level to the next level as an additional attribute. This design provides both a significant increase in prediction accuracy on large medical datasets and a significant reduction in training time. Experiments on a large medical dataset, together with a comparison against existing machine learning methods, confirmed the high efficiency of the developed ensemble structure for the prediction task.
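The cascade described in the abstract (split the data into parts, train one non-iterative linear node per part, and feed each level's output to the next level as an extra attribute) can be illustrated with a minimal sketch. This is not the authors' implementation: the SGTM neural-like structure and its input extension are not reproduced here, and a closed-form ridge regression is swapped in as a stand-in non-iterative linear node. The function names (`fit_ridge`, `predict_ridge`, `cascade_fit`, `cascade_predict`) and the parameters `n_levels` and `alpha` are hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_ridge(X, y, alpha=1e-3):
    """Closed-form (non-iterative) ridge fit; an assumed stand-in for an SGTM node."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    A = Xb.T @ Xb + alpha * np.eye(Xb.shape[1])    # regularized normal equations
    return np.linalg.solve(A, Xb.T @ y)


def predict_ridge(w, X):
    """Apply a fitted node to a feature matrix."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ w


def cascade_predict(weights, X):
    """Pass X through the trained levels, feeding each level's output forward."""
    out = None
    for w in weights:
        # The first level sees the raw features; later levels also see the
        # previous level's output as one additional attribute.
        Z = X if out is None else np.hstack([X, out[:, None]])
        out = predict_ridge(w, Z)
    return out


def cascade_fit(X, y, n_levels=3, alpha=1e-3):
    """Train one node per data part; each node uses the previous level's output."""
    parts = np.array_split(np.arange(X.shape[0]), n_levels)
    weights = []
    for idx in parts:
        Xp = X[idx]
        if weights:
            # Output of the already-trained levels on this part of the data.
            prev = cascade_predict(weights, Xp)
            Xp = np.hstack([Xp, prev[:, None]])
        weights.append(fit_ridge(Xp, y[idx], alpha))
    return weights
```

In this sketch each level is trained once by solving a single linear system, so training cost grows with the size of one data part rather than the whole dataset. Whether this matches the accuracy or timing behaviour reported for the SGTM-based ensemble is not something the sketch can show.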