Gicic Adaleta, Đonko Dženana, Subasi Abdulhamit
Faculty of Electrical Engineering, University of Sarajevo, 71000 Sarajevo, Bosnia and Herzegovina.
Institute of Biomedicine, Faculty of Medicine, University of Turku, 20520 Turku, Finland.
Entropy (Basel). 2024 Sep 12;26(9):783. doi: 10.3390/e26090783.
Although deep learning (DL) algorithms have proven effective in diverse research domains, their application to tabular data remains limited. On tabular data, traditional machine learning models generally outperform DL models, which is largely attributed to the size and structure of tabular datasets and the specific application contexts in which they are used. The primary objective of this paper is therefore to propose a method that exploits the pattern-discovery strengths of stacked bidirectional LSTM (Long Short-Term Memory) networks on tabular data by modeling the data as customized 3D tensors before feeding them to the network. The findings are empirically validated on six diverse, publicly available datasets that vary in size and learning objective. The paper shows that the proposed model, based on time-sequence DL algorithms that are generally described as inadequate for tabular data, yields satisfactory results and competes effectively with algorithms designed specifically for tabular data. An additional benefit of the approach is that it remains simple while training quickly, even on large datasets. Even with extremely small datasets, the models can achieve exceptional predictive results and make full use of their capacity.
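The abstract does not say how the customized 3D tensors are constructed, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes that the columns of each row are split into a fixed number of pseudo time steps, with zero-padding when the column count does not divide evenly. The function name `to_3d_tensor` and the `timesteps` parameter are hypothetical.

```python
import numpy as np

def to_3d_tensor(X, timesteps):
    """Reshape a 2D table (rows, columns) into (rows, timesteps, features_per_step).

    Assumption for illustration only: columns are grouped in their
    existing order; the paper's actual tensor layout may differ.
    """
    X = np.asarray(X, dtype=np.float32)
    n_samples, n_features = X.shape
    # Zero-pad the columns so they split evenly into `timesteps` groups.
    pad = (-n_features) % timesteps
    if pad:
        X = np.pad(X, ((0, 0), (0, pad)), mode="constant")
    return X.reshape(n_samples, timesteps, -1)

# Toy table: 2 rows, 5 columns, split into 3 pseudo time steps.
table = np.arange(10, dtype=np.float32).reshape(2, 5)
tensor = to_3d_tensor(table, timesteps=3)
print(tensor.shape)  # (2, 3, 2)
```

The resulting (samples, timesteps, features) layout is the input shape that recurrent layers such as LSTMs conventionally expect. Setting `timesteps=1` keeps every column in a single step, which is the simplest possible choice under the same assumption.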