Department of Biomedical Engineering, National Cheng Kung University, No. 1, University Road, Tainan 701, Taiwan; Medical Device Innovation Center, National Cheng Kung University, Tainan City 701, Taiwan.
Department of Information Management, National University of Kaohsiung, Kaohsiung University Rd, 811 Kaohsiung, Taiwan.
Comput Biol Chem. 2023 Oct;106:107929. doi: 10.1016/j.compbiolchem.2023.107929. Epub 2023 Jul 22.
Identifying low-prevalence (rare) diseases in their early stages is key to their treatment. Deep learning techniques now provide promising tools for this purpose. However, the low prevalence of rare diseases causes severe class imbalance, which complicates the proper application of deep networks to disease identification. Over the past decades, several balancing methods have been studied to address data imbalance. Unfortunately, none of these methods has been shown to consistently outperform the others. This variation in performance motivates a systematic pipeline, backed by a comprehensive software tool, for improving deep-learning applications in rare disease identification. We reviewed the existing balancing schemes and consolidated them into a systematic deep ensemble pipeline, implemented in a tool called RDDL, for handling data imbalance. Through two real case studies, we showed that this systematic RDDL pipeline tool can improve rare disease identification by lessening the data imbalance problem during model training. The RDDL pipeline tool is available at https://github.com/cobisLab/RDDL/.
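To make the class-imbalance issue concrete, the following minimal sketch illustrates one common balancing scheme, random oversampling of the minority class. It is an illustration only and is not taken from RDDL: the function name `random_oversample`, the toy data, and the 95:5 class ratio are all assumptions chosen for the example, and the abstract does not state which balancing schemes the tool implements.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Hypothetical helper: duplicate randomly chosen samples of the smaller
    classes until every class has as many samples as the largest class."""
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)

    # Every class is grown to the size of the largest class.
    target = max(len(group) for group in by_class.values())

    balanced = []
    for label, group in by_class.items():
        # Draw the missing samples with replacement from the same class.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        balanced.extend((sample, label) for sample in group + extra)

    # Shuffle so that duplicated samples are not clustered together.
    rng.shuffle(balanced)
    new_samples = [sample for sample, _ in balanced]
    new_labels = [label for _, label in balanced]
    return new_samples, new_labels


# Toy data (assumed): 95 common-class samples and 5 rare-class samples.
samples = [[float(i)] for i in range(100)]
labels = [0] * 95 + [1] * 5

bal_samples, bal_labels = random_oversample(samples, labels)
print(Counter(labels))      # class counts before balancing
print(Counter(bal_labels))  # class counts after balancing
```

Oversampling should be applied to the training split only, so that duplicated rare-class samples do not leak into the evaluation data.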