Mohammed Mazin Abed, Abdulkareem Karrar Hameed, Dinar Ahmed M, Zapirain Begonya Garcia
College of Computer Science and Information Technology, University of Anbar, Anbar 31001, Iraq.
eVIDA Lab, University of Deusto, 48007 Bilbao, Spain.
Diagnostics (Basel). 2023 Feb 10;13(4):664. doi: 10.3390/diagnostics13040664.
This study reviews and evaluates the most relevant scientific studies on deep learning (DL) models in the omics field. It also aims to demonstrate the potential of DL techniques in omics data analysis and to identify the key challenges that must be addressed for that potential to be fully realized. Surveying the existing literature reveals several elements that are essential for understanding the studies in this area, such as the clinical applications and datasets they report. The published literature also highlights the difficulties encountered by other researchers. A systematic approach was used to search for all relevant publications on omics and DL using different keyword variants, in addition to other study types such as guidelines, comparative studies, and review papers. The search covered 2018 to 2022 and was conducted on four Internet search engines: IEEE Xplore, Web of Science, ScienceDirect, and PubMed. These indexes were chosen because they offer sufficient coverage and links to numerous papers in the biological field. Inclusion and exclusion criteria were specified, and a total of 65 articles were included in the final list. Of these 65 publications, 42 describe clinical applications of DL in omics data, 16 are review publications based on single- and multi-omics data from the proposed taxonomy, and only a small number (7) focus on comparative analysis and guidelines. The use of DL in studying omics data presents several obstacles related to DL itself, preprocessing procedures, datasets, model validation, and testbed applications, and numerous relevant investigations have been performed to address these issues. Unlike other review papers, our study presents distinct observations on the different areas in which DL models are applied to omics. We believe that the results of this study can serve as a useful guideline for practitioners who are looking for a comprehensive view of the role of DL in omics data analysis.