Department of Biotechnology, National Institute of Technology, Warangal, Telangana, 506004, India.
Funct Integr Genomics. 2023 Apr 21;23(2):134. doi: 10.1007/s10142-023-01064-6.
In the last decade, transcriptome research using next-generation sequencing (NGS) technologies has gathered considerable momentum among functional genomics scientists, particularly among clinical and biomedical research groups. The progressive adoption of NGS technologies has produced an abundance of next-generation transcriptomic data in public databases, harbouring a wealth of new knowledge. Nevertheless, knowledge discovery from these RNA-Seq data requires extensive bioinformatics know-how as well as elaborate data analysis software packages suited to the type and context of the analysis. Several reliability and reproducibility concerns continue to impede RNA-Seq data analysis. Characteristic challenges include data quality, hardware and networking provisions, the selection and prioritisation of data analysis tools and, significantly, the implementation of robust machine learning algorithms to make the fullest use of these experimental transcriptomic data. Over the years, numerous machine learning algorithms, predominantly shallow learning approaches, have been implemented to improve transcriptomic data analysis. More recently, deep learning algorithms have become more mainstream, and their application to next-generation RNA-Seq data analysis could be revolutionary for the biomedical domain in the coming years. In this scoping review, we attempt to determine the size and potential nature of the existing literature on deep learning and NGS RNA-Seq data analysis. Contemporary topics in next-generation RNA-Seq data analysis based on deep learning algorithms are critically reviewed, with an emphasis on open-source resources.