School of Physics and Electronic Information, Gannan Normal University, Ganzhou 341000, China.
School of Mathematics and Computer Science, Gannan Normal University, Ganzhou 341000, China.
Biomolecules. 2024 Nov 14;14(11):1447. doi: 10.3390/biom14111447.
The efficient analysis and interpretation of biological sequence data remain major challenges in bioinformatics. Graphical representation, an emerging and effective visualization technique, offers a more intuitive way to analyze DNA sequences. However, the many existing visualization approaches are scattered across research databases and still need to be organized, integrated, and analyzed systematically. Moreover, no single visualization method excels in all aspects. To advance these methods, knowledge graphs and advanced machine learning techniques have become key areas of exploration. This paper reviews current 2D and 3D DNA sequence visualization methods and proposes a new research direction: constructing knowledge graphs for biological sequence visualization, for which we explain the relevant theories, techniques, and models. We also summarize machine learning techniques applicable to sequence visualization, such as graph embedding methods and convolutional neural networks (CNNs) for processing graphical representations. Together, these machine learning techniques and knowledge graphs aim to provide valuable insights for computational biology, bioinformatics, genomic computing, and evolutionary analysis. The study serves as an important reference for improving intelligent search systems, enriching knowledge bases, and enhancing query systems related to biological sequence visualization, and it offers a comprehensive framework for future research.
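To make the idea of a 2D graphical representation concrete, the following is a minimal sketch of one simple scheme, a walk in the plane in which each nucleotide contributes a unit step. The direction assignment in `STEPS` and the function name `dna_walk` are assumptions chosen for illustration; they are not taken from the paper, and the methods it reviews differ in how they map bases to coordinates.

```python
# Hypothetical base-to-direction mapping (an illustrative assumption,
# not the scheme of any specific method reviewed in the paper).
STEPS = {"A": (-1, 0), "G": (1, 0), "C": (0, 1), "T": (0, -1)}


def dna_walk(sequence):
    """Return the (x, y) points visited by a 2D walk over a DNA sequence."""
    x, y = 0, 0
    points = [(x, y)]  # the walk starts at the origin
    for base in sequence.upper():
        if base not in STEPS:
            continue  # skip ambiguous symbols such as N
        dx, dy = STEPS[base]
        x, y = x + dx, y + dy
        points.append((x, y))
    return points


if __name__ == "__main__":
    # Each point can be plotted as a vertex of the curve for the sequence.
    for point in dna_walk("ATGC"):
        print(point)
```

With this mapping, the walk for `ATGC` returns to the origin, so different sequences can trace overlapping or identical paths. That kind of information loss is one reason a single representation may not suit every analysis.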