Wu Yan, Xie Xiaojun, Zhu Jihong, Guan Lixin, Li Mengshan
School of Mathematics and Computer Science, Gannan Normal University, Ganzhou 341000, China.
Int J Mol Sci. 2025 Jan 8;26(2):477. doi: 10.3390/ijms26020477.
Due to advances in big data technology, deep learning, and knowledge engineering, biological sequence visualization has been extensively explored. In the post-genome era, biological sequence visualization enables the visual representation of both structured and unstructured biological sequence data. However, a universal visualization method for all types of sequences has not been reported. Biological sequence data are expanding exponentially, and the acquisition, extraction, fusion, and inference of knowledge from biological sequences are critical supporting technologies for visualization research that require in-depth exploration. This paper provides a comprehensive overview of visualization methods for DNA sequences from four perspectives (two-dimensional, three-dimensional, four-dimensional, and dynamic visualization approaches) and discusses the strengths and limitations of each method in detail. Furthermore, in response to the challenges of inefficient graphical feature extraction and knowledge association network generation in existing methods, this paper proposes two potential future research directions for biological sequence visualization. The first is the construction of knowledge graphs for biological sequence big data, and the second is the cross-modal visualization of biological sequences using machine learning methods. This review is expected to provide valuable insights for computational biology, bioinformatics, genomic computing, genetic breeding, evolutionary analysis, and related disciplines across biology, medicine, chemistry, statistics, and computing. It also has important reference value for biological sequence recommendation systems and knowledge question answering systems.