Sequence Flow: interactive web application for visualizing partial order alignments.

Affiliation

Institute of Informatics, University of Warsaw, Banacha 2, Warszawa, 02-097, Poland.

Publication information

BMC Genomics. 2024 Oct 16;25(1):973. doi: 10.1186/s12864-024-10886-y.

Abstract

BACKGROUND

Multiple sequence alignment (MSA) has proven extremely useful in computational biology, especially for inferring evolutionary relationships via phylogenetic analysis and for providing insight into protein structure and function. An alternative to the standard MSA model is partial order alignment (POA), in which aligned sequences are represented as paths in a graph rather than rows in a matrix. While the POA model has proven useful in several applications (e.g. sequencing read assembly and pangenome structure exploration), we lack efficient visualization tools that could highlight its advantages.
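To make the "paths in a graph rather than rows in a matrix" idea concrete, here is a minimal Python sketch. It assumes the graph is derived from an already gapped alignment, with one node per (column, residue) pair. This is a simplification for illustration only, not the POA construction algorithm and not the method used by Sequence Flow. The function name `msa_to_poa_graph` and the toy sequences are hypothetical.

```python
from collections import defaultdict


def msa_to_poa_graph(msa):
    """Collapse a gapped alignment into a path graph (illustrative only).

    Nodes are (column, residue) pairs. Each sequence becomes the path of
    nodes it visits, skipping gap columns. Edge weights count how many
    sequences traverse each pair of consecutive nodes.
    """
    paths = {}
    edge_weights = defaultdict(int)
    for name, row in msa.items():
        # Keep only non-gap positions; identical residues in the same
        # column map to the same node, so shared regions merge.
        path = [(col, res) for col, res in enumerate(row) if res != "-"]
        paths[name] = path
        for src, dst in zip(path, path[1:]):
            edge_weights[(src, dst)] += 1
    return paths, dict(edge_weights)


# Hypothetical toy alignment: three rows of equal length, "-" marks a gap.
msa = {
    "seq1": "ACGT",
    "seq2": "A-GT",
    "seq3": "ACCT",
}
paths, edges = msa_to_poa_graph(msa)

for (src, dst), weight in sorted(edges.items()):
    print(src, "->", dst, "weight", weight)
```

In this toy case the gap in `seq2` becomes an edge that bypasses column 1, and the substitution in `seq3` becomes a separate branch. The matrix view can only show these as a gap character and a mismatched column.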

RESULTS

We propose Sequence Flow, a web application designed to address the above problem. Sequence Flow presents the POA as a Sankey diagram, a kind of graph visualization typically used for flow diagrams. Sequence Flow enables interactive alignment exploration, including fragment selection, highlighting of a selected group of sequences, repositioning of graph nodes, and structure simplification. After adjustment, the visualization can be saved as a high-quality graphic file. Because it is built on SanKEY.js, a JavaScript library for creating Sankey diagrams that was designed specifically to visualize POAs, Sequence Flow provides satisfactory performance even with large alignments.
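As a rough illustration of what a Sankey-style view of a POA needs, the sketch below computes two numbers for every link: the number of sequences passing through it (which would set the link width) and how many of those belong to a selected group (which would set the highlighted portion). It is written in Python with hypothetical node labels and a hypothetical `link_flows` helper. It does not use or reproduce the SanKEY.js API, and it is not a description of how Sequence Flow implements highlighting.

```python
from collections import Counter


def link_flows(paths, selected):
    """Return {link: (total, highlighted)} for consecutive node pairs.

    `paths` maps a sequence name to its ordered list of node labels.
    `selected` is the set of sequence names to highlight.
    """
    total = Counter()
    highlighted = Counter()
    for name, path in paths.items():
        for link in zip(path, path[1:]):
            total[link] += 1
            if name in selected:
                highlighted[link] += 1
    # Counter returns 0 for links that no selected sequence traverses.
    return {link: (total[link], highlighted[link]) for link in total}


# Hypothetical paths; labels such as "C2" mean residue C at column 2.
paths = {
    "seq1": ["A1", "C2", "G3", "T4"],
    "seq2": ["A1", "G3", "T4"],
    "seq3": ["A1", "C2", "C3", "T4"],
}
flows = link_flows(paths, selected={"seq1", "seq2"})

for link, (width, marked) in sorted(flows.items()):
    print(link, "width", width, "highlighted", marked)
```

Keeping the total and the highlighted count side by side means a renderer could draw the selected group as a coloured band inside each link without recomputing the graph when the selection changes.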

CONCLUSIONS

We provide Sankey diagram-based POA visualization tools for both end users (Sequence Flow) and bioinformatics software developers (SanKEY.js). The Sequence Flow web service is available at https://sequenceflow.mimuw.edu.pl/. The source code for SanKEY.js is available at https://github.com/Krzysiekzd/SanKEY.js and for Sequence Flow at https://github.com/Krzysiekzd/SequenceFlow.

