University of Wisconsin-Madison, USA.
IEEE Trans Vis Comput Graph. 2011 Dec;17(12):2392-401. doi: 10.1109/TVCG.2011.232.
In this paper, we introduce overview visualization tools for large-scale multiple genome alignment data. Genome alignment visualization and, more generally, sequence alignment visualization are important tools for understanding genomic sequence data. As sequencing techniques improve and more data become available, greater demand is being placed on visualization tools to scale to the size of these new datasets. When viewing such large data, we necessarily cannot convey details; rather, we specifically design overview tools to help elucidate large-scale patterns. Perceptual science, signal processing theory, and generality provide a framework for the design of such visualizations that can scale well beyond current approaches. We present Sequence Surveyor, a prototype that embodies these ideas for scalable multiple whole-genome alignment overview visualization. Sequence Surveyor visualizes sequences in parallel, displaying data using variable color, position, and aggregation encodings. We demonstrate how perceptual science can inform the design of visualization techniques that remain visually manageable at scale and how signal processing concepts can inform aggregation schemes that highlight global trends, outliers, and overall data distributions as the problem scales. These techniques allow us to visualize alignments with over 100 whole bacterial-sized genomes.