Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong.
Bioinformatics. 2014 May 15;30(10):1467-8. doi: 10.1093/bioinformatics/btu038. Epub 2014 Jan 24.
We have implemented ECplot, an online tool for plotting charts from large datasets. This tool supports a variety of chart types commonly used in bioinformatics publications. In our benchmarking, it was able to create a box-and-whisker plot with about 67 000 data points and 8 MB total file size within several seconds. The design of the tool makes common formatting operations easy to perform, and more complex operations can be achieved through advanced XML (Extensible Markup Language) and programming options. Data and formatting styles are stored in separate files, so that style templates can be made and applied to new datasets. The text-based, XML-based file formats facilitate efficient manipulation of formatting styles for a large number of data series. These file formats also provide a means to reproduce published figures from raw data, which complements parallel efforts to make the data and software involved in published analysis results accessible. We demonstrate this idea by using ECplot to replicate some complex figures from a previous publication.
ECplot and its source code (under MIT license) are available at https://yiplab.cse.cuhk.edu.hk/ecplot/.
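As a rough illustration of why keeping styles in a separate, text-based XML file helps when there are many data series, the sketch below applies one shared style template to several series using Python's standard library. The element and attribute names (`style`, `series-default`, `chart-style`, `series`, `color`, `line-width`) and the function `apply_style` are hypothetical and are not taken from ECplot's actual file formats; consult the ECplot documentation for the real schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical style template. This is NOT ECplot's real schema; it only
# illustrates the idea of a reusable style kept apart from the data.
STYLE_TEMPLATE = """
<style>
  <series-default color="#1f77b4" line-width="1.5"/>
</style>
""".strip()


def apply_style(series_names, template_xml):
    """Create one <series> style element per data series from a shared template."""
    template = ET.fromstring(template_xml)
    defaults = dict(template.find("series-default").attrib)
    chart = ET.Element("chart-style")
    for name in series_names:
        # Every series inherits the template defaults; only the name differs.
        ET.SubElement(chart, "series", {"name": name, **defaults})
    return chart


# The same template can be reused for any new dataset by passing its series names.
styled = apply_style(["sample_A", "sample_B", "sample_C"], STYLE_TEMPLATE)
print(ET.tostring(styled, encoding="unicode"))
```

Because the style lives in plain text, changing a single attribute in the template restyles every series at once, which is the kind of bulk manipulation the abstract describes.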