Standardizing the next generation of bioinformatics software development with BioHDF (HDF5).

Affiliation

Geospiza Inc, Seattle, WA 98119, USA.

Publication information

Adv Exp Med Biol. 2010;680:693-700. doi: 10.1007/978-1-4419-5913-3_77.

Abstract

Next Generation Sequencing technologies are limited by the lack of standard bioinformatics infrastructures that can reduce data storage, increase data processing performance, and integrate diverse information. HDF technologies address these requirements and have a long history of use in data-intensive science communities. They include general data file formats, libraries, and tools for working with the data. Compared to emerging standards, such as the SAM/BAM formats, HDF5-based systems demonstrate significantly better scalability, can support multiple indexes, store multiple data types, and are self-describing. For these reasons, HDF5 and its BioHDF extension are well suited for implementing data models to support the next generation of bioinformatics applications.
