Benchmarking database systems for Genomic Selection implementation.

Affiliations

Institute of Biotechnology, Cornell University.

Boyce Thompson Institute.

Publication information

Database (Oxford). 2019 Jan 1;2019. doi: 10.1093/database/baz096.

Abstract

MOTIVATION

With high-throughput genotyping systems now available, it has become feasible to fully integrate genotyping information into breeding programs. Making effective use of this information requires DNA extraction and marker production facilities that can efficiently deploy the desired set of markers across samples, with a turnaround time rapid enough to allow selection before crosses need to be made. In reality, by the time breeders have collected all their phenotyping data and received the corresponding genotyping data, they often have only a short window of time in which to make decisions. This makes it a challenge to organize the information and use it in downstream analyses that support breeders' decisions. Implementing genomic selection routinely as part of a breeding program therefore requires an efficient genotyping data storage system. We selected and benchmarked six popular open-source data storage systems, including relational database management systems and columnar storage systems.

RESULTS

We found that data extraction times are greatly influenced by the orientation in which genotype data are stored in a system. HDF5 consistently performed best, in part because it can work more efficiently with both orientations of the allele matrix.
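To make the orientation effect concrete, here is a minimal sketch in Python. It is not the benchmark code from the paper. It assumes a toy allele matrix with one byte per genotype call, written in row-major order, and it counts seeks as a rough stand-in for I/O cost. The helper names (`write_matrix`, `read_row`, `read_col`) and the data are hypothetical.

```python
import io


def write_matrix(rows):
    """Serialize a matrix of single-byte allele codes in row-major order."""
    return b"".join(bytes(row) for row in rows)


def read_row(buf, n_cols, i):
    """Extract row i with a single seek and one contiguous read."""
    f = io.BytesIO(buf)
    f.seek(i * n_cols)
    return list(f.read(n_cols)), 1  # (values, number of seeks)


def read_col(buf, n_rows, n_cols, j):
    """Extract column j with one seek per row, reading one byte each time."""
    f = io.BytesIO(buf)
    values = []
    for i in range(n_rows):
        f.seek(i * n_cols + j)
        values.append(f.read(1)[0])
    return values, n_rows  # (values, number of seeks)


# Hypothetical toy data: 3 markers (rows) x 4 samples (columns), alleles coded 0/1/2.
markers_by_samples = [
    [0, 1, 2, 0],
    [1, 1, 0, 2],
    [2, 0, 0, 1],
]
n_markers, n_samples = 3, 4

# Marker-major orientation: each marker's calls are contiguous.
marker_major = write_matrix(markers_by_samples)

# Sample-major orientation: the transposed matrix, each sample's calls are contiguous.
samples_by_markers = [list(col) for col in zip(*markers_by_samples)]
sample_major = write_matrix(samples_by_markers)

# All samples for marker 1: contiguous in the marker-major layout.
marker_vals, marker_seeks = read_row(marker_major, n_samples, 1)

# All markers for sample 2: strided in marker-major, contiguous in sample-major.
strided_vals, strided_seeks = read_col(marker_major, n_markers, n_samples, 2)
contig_vals, contig_seeks = read_row(sample_major, n_markers, 2)

print(marker_vals, marker_seeks)
print(strided_vals, strided_seeks)
print(contig_vals, contig_seeks)
```

Both layouts return the same genotypes for sample 2, but the marker-major file needs one seek per marker while the sample-major file needs only one. With realistic marker counts, this gap is the kind of orientation-dependent extraction cost the results describe.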

AVAILABILITY

http://gobiin1.bti.cornell.edu:6083/projects/GBM/repos/benchmarking/browse.
