EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, United Kingdom.
Gigascience. 2012 Jul 12;1(1):2. doi: 10.1186/2047-217X-1-2.
Archives operating under the International Nucleotide Sequence Database Collaboration currently preserve all submitted sequences equally, but rapid increases in the rate of global sequence production will soon require differentiated treatment of DNA sequences submitted for archiving. Here, we propose a graded system in which the ease of reproduction of a sequencing-based experiment and the relative availability of a sample for resequencing define the level of lossy compression applied to stored data.