Bioinformatics and Systems Biology, Justus-Liebig-University Giessen, Giessen, Germany.
Department of Computer and Information Sciences, University of Delaware, Newark, Delaware.
Biotechnol Bioeng. 2018 Aug;115(8):2087-2100. doi: 10.1002/bit.26722. Epub 2018 May 29.
Accurate and complete genome sequences are essential in biotechnology to facilitate genome-based cell engineering efforts. The current genome assemblies for Cricetulus griseus, the Chinese hamster, are fragmented and replete with gap sequences and misassemblies, as is typical of short-read-based assemblies. Here, we completely resequenced C. griseus using single-molecule real-time (SMRT) sequencing and merged the result with Illumina-based assemblies. This generated a more contiguous and complete genome assembly than either technology alone, reducing the number of scaffolds by >28-fold, with 90% of the sequence in the 122 longest scaffolds. Most genes are now found in single scaffolds, including up- and downstream regulatory elements, enabling improved study of noncoding regions. With >95% of the gap sequence filled, important Chinese hamster ovary cell mutations have been detected in regions that were gaps in the draft assembly. This new assembly will be an invaluable resource for continued basic and pharmaceutical research.
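The contiguity figure quoted above ("90% of the sequence in the 122 longest scaffolds") is a count of the kind often reported as L90. As a minimal sketch of how such a number is typically derived from a list of scaffold lengths, the following Python function sorts the lengths in descending order and accumulates them until the chosen fraction of the total is reached. The function name `scaffolds_covering` and the toy lengths are illustrative assumptions, not the authors' pipeline or data.

```python
def scaffolds_covering(lengths, fraction=0.9):
    """Return (count, shortest_length) for the longest scaffolds that
    together cover at least `fraction` of the total assembly length.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    total = sum(lengths)
    target = fraction * total
    running = 0
    # Walk scaffolds from longest to shortest, accumulating length.
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if running >= target:
            # `count` is the Lx-style value, `length` the Nx-style value.
            return count, length
    # Reached only if floating-point rounding leaves the target unmet.
    return len(lengths), min(lengths)


# Toy scaffold lengths (made up, arbitrary units).
toy_lengths = [50, 30, 15, 3, 2]
count, shortest = scaffolds_covering(toy_lengths, fraction=0.9)
print(count, shortest)
```

Applied to a real assembly, the input would be the per-scaffold lengths (for example, read from a FASTA index), and the returned count would be compared across assemblies: a smaller count at the same fraction indicates a more contiguous assembly.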