Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1 Gustave L Levy Pl, New York, NY, 10029, USA.
Material Measurement Laboratory, National Institute of Standards and Technology, 100 Bureau Dr, MS8312, Gaithersburg, MD, 20899, USA.
Sci Data. 2019 Jun 14;6(1):91. doi: 10.1038/s41597-019-0098-2.
Single-molecule long-read sequencing datasets were generated for a son-father-mother trio of Han Chinese descent that is part of the Genome in a Bottle (GIAB) consortium portfolio. The datasets were generated using the Pacific Biosciences Sequel System. The son was sequenced to an average coverage of 60× and each parent to an average coverage of 30×, with N50 subread lengths between 16 and 18 kb. Raw reads and reads aligned to both the GRCh37 and GRCh38 reference genomes are available at the NCBI GIAB FTP site (ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/ChineseTrio/). The GRCh38-aligned read data are archived in NCBI SRA (SRX4739017, SRX4739121, and SRX4739122). This dataset is available for anyone to develop and evaluate long-read bioinformatics methods.
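For readers unfamiliar with the two summary statistics quoted above, the following is a minimal sketch of how an N50 read length and an average coverage are conventionally computed from a list of read lengths. It is not the authors' pipeline: the function names `n50` and `mean_coverage`, the toy read lengths, and the toy genome size are all assumptions made for illustration.

```python
def n50(lengths):
    """Return the N50: the length L such that reads of length >= L
    together contain at least half of all sequenced bases."""
    if not lengths:
        raise ValueError("no read lengths given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def mean_coverage(lengths, genome_size):
    """Return average coverage: total sequenced bases / genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return sum(lengths) / genome_size


# Hypothetical toy values, not taken from the dataset.
toy_lengths = [2_000, 5_000, 9_000, 16_000, 18_000]
toy_genome_size = 10_000

print(n50(toy_lengths))
print(mean_coverage(toy_lengths, toy_genome_size))
```

On real data the read lengths would come from the subread or aligned-read files, and the genome size would be that of the chosen reference assembly.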