Long-read, whole-genome shotgun sequence data for five model organisms.

Affiliations

Pacific Biosciences of California Inc., 1380 Willow Road, Menlo Park, California 94025, USA.

Flinders University, School of Biological Sciences, PO Box 2100, Adelaide, South Australia 5001, Australia.

Publication information

Sci Data. 2014 Nov 25;1:140045. doi: 10.1038/sdata.2014.45. eCollection 2014.

DOI: 10.1038/sdata.2014.45

PMID: 25977796

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC4365909/

Abstract

Single molecule, real-time (SMRT) sequencing from Pacific Biosciences is increasingly used in many areas of biological research including de novo genome assembly, structural-variant identification, haplotype phasing, mRNA isoform discovery, and base-modification analyses. High-quality, public datasets of SMRT sequences can spur development of analytic tools that can accommodate unique characteristics of SMRT data (long read lengths, lack of GC or amplification bias, and a random error profile leading to high consensus accuracy). In this paper, we describe eight high-coverage SMRT sequence datasets from five organisms (Escherichia coli, Saccharomyces cerevisiae, Neurospora crassa, Arabidopsis thaliana, and Drosophila melanogaster) that have been publicly released to the general scientific community (NCBI Sequence Read Archive ID SRP040522). Data were generated using two sequencing chemistries (P4C2 and P5C3) on the PacBio RS II instrument. The datasets reported here can be used without restriction by the research community to generate whole-genome assemblies, test new algorithms, investigate genome structure and evolution, and identify base modifications in some of the most widely-studied model systems in biological research.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7e89/4365909/59c28f3e4e03/sdata201445-f1.jpg
