Fully Phased Sequence of a Diploid Human Genome Determined from the DNA of a Single Individual.

Author information

Soifer Ilya, Fong Nicole L, Yi Nelda, Ireland Andrea T, Lam Irene, Sooknah Matthew, Paw Jonathan S, Peluso Paul, Concepcion Gregory T, Rank David, Hastie Alex R, Jojic Vladimir, Ruby J Graham, Botstein David, Roy Margaret A

Affiliations

Calico Life Sciences LLC, South San Francisco, CA 94080.

Pacific Biosciences, Menlo Park, CA 94025.

Publication information

G3 (Bethesda). 2020 Sep 2;10(9):2911-2925. doi: 10.1534/g3.119.400995.

Abstract

In recent years, improved sequencing technology and computational tools have made genome assembly more accessible. Many approaches, however, generate either an unphased or only partially resolved representation of a diploid genome, in which polymorphisms are detected but not assigned to one or the other of the homologous chromosomes. Yet chromosomal phase information is invaluable for the understanding of phenotypic trait inheritance in the cases of compound heterozygosity, allele-specific expression or cis-acting variants. Here we use a combination of tools and sequencing technologies to generate a diploid assembly of the human primary cell line WI-38. First, data from PacBio single molecule sequencing and Bionano Genomics optical mapping were combined to generate an unphased assembly. Next, 10x Genomics linked reads were combined with the hybrid assembly to generate a partially phased assembly. Lastly, we developed and optimized methods to use short-read (Illumina) sequencing of flow cytometry-sorted metaphase chromosomes to provide phase information. The final genome assembly was almost fully (94%) phased with the addition of approximately 2.5-fold coverage of Illumina data from the sequenced metaphase chromosomes. The diploid nature of the final genome assembly improved the resolution of structural variants between the WI-38 genome and the human reference genome. The phased WI-38 sequence data are available for browsing and download at wi38.research.calicolabs.com. Our work shows that assembling a completely phased diploid genome from the DNA of a single individual is now readily achievable.
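The last step of the abstract, using reads from sorted metaphase chromosomes to supply phase information, can be illustrated with a toy example. The sketch below is not the authors' method. It assumes that the partially phased assembly is given as phase blocks whose heterozygous sites carry a local haplotype 0 allele and a local haplotype 1 allele, and that all observed alleles come from a single sorted homolog. Each block is then assigned to that homolog by a simple majority vote. The function name `orient_blocks`, the data layout, and the variant identifiers are all hypothetical.

```python
from collections import Counter


def orient_blocks(blocks, observations):
    """Decide, per phase block, which local haplotype matches one sorted homolog.

    blocks: {block_id: {variant_id: (allele_on_hap0, allele_on_hap1)}}
    observations: iterable of (variant_id, allele) pairs seen in reads that are
        assumed to come from a single sorted homolog (hypothetical input).
    Returns {block_id: 0, 1, or None}; None means no reads or a tied vote.
    """
    # Map every heterozygous site to the block that contains it.
    variant_to_block = {
        variant_id: block_id
        for block_id, variants in blocks.items()
        for variant_id in variants
    }

    # Count, per block, how many observed alleles support each local haplotype.
    votes = {block_id: Counter() for block_id in blocks}
    for variant_id, allele in observations:
        block_id = variant_to_block.get(variant_id)
        if block_id is None:
            continue  # site is not part of any phase block
        hap0_allele, hap1_allele = blocks[block_id][variant_id]
        if allele == hap0_allele:
            votes[block_id][0] += 1
        elif allele == hap1_allele:
            votes[block_id][1] += 1
        # Alleles matching neither haplotype are ignored as likely errors.

    # Majority vote per block; ties and empty blocks stay unassigned.
    orientation = {}
    for block_id, counts in votes.items():
        if counts[0] > counts[1]:
            orientation[block_id] = 0
        elif counts[1] > counts[0]:
            orientation[block_id] = 1
        else:
            orientation[block_id] = None
    return orientation


# Toy data: three phase blocks and sparse reads from one sorted homolog.
example_blocks = {
    "block1": {"v1": ("A", "G"), "v2": ("C", "T")},
    "block2": {"v3": ("G", "A"), "v4": ("T", "C")},
    "block3": {"v5": ("A", "C")},
}
example_observations = [
    ("v1", "A"), ("v2", "C"),               # both support haplotype 0 of block1
    ("v3", "A"), ("v4", "C"), ("v4", "T"),  # two votes for haplotype 1, one for 0
]

print(orient_blocks(example_blocks, example_observations))
```

Because a block needs only a few agreeing reads to be oriented, a vote of this kind can work at low coverage, which is consistent with the abstract's report that about 2.5-fold coverage was enough to phase 94% of the assembly. A real pipeline would also have to handle sequencing errors, reads of unknown homolog origin, and blocks with no coverage.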
