A graph-based approach to diploid genome assembly.

Author affiliations

Center for Bioinformatics, Saarland University, Saarland Informatics Campus E2.1, Saarbrücken, Germany.

Department of Computational Biology & Applied Algorithmics, Max Planck Institute for Informatics, Saarland Informatics Campus E1.4, Saarbrücken, Germany.

Publication information

Bioinformatics. 2018 Jul 1;34(13):i105-i114. doi: 10.1093/bioinformatics/bty279.

DOI:10.1093/bioinformatics/bty279
PMID:29949989
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6022571/

Abstract

MOTIVATION

Constructing high-quality haplotype-resolved de novo assemblies of diploid genomes is important for revealing the full extent of structural variation and its role in health and disease. Current assembly approaches often collapse the two sequences into one haploid consensus sequence and, therefore, fail to capture the diploid nature of the organism under study. Thus, building an assembler capable of producing accurate and complete diploid assemblies, while being resource-efficient with respect to sequencing costs, is a key challenge to be addressed by the bioinformatics community.

RESULTS

We present a novel graph-based approach to diploid assembly, which combines accurate Illumina data and long-read Pacific Biosciences (PacBio) data. We demonstrate the effectiveness of our method on a pseudo-diploid yeast genome and show that we require as little as 50× coverage Illumina data and 10× PacBio data to generate accurate and complete assemblies. Additionally, we show that our approach has the ability to detect and phase structural variants.
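To put the coverage figures above in concrete terms, the following minimal sketch converts a target mean coverage into total sequenced bases and an approximate read count, using the standard relation coverage = total bases / genome size. The genome size (about 12 Mb for a haploid yeast genome) and the mean read lengths (150 bp and 10,000 bp) are illustrative assumptions, not values taken from the paper.

```python
import math


def bases_needed(genome_size_bp: int, coverage: float) -> int:
    """Total sequenced bases required to reach a mean coverage depth."""
    return round(genome_size_bp * coverage)


def reads_needed(genome_size_bp: int, coverage: float, mean_read_length_bp: int) -> int:
    """Approximate number of reads required, assuming a fixed mean read length."""
    return math.ceil(genome_size_bp * coverage / mean_read_length_bp)


# Assumed values for illustration only (not from the paper).
YEAST_GENOME_BP = 12_000_000      # approximate haploid yeast genome size
SHORT_READ_BP = 150               # hypothetical Illumina read length
LONG_READ_BP = 10_000             # hypothetical mean PacBio read length

if __name__ == "__main__":
    print("Short-read bases at 50x:", bases_needed(YEAST_GENOME_BP, 50))
    print("Long-read bases at 10x:", bases_needed(YEAST_GENOME_BP, 10))
    print("Short reads at 50x:", reads_needed(YEAST_GENOME_BP, 50, SHORT_READ_BP))
    print("Long reads at 10x:", reads_needed(YEAST_GENOME_BP, 10, LONG_READ_BP))
```

Under these assumptions the long-read requirement is a fifth of the short-read requirement in total bases, which is the sense in which a low long-read coverage keeps sequencing costs down.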

AVAILABILITY AND IMPLEMENTATION

https://github.com/whatshap/whatshap.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fa6e/6022571/ca397ea8fbcb/bty279f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fa6e/6022571/1534bcd3efbc/bty279f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fa6e/6022571/18912f7b17ba/bty279f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fa6e/6022571/eaa656babcc5/bty279f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fa6e/6022571/cf8996c7fae7/bty279f5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fa6e/6022571/c5847e5dbd48/bty279f6.jpg
