
HapTree: a novel Bayesian framework for single individual polyplotyping using NGS data.

Authors

Berger Emily, Yorukoglu Deniz, Peng Jian, Berger Bonnie

Affiliations

Department of Mathematics, MIT, Cambridge, Massachusetts, United States of America; Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, Massachusetts, United States of America; Department of Mathematics, UC Berkeley, Berkeley, California, United States of America.

Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, Massachusetts, United States of America.

Publication information

PLoS Comput Biol. 2014 Mar 27;10(3):e1003502. doi: 10.1371/journal.pcbi.1003502. eCollection 2014 Mar.

DOI:10.1371/journal.pcbi.1003502
PMID:24675685
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3967924/
Abstract

As the more recent next-generation sequencing (NGS) technologies provide longer read sequences, the use of sequencing datasets for complete haplotype phasing is fast becoming a reality, allowing haplotype reconstruction of a single sequenced genome. Nearly all previous haplotype reconstruction studies have focused on diploid genomes and are rarely scalable to genomes with higher ploidy. Yet computational investigations into polyploid genomes carry great importance, impacting plant, yeast and fish genomics, as well as the studies of the evolution of modern-day eukaryotes and (epi)genetic interactions between copies of genes. In this paper, we describe a novel maximum-likelihood estimation framework, HapTree, for polyploid haplotype assembly of an individual genome using NGS read datasets. We evaluate the performance of HapTree on simulated polyploid sequencing read data modeled after Illumina sequencing technologies. For triploid and higher ploidy genomes, we demonstrate that HapTree substantially improves haplotype assembly accuracy and efficiency over the state-of-the-art; moreover, HapTree is the first scalable polyplotyping method for higher ploidy. As a proof of concept, we also test our method on real sequencing data from NA12878 (1000 Genomes Project) and evaluate the quality of assembled haplotypes with respect to trio-based diplotype annotation as the ground truth. The results indicate that HapTree significantly improves the switch accuracy within phased haplotype blocks as compared to existing haplotype assembly methods, while producing comparable minimum error correction (MEC) values. A summary of this paper appears in the proceedings of the RECOMB 2014 conference, April 2-5.
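The two evaluation metrics named in the abstract, minimum error correction (MEC) and switch errors, can be illustrated with a minimal Python sketch. It assumes the commonly used definitions: MEC is the total number of read alleles that disagree with each read's best-matching haplotype, and a switch error is a flip in phase relative to the reference between consecutive heterozygous sites of a diploid genome. The function names, the read representation (a mapping from SNP index to a 0/1 allele), and the toy data are hypothetical and are not taken from the HapTree software; the paper's exact scoring may differ.

```python
from typing import Dict, Sequence

# Assumption: a read is a mapping from SNP index to the observed allele (0 or 1).
Read = Dict[int, int]


def mec_score(reads: Sequence[Read], haplotypes: Sequence[Sequence[int]]) -> int:
    """Sum, over reads, of the mismatches to the best-matching haplotype.

    Works for any ploidy: pass one allele sequence per haplotype.
    """
    total = 0
    for read in reads:
        total += min(
            sum(1 for site, allele in read.items() if hap[site] != allele)
            for hap in haplotypes
        )
    return total


def switch_errors(predicted: Sequence[int], truth: Sequence[int]) -> int:
    """Count phase flips between consecutive heterozygous sites (diploid only).

    Each argument lists one haplotype of the pair; the other haplotype is
    assumed to be its complement at every site.
    """
    # True where the predicted haplotype agrees with the reference haplotype.
    agree = [p == t for p, t in zip(predicted, truth)]
    # A switch occurs wherever agreement changes between neighbouring sites.
    return sum(1 for a, b in zip(agree, agree[1:]) if a != b)


# Toy diploid example: the second read carries one allele that fits neither
# haplotype perfectly, so it contributes a single correction.
reads = [{0: 0, 1: 0, 2: 1}, {1: 1, 2: 0, 3: 1}, {2: 1, 3: 1}, {0: 1, 1: 1}]
haplotypes = [[0, 0, 1, 1], [1, 1, 0, 0]]
print(mec_score(reads, haplotypes))                       # 1
print(switch_errors([0, 0, 1, 1, 0], [0, 0, 0, 0, 0]))    # 2
```

Because `mec_score` takes the minimum over an arbitrary list of haplotypes, the same function applies to triploid or higher-ploidy assemblies, whereas `switch_errors` as written is limited to the diploid case.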


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8d3d/3967924/33fac386bd88/pcbi.1003502.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8d3d/3967924/b72a71097613/pcbi.1003502.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8d3d/3967924/4cd805ddf548/pcbi.1003502.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8d3d/3967924/7d28cd3cf8dc/pcbi.1003502.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8d3d/3967924/997bf8ecea80/pcbi.1003502.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8d3d/3967924/58caac646ab3/pcbi.1003502.g006.jpg
