A Comparison between Hi-C and 10X Genomics Linked Read Sequencing for Whole Genome Phasing in Hanwoo Cattle.

Author information

Srikanth Krishnamoorthy, Park Jong-Eun, Lim Dajeong, Cha Jihye, Cho Sang-Rae, Cho In-Cheol, Park Woncheoul

Affiliations

Animal Genomics and Bioinformatics Division, National Institute of Animal Science, RDA, Wanju 55365, Korea.

Hanwoo Research Institute, National Institute of Animal Science, RDA, Pyeongchang 25340, Korea.

Publication information

Genes (Basel). 2020 Mar 20;11(3):332. doi: 10.3390/genes11030332.

Abstract

Until recently, genome-scale phasing was limited by the short read lengths of sequencing data. Although long-read sequencing can overcome this limitation, the reads require extensive error correction. The emergence of technologies such as 10X Genomics linked-read sequencing and Hi-C, which use short-read sequencers together with library preparation protocols that facilitate long-read assemblies, has greatly reduced the complexity of genome-scale phasing. Moreover, these methods make it possible to accurately assemble the phased genomes of individual samples. Therefore, in this study, we compared three phasing strategies, which combined two sample preparation methods with the Long Ranger pipeline of 10X Genomics and the HapCut2 software, namely 10X-LG, 10X-HapCut2, and HiC-HapCut2, and assessed their performance and accuracy. We found that 10X-LG had the best phasing performance among the methods analyzed: it had the highest phasing rate (89.6%), the longest adjusted N50 (1.24 Mb), and the lowest switch error rate (0.07%). Moreover, the phasing accuracy and yield of 10X-LG stayed above 90% for distances up to 4 Mb and 550 Kb, respectively, which were considerably higher than for 10X-HapCut2 and HiC-HapCut2. The results of this study will serve as a good reference for future benchmarking studies and for reference-based imputation in Hanwoo.
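As a rough illustration of two of the metrics named in the abstract, the following Python sketch computes a switch error rate and a plain block N50. It is not the evaluation code used in the study. The function names, the 0/1 haplotype encoding, and the choice to measure N50 against the total phased span are all assumptions made for this example, and the adjustment that turns N50 into the adjusted N50 reported above is not implemented.

```python
from typing import Sequence


def switch_error_rate(truth: str, pred: str) -> float:
    """Fraction of adjacent heterozygous-site pairs whose phase flips.

    Assumption: `truth` and `pred` are equal-length strings of '0'/'1'
    giving the allele carried on haplotype 1 at each heterozygous site
    of a single phase block. A prediction that is the exact complement
    of the truth describes the same phasing and yields 0 switches.
    """
    if len(truth) != len(pred):
        raise ValueError("haplotype strings must have the same length")
    if len(truth) < 2:
        return 0.0
    # True where the predicted allele agrees with the reference haplotype.
    agree = [t == p for t, p in zip(truth, pred)]
    # A switch is a change in agreement between consecutive sites.
    switches = sum(1 for a, b in zip(agree, agree[1:]) if a != b)
    return switches / (len(agree) - 1)


def block_n50(spans: Sequence[int]) -> int:
    """Plain N50 of phase block spans (in bp), NOT the adjusted N50.

    Assumption: N50 is taken relative to the summed span of all blocks:
    the span of the block at which the running total, counted from the
    longest block down, first reaches half of the overall total.
    """
    if not spans:
        return 0
    total = sum(spans)
    running = 0
    for span in sorted(spans, reverse=True):
        running += span
        if 2 * running >= total:
            return span
    return 0  # unreachable for non-empty, non-negative input
```

For example, `switch_error_rate("00000000", "00001111")` counts a single switch among seven adjacent pairs, while a fully complemented prediction counts none.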

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/336d/7140831/bb767b4ecaa3/genes-11-00332-g001.jpg
