Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype.

Affiliations

Lyda Hill Department of Bioinformatics, University of Texas Southwestern Medical Center, Dallas, TX, USA.

Department of Computer Science, Stanford University, Stanford, CA, USA.

Publication information

Nat Biotechnol. 2019 Aug;37(8):907-915. doi: 10.1038/s41587-019-0201-4. Epub 2019 Aug 2.

Abstract

The human reference genome represents only a small number of individuals, which limits its usefulness for genotyping. We present a method named HISAT2 (hierarchical indexing for spliced alignment of transcripts 2) that can align both DNA and RNA sequences using a graph Ferragina Manzini index. We use HISAT2 to represent and search an expanded model of the human reference genome in which over 14.5 million genomic variants in combination with haplotypes are incorporated into the data structure used for searching and alignment. We benchmark HISAT2 using simulated and real datasets to demonstrate that our strategy of representing a population of genomes, together with a fast, memory-efficient search algorithm, provides more detailed and accurate variant analyses than other methods. We apply HISAT2 for HLA typing and DNA fingerprinting; both applications form part of the HISAT-genotype software that enables analysis of haplotype-resolved genes or genomic regions. HISAT-genotype outperforms other computational methods and matches or exceeds the performance of laboratory-based assays.

