Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype.

Affiliations

Lyda Hill Department of Bioinformatics, University of Texas Southwestern Medical Center, Dallas, TX, USA.

Department of Computer Science, Stanford University, Stanford, CA, USA.

Publication information

Nat Biotechnol. 2019 Aug;37(8):907-915. doi: 10.1038/s41587-019-0201-4. Epub 2019 Aug 2.

DOI: 10.1038/s41587-019-0201-4
PMID: 31375807
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7605509/
Abstract

The human reference genome represents only a small number of individuals, which limits its usefulness for genotyping. We present a method named HISAT2 (hierarchical indexing for spliced alignment of transcripts 2) that can align both DNA and RNA sequences using a graph Ferragina Manzini index. We use HISAT2 to represent and search an expanded model of the human reference genome in which over 14.5 million genomic variants in combination with haplotypes are incorporated into the data structure used for searching and alignment. We benchmark HISAT2 using simulated and real datasets to demonstrate that our strategy of representing a population of genomes, together with a fast, memory-efficient search algorithm, provides more detailed and accurate variant analyses than other methods. We apply HISAT2 for HLA typing and DNA fingerprinting; both applications form part of the HISAT-genotype software that enables analysis of haplotype-resolved genes or genomic regions. HISAT-genotype outperforms other computational methods and matches or exceeds the performance of laboratory-based assays.
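The abstract refers to searching with a graph Ferragina Manzini (FM) index. As a rough illustration of the underlying idea only, the sketch below builds a plain FM index over a single linear string and counts pattern occurrences by backward search. It is not HISAT2's graph index and does not model variants or haplotypes; the function names `build_fm_index` and `find` are hypothetical, and the naive suffix sorting is an assumption made for brevity.

```python
from collections import Counter


def build_fm_index(text):
    """Build a toy FM index (suffix array, C table, occurrence table)."""
    text += "$"  # sentinel; assumed absent from text and smaller than A/C/G/T
    # Naive suffix array: acceptable for a sketch, far too slow for genomes.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # Burrows-Wheeler transform: the character preceding each sorted suffix.
    bwt = "".join(text[i - 1] for i in sa)
    # c_table[ch] = number of characters in text strictly smaller than ch.
    counts = Counter(text)
    c_table, total = {}, 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]
    # occ[ch][i] = number of occurrences of ch in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, b in enumerate(bwt):
        for ch in counts:
            occ[ch][i + 1] = occ[ch][i] + (1 if b == ch else 0)
    return sa, c_table, occ


def find(pattern, index):
    """Return sorted start positions of pattern, using backward search."""
    sa, c_table, occ = index
    lo, hi = 0, len(sa)  # half-open range of suffix-array rows
    for ch in reversed(pattern):
        if ch not in c_table:
            return []
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[i] for i in range(lo, hi))


index = build_fm_index("GATTACATTA")
print(find("TTA", index))
```

Each pattern character narrows the suffix-array range with two table lookups, so the search cost depends on the pattern length rather than on the text length. A graph index of the kind the abstract describes extends this idea so that paths through known variants can also be matched.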


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fbe8/7605509/e3a42590f868/nihms-1635086-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fbe8/7605509/6d43f7ee7f58/nihms-1635086-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fbe8/7605509/8a4f5d0ef98c/nihms-1635086-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fbe8/7605509/cf0193eb6973/nihms-1635086-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fbe8/7605509/b93064227bfc/nihms-1635086-f0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fbe8/7605509/6fd3f586452b/nihms-1635086-f0006.jpg
