Reference flow: reducing reference bias using multiple population genomes.

Affiliation

Department of Computer Science, Johns Hopkins University, Baltimore, USA.

Publication information

Genome Biol. 2021 Jan 4;22(1):8. doi: 10.1186/s13059-020-02229-3.

DOI:10.1186/s13059-020-02229-3
PMID:33397413
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7780692/
Abstract

Most sequencing data analyses start by aligning sequencing reads to a linear reference genome, but failure to account for genetic variation leads to reference bias and confounding of results downstream. Other approaches replace the linear reference with structures like graphs that can include genetic variation, incurring major computational overhead. We propose the reference flow alignment method that uses multiple population reference genomes to improve alignment accuracy and reduce reference bias. Compared to the graph aligner vg, reference flow achieves a similar level of accuracy and bias avoidance but with 14% of the memory footprint and 5.5 times the speed.
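
The idea of aligning against several population references can be illustrated with a toy sketch. The abstract gives no algorithmic detail, so everything below is an assumption made for illustration: the two-pass structure, the mismatch threshold `max_mismatches`, the helper names `best_hit` and `two_pass_align`, and the toy sequences. It uses ungapped Hamming-distance matching on equal-length references, so no coordinate conversion is needed. It is not the authors' implementation.

```python
from typing import Dict, Tuple


def best_hit(read: str, ref: str) -> Tuple[int, int]:
    """Return (position, mismatches) of the best ungapped placement of read on ref."""
    best = (-1, len(read) + 1)
    for pos in range(len(ref) - len(read) + 1):
        window = ref[pos:pos + len(read)]
        mism = sum(a != b for a, b in zip(read, window))
        if mism < best[1]:  # strict '<' keeps the leftmost best placement
            best = (pos, mism)
    return best


def two_pass_align(
    read: str,
    primary: str,
    population_refs: Dict[str, str],
    max_mismatches: int = 0,  # hypothetical acceptance threshold
) -> Tuple[str, int, int]:
    """Toy two-pass alignment: try the primary reference first, then the others.

    Returns (reference name, position, mismatches).
    """
    # First pass: accept the read if it matches the primary reference well enough.
    pos, mism = best_hit(read, primary)
    if mism <= max_mismatches:
        return ("primary", pos, mism)

    # Second pass: retry against each population reference and keep the best hit.
    best = ("primary", pos, mism)
    for name, ref in population_refs.items():
        p, m = best_hit(read, ref)
        if m < best[2]:
            best = (name, p, m)
    return best


# Toy data: popA and popB each differ from the primary by one substitution.
primary = "ACGTACGTTAGCCGATAGGCTA"
population_refs = {
    "popA": "ACGTACGTTAGCCGTTAGGCTA",
    "popB": "ACGAACGTTAGCCGATAGGCTA",
}
read_ref_allele = "TAGCCGATAGG"  # carries the primary-reference allele
read_alt_allele = "TAGCCGTTAGG"  # carries the allele present in popA

print(two_pass_align(read_ref_allele, primary, population_refs))
print(two_pass_align(read_alt_allele, primary, population_refs))
```

In this toy setting the read carrying the alternate allele matches imperfectly on the primary reference and is placed without mismatches on `popA`, which is the kind of effect that reduces reference bias.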


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/03f7/7780692/1050f4599421/13059_2020_2229_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/03f7/7780692/64b11b201f0e/13059_2020_2229_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/03f7/7780692/da409a62cc79/13059_2020_2229_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/03f7/7780692/523d94b694fc/13059_2020_2229_Fig4_HTML.jpg
