A novel multi-alignment pipeline for high-throughput sequencing data.

Authors

Huang Shunping, Holt James, Kao Chia-Yu, McMillan Leonard, Wang Wei

Affiliations

Department of Computer Science, University of North Carolina, Chapel Hill, NC 27599, Department of Computer Science, University of California, Los Angeles, CA 90095, USA.

Publication information

Database (Oxford). 2014 Jun 18;2014. doi: 10.1093/database/bau057. Print 2014.

DOI:10.1093/database/bau057
PMID:24948510
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4062837/
Abstract

Mapping reads to a reference sequence is a common step when analyzing allele effects in high-throughput sequencing data. The choice of reference is critical because its effect on quantitative sequence analysis is non-negligible. Recent studies suggest aligning to a single standard reference sequence, as is common practice, can lead to an underlying bias depending on the genetic distances of the target sequences from the reference. To avoid this bias, researchers have resorted to using modified reference sequences. Even with this improvement, various limitations and problems remain unsolved, which include reduced mapping ratios, shifts in read mappings and the selection of which variants to include to remove biases. To address these issues, we propose a novel and generic multi-alignment pipeline. Our pipeline integrates the genomic variations from known or suspected founders into separate reference sequences and performs alignments to each one. By mapping reads to multiple reference sequences and merging them afterward, we are able to rescue more reads and diminish the bias caused by using a single common reference. Moreover, the genomic origin of each read is determined and annotated during the merging process, providing a better source of information to assess differential expression than simple allele queries at known variant positions. Using RNA-seq of a diallel cross, we compare our pipeline with the single-reference pipeline and demonstrate our advantages of more aligned reads and a higher percentage of reads with assigned origins. Database URL: http://csbio.unc.edu/CCstatus/index.py?run=Pseudo.
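The multi-alignment idea in the abstract (one reference per founder, align to each, then merge and annotate each read's origin) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`build_pseudo_reference`, `best_hit`, `merge_alignments`), the brute-force ungapped "aligner", the mismatch threshold, and the toy sequences are all assumptions made for illustration. The sketch also handles SNPs only, so every pseudo-reference shares the coordinates of the original reference; the coordinate shifts caused by indels, which the abstract mentions, are not modeled.

```python
from typing import Dict, Optional, Tuple


def build_pseudo_reference(reference: str, snps: Dict[int, str]) -> str:
    """Return a copy of `reference` with each 0-based SNP position substituted."""
    seq = list(reference)
    for pos, alt in snps.items():
        seq[pos] = alt
    return "".join(seq)


def best_hit(read: str, genome: str, max_mismatches: int) -> Optional[Tuple[int, int]]:
    """Toy ungapped aligner (hypothetical stand-in for a real read mapper).

    Returns (position, mismatches) of the leftmost placement with the fewest
    mismatches, or None if no placement has at most `max_mismatches`.
    """
    best = None
    for start in range(len(genome) - len(read) + 1):
        window = genome[start:start + len(read)]
        mismatches = sum(a != b for a, b in zip(read, window))
        if mismatches <= max_mismatches and (best is None or mismatches < best[1]):
            best = (start, mismatches)
    return best


def merge_alignments(read: str, genomes: Dict[str, str], max_mismatches: int = 1) -> dict:
    """Align `read` to every genome, keep the best alignment, and annotate its origin.

    origin is the founder name if exactly one founder gives the fewest
    mismatches, "ambiguous" if several tie, and None if the read is unmapped.
    """
    hits = {}
    for name, genome in genomes.items():
        hit = best_hit(read, genome, max_mismatches)
        if hit is not None:
            hits[name] = hit
    if not hits:
        return {"position": None, "mismatches": None, "origin": None}
    fewest = min(mm for _, mm in hits.values())
    winners = sorted(name for name, (_, mm) in hits.items() if mm == fewest)
    # With SNP-only pseudo-references the coordinates are shared, so the
    # first winner's position is reported for tied reads as well.
    position = hits[winners[0]][0]
    origin = winners[0] if len(winners) == 1 else "ambiguous"
    return {"position": position, "mismatches": fewest, "origin": origin}


# Toy data: one reference and two hypothetical founders, each with one SNP.
reference = "ACGTTGCAAGGCTTAACCGG"
genomes = {
    "A": build_pseudo_reference(reference, {5: "A"}),
    "B": build_pseudo_reference(reference, {12: "G"}),
}

for read in ["GTTACAAG", "GGCGTAAC", "TAACCGG", "TTTTTTTT"]:
    print(read, merge_alignments(read, genomes))
```

In this toy setting, the read `GTTACAAG` carries founder A's allele. Under exact matching (`max_mismatches=0`) it does not map to the unmodified reference, but it maps to founder A's pseudo-reference and is annotated with origin `A`. A read from a region without variants matches both founders equally well and is labeled `ambiguous`.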

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc97/4062837/19d75ddcf305/bau057f1p.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc97/4062837/f91fa43382b0/bau057f2p.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc97/4062837/d886770786dc/bau057f3p.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc97/4062837/b8718cc3ac6b/bau057f4p.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc97/4062837/ae5de71c9ea0/bau057f5p.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc97/4062837/42d122c2752f/bau057f6p.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc97/4062837/5fdf2234999f/bau057f7p.jpg
