The quantitative impact of read mapping to non-native reference genomes in comparative RNA-Seq studies.

Authors

Price Adam, Gibas Cynthia

Affiliation

Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, North Carolina, United States of America.

Publication information

PLoS One. 2017 Jul 11;12(7):e0180904. doi: 10.1371/journal.pone.0180904. eCollection 2017.

DOI:10.1371/journal.pone.0180904
PMID:28700635
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5507458/
Abstract

Sequence read alignment to a reference genome is a fundamental step in many genomics studies. Accuracy in this fundamental step is crucial for correct interpretation of biological data. In cases where two or more closely related bacterial strains are being studied, a common approach is to simply map reads from all strains to a common reference genome, whether because there is no closed reference for some strains or for ease of comparison. The assumption is that the differences between bacterial strains are insignificant enough that the results of differential expression analysis will not be influenced by choice of reference. Genes that are common among the strains under study are used for differential expression analysis, while the remaining genes, which may fail to express in one sample or the other because they are simply absent, are analyzed separately. In this study, we investigate the practice of using a common reference in transcriptomic analysis. We analyze two multi-strain transcriptomic data sets that were initially presented in the literature as comparisons based on a common reference, but which have available closed genomic sequence for all strains, allowing a detailed examination of the impact of reference choice. We provide a method for identifying regions that are most affected by non-native alignments, leading to false positives in differential expression analysis, and perform an in-depth analysis identifying the extent of expression loss. We also simulate several data sets to identify best practices for non-native reference use.
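
The idea of comparing native and non-native alignments can be illustrated with a minimal sketch. This is not the authors' method; it only assumes that per-gene read counts are already available for the same sample mapped once to its own (native) genome and once to a common (non-native) reference. The function names, the pseudocount, and both thresholds are hypothetical choices made for illustration.

```python
import math


def expression_loss(native_counts, nonnative_counts, pseudocount=1.0):
    """Return the log2 ratio (non-native / native) for genes present in both inputs.

    Strongly negative values mean that reads were lost when the sample was
    mapped to the non-native reference. The pseudocount is an assumed
    convenience that avoids division by zero and log of zero.
    """
    shared = native_counts.keys() & nonnative_counts.keys()
    return {
        gene: math.log2(
            (nonnative_counts[gene] + pseudocount)
            / (native_counts[gene] + pseudocount)
        )
        for gene in shared
    }


def flag_affected_genes(native_counts, nonnative_counts,
                        min_native=10, threshold=-1.0):
    """List shared genes whose counts drop by at least 2-fold (log2 <= -1).

    Genes with fewer than `min_native` reads in the native alignment are
    skipped, because ratios on very low counts are unreliable. Both cutoffs
    are illustrative assumptions, not values from the paper.
    """
    ratios = expression_loss(native_counts, nonnative_counts)
    return sorted(
        gene for gene, ratio in ratios.items()
        if native_counts[gene] >= min_native and ratio <= threshold
    )


# Hypothetical counts for one sample mapped to two references.
native = {"geneA": 100, "geneB": 200, "geneC": 5, "geneD": 80}
nonnative = {"geneA": 98, "geneB": 40, "geneC": 0, "geneE": 10}

print(flag_affected_genes(native, nonnative))
```

Genes flagged this way are candidates for false positives in a cross-strain differential expression test, since their apparent change may reflect mapping loss rather than biology. Genes found in only one input (here `geneD` and `geneE`) are excluded, mirroring the abstract's point that strain-specific genes are analyzed separately.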



