The GATK joint genotyping workflow is appropriate for calling variants in RNA-seq experiments.

Authors

Brouard Jean-Simon, Schenkel Flavio, Marete Andrew, Bissonnette Nathalie

Affiliations

1. Sherbrooke Research and Development Centre, Agriculture and Agri-Food Canada, Sherbrooke, QC J1M 0C8 Canada.

2. Center of Genetic Improvement of Livestock, University of Guelph, Guelph, ON N1G 2W1 Canada.

Publication information

J Anim Sci Biotechnol. 2019 Jun 21;10:44. doi: 10.1186/s40104-019-0359-0. eCollection 2019.

DOI:10.1186/s40104-019-0359-0
PMID:31249686
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6587293/
Abstract

The Genome Analysis Toolkit (GATK) is a popular set of programs for discovering and genotyping variants from next-generation sequencing data. The current GATK recommendation for RNA sequencing (RNA-seq) is to perform variant calling from individual samples, with the drawback that only variable positions are reported. Versions 3.0 and above of GATK offer the possibility of calling DNA variants on cohorts of samples using the HaplotypeCaller algorithm in Genomic Variant Call Format (GVCF) mode. Using this approach, variants are called individually on each sample, generating one GVCF file per sample that lists genotype likelihoods and their genome annotations. In a second step, variants are called from the GVCF files through a joint genotyping analysis. This strategy is more flexible and reduces computational challenges in comparison to the traditional joint discovery workflow. Using a GVCF workflow for mining SNPs in RNA-seq data provides substantial advantages, including reporting homozygous genotypes for the reference allele as well as missing data. Taking advantage of RNA-seq data derived from primary macrophages isolated from 50 cows, the GATK joint genotyping method for calling variants on RNA-seq data was validated by comparing this approach to a so-called "per-sample" method. In addition, pair-wise comparisons of the two methods were performed to evaluate their respective sensitivity, precision and accuracy using DNA genotypes from a companion study including the same 50 cows genotyped using either genotyping-by-sequencing or the Bovine SNP50 Beadchip (imputed to the Bovine high density). Results indicate that both approaches are very close in their capacity to detect reference variants and that the joint genotyping method is more sensitive than the per-sample method. Given that the joint genotyping method is more flexible and technically easier, we recommend this approach for variant calling in RNA-seq experiments.
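
The two-step GVCF workflow described in the abstract can be sketched as a small Python helper that only builds the command lines, without running them. This is an illustrative sketch, not the authors' pipeline: it assumes GATK4-style command syntax (the abstract refers only to GATK 3.0 and above), and the function name `gvcf_workflow_commands`, the file names, and the output directory are hypothetical. RNA-seq-specific preprocessing and variant filtering are omitted.

```python
from pathlib import Path


def gvcf_workflow_commands(reference, bams, out_dir="gvcf_out"):
    """Build (but do not run) command lines for a per-sample GVCF
    step followed by joint genotyping.

    Assumption: GATK4-style syntax; all paths are hypothetical.
    """
    out = Path(out_dir)
    commands = []
    gvcfs = []

    # Step 1: call variants on each sample separately, one GVCF per sample.
    for bam in bams:
        gvcf = out / (Path(bam).stem + ".g.vcf.gz")
        commands.append([
            "gatk", "HaplotypeCaller",
            "-R", str(reference),
            "-I", str(bam),
            "-O", str(gvcf),
            "-ERC", "GVCF",  # emit reference confidence in GVCF mode
        ])
        gvcfs.append(gvcf)

    # Step 2a: merge the per-sample GVCFs into one cohort GVCF.
    cohort_gvcf = out / "cohort.g.vcf.gz"
    combine = ["gatk", "CombineGVCFs", "-R", str(reference)]
    for gvcf in gvcfs:
        combine += ["-V", str(gvcf)]
    combine += ["-O", str(cohort_gvcf)]
    commands.append(combine)

    # Step 2b: joint genotyping across all samples in the cohort.
    commands.append([
        "gatk", "GenotypeGVCFs",
        "-R", str(reference),
        "-V", str(cohort_gvcf),
        "-O", str(out / "cohort.vcf.gz"),
    ])
    return commands


if __name__ == "__main__":
    import shlex

    # Hypothetical inputs: two of the 50 per-animal BAM files.
    for cmd in gvcf_workflow_commands("ref.fa", ["cow01.bam", "cow02.bam"]):
        print(shlex.join(cmd))
```

Because each sample gets its own GVCF in the first step, adding a new animal to the cohort would only require one extra per-sample run before repeating the merge and joint genotyping, which is one way to read the flexibility the abstract attributes to this strategy.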


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c03c/6587293/8e89f5bcaed7/40104_2019_359_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c03c/6587293/97fb2348cada/40104_2019_359_Fig1_HTML.jpg

