A high-throughput SNP discovery strategy for RNA-seq data.

Affiliation

Zhejiang Provincial Key Laboratory of Horticultural Plant Integrative Biology, Zhejiang University, Zijingang Campus, Hangzhou, China.

Publication information

BMC Genomics. 2019 Feb 27;20(1):160. doi: 10.1186/s12864-019-5533-4.

DOI:10.1186/s12864-019-5533-4
PMID:30813897
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6391812/
Abstract

BACKGROUND

Single nucleotide polymorphisms (SNPs) have been applied as important molecular markers in genetics and breeding studies. The rapid advance of next-generation sequencing (NGS) provides a high-throughput means of SNP discovery. However, SNP development is limited by the availability of reliable SNP discovery methods. In particular, the optimal assembler and SNP caller for accurate SNP prediction from NGS data are not known.

RESULTS

We performed SNP prediction on RNA-seq data from peach and mandarin peel tissue, comparing two paired-end read lengths (125 bp and 150 bp), five assemblers (Trinity, IDBA, oases, SOAPdenovo, Trans-abyss) and two SNP callers (GATK and GBS). The predicted SNPs were compared with authentic SNPs identified by PCR amplification followed by gene cloning and sequencing. A total of 40 authentic SNPs were identified in five anthocyanin biosynthesis-related genes in peach, and 240 in nine carotenogenic genes in mandarin. Putative SNPs predicted from the same RNA-seq data with different strategies diverged considerably. The rate of false positive SNPs was significantly lower with a paired-end read length of 150 bp than with 125 bp. Trinity was superior to the other four assemblers, and GATK was substantially superior to GBS because it missed fewer authentic SNPs. The combination of the assembler Trinity, the SNP caller GATK, and a paired-end read length of 150 bp performed best in SNP discovery, with 100% accuracy in both the peach and mandarin cases. This strategy was then applied to characterize SNPs in the peach and mandarin transcriptomes.
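The comparison between predicted and authentic SNPs described above can be sketched as simple set arithmetic. This is a minimal illustration, not the authors' pipeline: the abstract does not define its metrics, so the formulas below (false positives as a fraction of predicted SNPs, missed SNPs as a fraction of authentic SNPs) are assumptions, as are the `Snp` tuple layout, the `evaluate` helper, and the toy gene names and positions.

```python
from dataclasses import dataclass
from typing import Set, Tuple

# Hypothetical SNP key: (gene name, 1-based position, alternate base).
Snp = Tuple[str, int, str]


@dataclass(frozen=True)
class SnpEvaluation:
    true_positives: int   # predicted SNPs confirmed by cloning/sequencing
    false_positives: int  # predicted SNPs absent from the authentic set
    missed: int           # authentic SNPs the strategy failed to predict

    @property
    def false_positive_rate(self) -> float:
        # Assumed definition: share of predicted SNPs that are not authentic.
        predicted = self.true_positives + self.false_positives
        return self.false_positives / predicted if predicted else 0.0

    @property
    def missing_rate(self) -> float:
        # Assumed definition: share of authentic SNPs that were not predicted.
        authentic = self.true_positives + self.missed
        return self.missed / authentic if authentic else 0.0


def evaluate(predicted: Set[Snp], authentic: Set[Snp]) -> SnpEvaluation:
    """Compare one strategy's predicted SNPs against the authentic set."""
    return SnpEvaluation(
        true_positives=len(predicted & authentic),
        false_positives=len(predicted - authentic),
        missed=len(authentic - predicted),
    )


# Toy data for illustration only; these are not values from the paper.
authentic = {("geneA", 101, "T"), ("geneA", 250, "G"),
             ("geneB", 77, "C"), ("geneB", 310, "A")}
predicted = {("geneA", 101, "T"), ("geneA", 250, "G"),
             ("geneB", 77, "C"), ("geneB", 500, "G")}

result = evaluate(predicted, authentic)
print(result.false_positive_rate, result.missing_rate)
```

Under these assumed definitions, a strategy with zero false positives and zero missed SNPs would correspond to the 100% accuracy reported for Trinity-GATK with 150 bp reads.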

CONCLUSIONS

By comparing authentic SNPs obtained by a PCR cloning strategy with putative SNPs predicted from different combinations of five assemblers, two SNP callers, and two paired-end read lengths, we provide a reliable and efficient strategy, Trinity-GATK with 150 bp paired-end reads, for SNP discovery from RNA-seq data. This strategy discovered SNPs at 100% accuracy in the peach and mandarin cases and might be applicable to a wide range of plants and other organisms.
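To give a sense of the size of the comparison, the candidate strategies can be enumerated as the product of the three factors named in the abstract. This sketch assumes a full factorial design; the abstract only says "different combinations", so the count below is an upper bound under that assumption, not a figure from the paper.

```python
from itertools import product

# Factor levels as listed in the abstract.
ASSEMBLERS = ["Trinity", "IDBA", "oases", "SOAPdenovo", "Trans-abyss"]
SNP_CALLERS = ["GATK", "GBS"]
READ_LENGTHS_BP = [125, 150]

# Every (assembler, caller, read length) triple, assuming all were tested.
strategies = list(product(ASSEMBLERS, SNP_CALLERS, READ_LENGTHS_BP))
print(len(strategies))  # 5 * 2 * 2 = 20

# The combination the abstract reports as the best performer.
best = ("Trinity", "GATK", 150)
print(best in strategies)
```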

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2b50/6391812/5e0dd58c7473/12864_2019_5533_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2b50/6391812/1b28b9e4f5eb/12864_2019_5533_Fig2_HTML.jpg
