Raw transcriptomics data to gene specific SSRs: a validated free bioinformatics workflow for biologists.

Affiliations

Agricultural Biotechnology Centre, Faculty of Agriculture, University of Peradeniya, Peradeniya, 20400, Sri Lanka.

Postgraduate Institute of Science, University of Peradeniya, Peradeniya, 20400, Sri Lanka.

Publication information

Sci Rep. 2020 Oct 26;10(1):18236. doi: 10.1038/s41598-020-75270-8.

DOI: 10.1038/s41598-020-75270-8

PMID: 33106560

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7588437/

Abstract

Recent advances in next-generation sequencing technologies have made it possible to generate large amounts of sequencing data at relatively low cost, which has revolutionized genomics and transcriptomics studies. However, handling such data with the available bioinformatics platforms poses new challenges, both in assembly and in the downstream analysis needed to infer correct biological meaning. Although a handful of commercial software tools exist for some of these procedures, their cost is prohibitive for most research laboratories. Individual open-source or free software tools are available for most bioinformatics applications, but these components usually operate standalone and are not combined into a user-friendly workflow. Beginners in bioinformatics may therefore find analysis procedures that start from raw sequence data too complicated and time-consuming, given the associated learning curve. Here, we outline a procedure for de novo transcriptome assembly and Simple Sequence Repeat (SSR) primer design based solely on tools that are available online for free use. To validate the workflow, we used Illumina HiSeq reads from different tissue samples of Santalum album (sandalwood), generated in a previous transcriptomics project. A subset of the designed primers was tested in the laboratory with relevant samples, and all of them successfully amplified the targeted regions. The presented bioinformatics workflow can accurately assemble quality transcriptomes and develop gene-specific SSRs. Beginner biologists and researchers in bioinformatics can readily use this workflow for research purposes.
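
For readers new to SSRs, the following is a minimal sketch of what SSR detection in an assembled transcript involves: finding a short motif repeated in tandem. It is not one of the tools used in the workflow. The function name `find_ssrs` and the minimum repeat thresholds are illustrative assumptions, and only perfect (uninterrupted) repeats of 2 to 4 bases are considered.

```python
import re


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. ATAT, AAA)."""
    n = len(motif)
    for d in range(1, n):
        if n % d == 0 and motif[:d] * (n // d) == motif:
            return False
    return True


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq.

    min_repeats maps motif length to the minimum number of tandem copies.
    The defaults below are hypothetical values chosen for illustration.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5, 4: 4}
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in min_repeats.items():
        # Capture one motif, then require at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if not is_primitive(motif):
                continue  # skip homopolymers and motifs built from a shorter unit
            copies = len(match.group(0)) // motif_len
            hits.append((match.start(), motif, copies))
    return sorted(hits)


if __name__ == "__main__":
    example = "GGC" + "AG" * 7 + "TTC" + "CAG" * 5 + "T"
    print(find_ssrs(example))  # [(3, 'AG', 7), (20, 'CAG', 5)]
```

The reported start positions are 0-based. In a primer design step, the sequence flanking each reported repeat would be passed to a primer design tool; that step is not shown here.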

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/19af/7588437/2edf58cb43de/41598_2020_75270_Fig1_HTML.jpg
