Analysis of Plant Pan-Genomes and Transcriptomes with GET_HOMOLOGUES-EST, a Clustering Solution for Sequences of the Same Species.

Authors

Contreras-Moreira Bruno, Cantalapiedra Carlos P, García-Pereira María J, Gordon Sean P, Vogel John P, Igartua Ernesto, Casas Ana M, Vinuesa Pablo

Affiliations

Estación Experimental de Aula Dei - Consejo Superior de Investigaciones Científicas, Zaragoza, Spain; Fundación ARAID, Zaragoza, Spain.

Estación Experimental de Aula Dei - Consejo Superior de Investigaciones Científicas, Zaragoza, Spain.

Publication information

Front Plant Sci. 2017 Feb 14;8:184. doi: 10.3389/fpls.2017.00184. eCollection 2017.

DOI:10.3389/fpls.2017.00184

PMID:28261241

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5306281/

Abstract

The pan-genome of a species is defined as the union of all the genes and non-coding sequences found in all its individuals. However, constructing a pan-genome for plants with large genomes is daunting both in sequencing cost and the scale of the required computational analysis. A more affordable alternative is to focus on the genic repertoire by using transcriptomic data. Here, the software GET_HOMOLOGUES-EST was benchmarked with genomic and RNA-seq data of 19 Arabidopsis thaliana ecotypes and then applied to the analysis of transcripts from 16 Hordeum vulgare genotypes. The goal was to sample their pan-genomes and classify sequences as core, if detected in all accessions, or accessory, when absent in some of them. The resulting sequence clusters were used to simulate pan-genome growth, and to compile Average Nucleotide Identity matrices that summarize intra-species variation. Although transcripts were found to under-estimate pan-genome size by at least 10%, we concluded that clusters of expressed sequences can recapitulate phylogeny and reproduce two properties observed in gene models: accessory loci show lower expression and higher non-synonymous substitution rates than core genes. Finally, accessory sequences were observed to preferentially encode transposon components in both species, plus disease resistance genes in cultivated barleys, and a variety of protein domains from other families that appear frequently associated with presence/absence variation in the literature. These results demonstrate that pan-genome analyses are useful to explore germplasm diversity.
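
The core/accessory split and the pan-genome growth simulation mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the GET_HOMOLOGUES-EST implementation: the function names, the toy cluster table, and the accession labels A, B and C are all hypothetical, and the sketch assumes that each cluster is already summarized as the set of accessions in which it was detected.

```python
import random

def classify_clusters(clusters, accessions):
    """Split clusters into core (detected in every accession) and accessory."""
    all_accessions = set(accessions)
    core, accessory = [], []
    for name, members in clusters.items():
        if set(members) >= all_accessions:
            core.append(name)
        else:
            accessory.append(name)
    return core, accessory

def pan_genome_growth(clusters, accessions, n_permutations=10, seed=0):
    """Mean number of clusters observed after adding 1..N accessions,
    averaged over random orderings of the accessions."""
    rng = random.Random(seed)
    totals = [0] * len(accessions)
    for _ in range(n_permutations):
        order = list(accessions)
        rng.shuffle(order)
        seen = set()
        for i, accession in enumerate(order):
            seen.add(accession)
            # Count clusters found in at least one accession sampled so far.
            totals[i] += sum(1 for m in clusters.values() if seen & set(m))
    return [t / n_permutations for t in totals]

# Hypothetical toy input: cluster name -> accessions where it was detected.
clusters = {
    "cluster_1": {"A", "B", "C"},
    "cluster_2": {"A", "B"},
    "cluster_3": {"C"},
    "cluster_4": {"A", "B", "C"},
}
accessions = ["A", "B", "C"]

core, accessory = classify_clusters(clusters, accessions)
growth = pan_genome_growth(clusters, accessions)
print("core:", core)
print("accessory:", accessory)
print("mean pan-genome size per number of accessions:", growth)
```

Averaging over several random orderings matters because the size reached after a given number of accessions depends on which accessions happen to be sampled first.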
