Improvement of the banana "Musa acuminata" reference sequence using NGS data and semi-automated bioinformatics methods.

Author information

Martin Guillaume, Baurens Franc-Christophe, Droc Gaëtan, Rouard Mathieu, Cenci Alberto, Kilian Andrzej, Hastie Alex, Doležel Jaroslav, Aury Jean-Marc, Alberti Adriana, Carreel Françoise, D'Hont Angélique

Affiliations

CIRAD (Centre de coopération Internationale en Recherche Agronomique pour le Développement), UMR AGAP, TA A-108/03, Avenue Agropolis, F-34398, Montpellier, cedex 5, France.

Bioversity International, Parc Scientifique Agropolis II, 34397, Montpellier, Cedex 5, France.

Publication information

BMC Genomics. 2016 Mar 16;17:243. doi: 10.1186/s12864-016-2579-4.

DOI:10.1186/s12864-016-2579-4

PMID:26984673

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC4793746/

Abstract

BACKGROUND

Recent advances in genomics indicate that a majority of genome sequences, and their long-range interactions, are functionally significant. Because a detailed examination of genome organization and function requires a very high-quality genome sequence, the objective of this study was to improve the reference genome assembly of banana (Musa acuminata).

RESULTS

We have developed a modular bioinformatics pipeline to improve genome sequence assemblies, which can handle various types of data. The pipeline comprises several semi-automated tools. Unlike classical automated tools, which are based on global parameters, the semi-automated tools offer an expert mode in which the user decides on suggested improvements through local compromises. The pipeline was used to improve the draft genome sequence of Musa acuminata. Genotyping by sequencing (GBS) of a segregating population and paired-end sequencing were used to detect and correct scaffold misassemblies. Long-insert paired-end reads identified scaffold junctions and fusions missed by automated assembly methods. GBS markers were used to anchor scaffolds to pseudo-molecules with a new bioinformatics approach that avoids the tedious step of marker ordering during genetic map construction. Furthermore, a genome map was constructed and used to assemble scaffolds into super-scaffolds. Finally, a consensus gene annotation was projected onto the new assembly from two pre-existing annotations. This approach reduced the total number of Musa scaffolds from 7513 to 1532 (i.e. by 80%) and increased the N50 from 1.3 Mb (65 scaffolds) to 3.0 Mb (26 scaffolds). In total, 89.5% of the assembly was anchored to the 11 Musa chromosomes, compared to the previous 70%. Unknown sites (N) were reduced from 17.3% to 10.0%.
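The assembly statistics quoted above (N50, the number of scaffolds needed to reach it, and the percentage reduction in scaffold count) follow standard definitions. The sketch below is not part of the authors' pipeline. It is a minimal illustration, and the function names `n50` and `percent_reduction` and the toy scaffold lengths are assumptions made for the example.

```python
from typing import Sequence, Tuple


def n50(lengths: Sequence[int]) -> Tuple[int, int]:
    """Return (N50, L50) for a collection of scaffold lengths.

    N50 is the length of the shortest scaffold in the smallest set of
    longest scaffolds that together cover at least half of the assembly.
    L50 is the number of scaffolds in that set.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk scaffolds from longest to shortest until half the assembly is covered.
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if 2 * running >= total:
            return length, count
    raise AssertionError("unreachable for non-empty input")


def percent_reduction(before: int, after: int) -> float:
    """Percentage decrease from `before` to `after`."""
    return 100.0 * (before - after) / before


# Hypothetical toy assembly, not data from the study.
toy_lengths = [100, 80, 60, 40, 20]
print(n50(toy_lengths))  # (80, 2): 100 + 80 = 180 covers at least half of 300

# Scaffold counts reported in the abstract: 7513 before, 1532 after.
print(round(percent_reduction(7513, 1532), 1))  # 79.6, reported as 80%
```

Read this way, "3.0 Mb (26 scaffolds)" means that the 26 longest scaffolds of the new assembly together cover at least half of its total length, and the shortest of those 26 is about 3.0 Mb long.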

CONCLUSION

The release of the Musa acuminata reference genome version 2 provides a platform for detailed analysis of banana genome variation, function and evolution. Bioinformatics tools developed in this work can be used to improve genome sequence assemblies in other species.


Figure 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/71a6/4793746/358dc82018f4/12864_2016_2579_Fig1_HTML.jpg



