GeCKO: user-friendly workflows for genotyping complex genomes using target enrichment capture. A use case on the large tetraploid durum wheat genome.

Authors

Ardisson Morgane, Girodolle Johanna, De Mita Stéphane, Roumet Pierre, Ranwez Vincent

Affiliations

UMR AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, 34398, Montpellier, France.

INRAE, CIRAD, Institut Agro, IRD, PHIM, Université Montpellier, Montpellier, France.

Publication information

Plant Methods. 2024 Jul 13;20(1):103. doi: 10.1186/s13007-024-01210-6.

Abstract

BACKGROUND

Genotyping of individuals plays a pivotal role in many biological analyses. The choice of technology depends on several factors, including genomic constraints, the number of targeted loci and individuals, cost, and the ease of sample preparation and data processing. Target enrichment capture of specific polymorphic regions has emerged as a flexible and cost-effective genome reduction method for genotyping, and it is particularly well suited to very large genomes. However, this approach requires complex bioinformatics processing to extract genotyping data from raw reads. Existing workflows predominantly cater to phylogenetic inference, leaving a gap in user-friendly tools for genotyping analysis based on capture methods. To fill this gap, we developed GeCKO (Genotyping Complexity Knocked-Out). To assess the effectiveness of combining target enrichment capture with GeCKO, we conducted a case study on the domestication history of durum wheat, which involved sequencing, processing, and analyzing variants in four relevant durum wheat groups.

RESULTS

GeCKO encompasses four distinct workflows, each designed for specific steps of genomic data processing: (i) read demultiplexing and trimming for data cleaning, (ii) read mapping to align sequences to a reference genome, (iii) variant calling to identify genetic variants, and (iv) variant filtering. Each workflow in GeCKO can be easily configured and is executable across diverse computational environments. The workflows generate comprehensive HTML reports including key summary statistics and illustrative graphs, ensuring traceable, reproducible results and facilitating straightforward quality assessment. A specific innovation within GeCKO is its 'targeted remapping' feature, specifically designed for efficient treatment of targeted enrichment capture data. This process consists of extracting reads mapped to the targeted regions, constructing a smaller sub-reference genome, and remapping the reads to this sub-reference, thereby enhancing the efficiency of subsequent steps.
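The idea behind targeted remapping can be illustrated with a minimal sketch. It is not GeCKO's implementation: the function names, the contig naming scheme (`chrom:start-end`), the 0-based half-open coordinates, and the toy sequences are all assumptions made for illustration. The sketch merges the regions covered by on-target reads, cuts them out of the full reference to form a small sub-reference, and shows how a position on a sub-reference contig relates to the original genome.

```python
from collections import defaultdict


def merge_regions(regions, padding=0):
    """Merge overlapping or touching (chrom, start, end) intervals.

    Coordinates are assumed to be 0-based and half-open, as in BED files.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in regions:
        by_chrom[chrom].append((max(0, start - padding), end + padding))

    merged = []
    for chrom in sorted(by_chrom):
        cur_start, cur_end = None, None
        for start, end in sorted(by_chrom[chrom]):
            if cur_start is None:
                cur_start, cur_end = start, end
            elif start <= cur_end:
                # Interval overlaps or touches the current one: extend it.
                cur_end = max(cur_end, end)
            else:
                merged.append((chrom, cur_start, cur_end))
                cur_start, cur_end = start, end
        merged.append((chrom, cur_start, cur_end))
    return merged


def build_subreference(genome, regions):
    """Extract each region from the full reference as its own small contig."""
    sub = {}
    for chrom, start, end in regions:
        end = min(end, len(genome[chrom]))  # clip padding at chromosome end
        sub[f"{chrom}:{start}-{end}"] = genome[chrom][start:end]
    return sub


def to_genome_coords(contig_name, pos):
    """Convert a 0-based position on a sub-reference contig to the full genome."""
    chrom, span = contig_name.rsplit(":", 1)
    start = int(span.split("-")[0])
    return chrom, start + pos


# Toy reference and hypothetical regions covered by on-target reads.
genome = {"chr1A": "ACGTACGTACGTACGTACGT", "chr1B": "TTTTGGGGCCCCAAAA"}
covered = [("chr1A", 2, 6), ("chr1A", 5, 10), ("chr1B", 4, 8)]

targets = merge_regions(covered)
sub_reference = build_subreference(genome, targets)
```

Because the sub-reference is far smaller than the full genome, reads remapped to it yield much smaller alignment files for the downstream steps. Encoding the region's origin in the contig name is one simple way to keep positions traceable to the original assembly.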

CONCLUSIONS

The case study results showed the expected levels of intra-group diversity and inter-group differentiation, confirming the method's effectiveness for genotyping and analyzing genetic diversity in species with complex genomes. GeCKO streamlined the data processing, significantly improving computational performance and efficiency. The targeted remapping enabled straightforward SNP calling in durum wheat, a task otherwise complicated by the species' large genome size. These results illustrate the potential of the approach in various biological research contexts.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6f63/11246579/7fe6e37b191b/13007_2024_1210_Fig1_HTML.jpg
