Utility of pooled sequencing for association mapping in nonmodel organisms.

Affiliation

Columbia River Inter-Tribal Fish Commission, Hagerman Fish Culture Experiment Station, Hagerman, Idaho.

Publication information

Mol Ecol Resour. 2018 Jul;18(4):825-837. doi: 10.1111/1755-0998.12784. Epub 2018 Apr 25.

Abstract

High-density genome-wide sequencing increases the likelihood of discovering genes of major effect and genomic structural variation in organisms. While there is an increasing availability of reference genomes across broad taxa, the greatest limitation to whole-genome sequencing of multiple individuals continues to be the costs associated with sequencing. To alleviate excessive costs, pooling multiple individuals with similar phenotypes and sequencing the homogenized DNA (Pool-Seq) can achieve high genome coverage, but at the loss of individual genotypes. Although Pool-Seq has been an effective method for association mapping in model organisms, it has not been frequently utilized in natural populations. To extend bioinformatic tools for rapid implementation of Pool-Seq data in nonmodel organisms, we developed a pipeline called PoolParty and illustrate its effectiveness in genetic association mapping. Alignment expectations based on five pooled Chinook salmon (Oncorhynchus tshawytscha) libraries showed that approximately 48% genome coverage per library could be achieved with reasonable sequencing effort. We additionally examined male and female O. tshawytscha libraries to illustrate how Pool-Seq techniques can successfully map known genes associated with functional differences among sexes such as growth hormone 2. Finally, we compared pools of individuals of different spawning ages for each sex to discover novel genes involved with age at maturity in O. tshawytscha such as opsin4 and transmembrane protein19. While not appropriate for every system, Pool-Seq data processed by the PoolParty pipeline is a practical method for identifying genes of major effect in nonmodel organisms when high genome coverage is necessary and cost is a limiting factor.
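The trade-off described in the abstract, where pooling gives up individual genotypes, can be made concrete with a small sketch. Assuming only per-pool read counts are available at a variant site, one common way to test for association is to estimate each pool's allele frequency from its reads and compare the two pools with Fisher's exact test. This illustrates the general idea and is not taken from the PoolParty pipeline itself. The function names and read counts below are hypothetical.

```python
from math import comb


def allele_frequency(alt_reads, total_reads):
    """Estimate a pool's alternate-allele frequency from read counts."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are pools, columns are alternate and reference read counts.
    The p-value sums the hypergeometric probabilities of every table
    with the same margins that is no more likely than the observed one.
    """
    row1 = a + b
    col1 = a + c
    n = a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(n - row1, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - (n - row1))
    hi = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


if __name__ == "__main__":
    # Hypothetical read counts at one SNP: (alternate, reference) per pool.
    pool_a = (40, 10)
    pool_b = (12, 38)
    freq_a = allele_frequency(pool_a[0], sum(pool_a))
    freq_b = allele_frequency(pool_b[0], sum(pool_b))
    p_value = fisher_exact_two_sided(pool_a[0], pool_a[1], pool_b[0], pool_b[1])
    print(f"pool A freq={freq_a:.2f}, pool B freq={freq_b:.2f}, p={p_value:.3g}")
```

In practice such a test would be run at every variant site and corrected for multiple testing. Read counts only approximate pool allele frequencies, so pool size and sequencing depth both limit precision.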
