Suppr超能文献

新获取的目标:提高从被子植物353探针组中恢复基因座的能力。

New targets acquired: Improving locus recovery from the Angiosperms353 probe set.

作者信息

McLay Todd G B, Birch Joanne L, Gunn Bee F, Ning Weixuan, Tate Jennifer A, Nauheimer Lars, Joyce Elizabeth M, Simpson Lalita, Schmidt-Lebuhn Alexander N, Baker William J, Forest Félix, Jackson Chris J

机构信息

National Herbarium of Victoria Royal Botanic Gardens Victoria Melbourne Australia.

School of Biosciences University of Melbourne Melbourne Australia.

出版信息

Appl Plant Sci. 2021 Jun 14;9(7). doi: 10.1002/aps3.11420. eCollection 2021 Jul.

Abstract

PREMISE

Universal target enrichment kits maximize utility across wide evolutionary breadth while minimizing the number of baits required to create a cost-efficient kit. The Angiosperms353 kit has been successfully used to capture loci throughout the angiosperms, but the default target reference file includes sequence information from only 6-18 taxa per locus. Consequently, reads sequenced from on-target DNA molecules may fail to map to references, resulting in fewer on-target reads for assembly, and reducing locus recovery.

METHODS

We expanded the Angiosperms353 target file, incorporating sequences from 566 transcriptomes to produce a 'mega353' target file, with each locus represented by 17-373 taxa. This mega353 file is a drop-in replacement for the original Angiosperms353 file in HybPiper analyses. We provide tools to subsample the file based on user-selected taxon groups, and to incorporate other transcriptome or protein-coding gene data sets.

RESULTS

Compared to the default Angiosperms353 file, the mega353 file increased the percentage of on-target reads by an average of 32%, increased locus recovery at 75% length by 49%, and increased the total length of the concatenated loci by 29%.

DISCUSSION

Increasing the phylogenetic density of the target reference file results in improved recovery of target capture loci. The mega353 file and associated scripts are available at: https://github.com/chrisjackson-pellicle/NewTargets.

摘要

前提

通用目标富集试剂盒在广泛的进化广度上最大限度地提高了实用性,同时将创建具有成本效益的试剂盒所需的诱饵数量降至最低。被子植物353试剂盒已成功用于捕获整个被子植物的基因座,但默认的目标参考文件每个基因座仅包含6 - 18个分类群的序列信息。因此,从靶向DNA分子测序得到的 reads 可能无法映射到参考序列,导致用于组装的靶向 reads 减少,进而降低基因座回收率。

方法

我们扩展了被子植物353目标文件,纳入了566个转录组的序列,以生成一个“mega353”目标文件,每个基因座由17 - 373个分类群表示。这个 mega353 文件可直接替代HybPiper分析中原来的被子植物353文件。我们提供了工具,可根据用户选择的分类群进行文件抽样,并纳入其他转录组或蛋白质编码基因数据集。

结果

与默认的被子植物353文件相比,mega353文件使靶向 reads 的百分比平均提高了32%,75%长度的基因座回收率提高了49%,串联基因座的总长度增加了29%。

讨论

增加目标参考文件的系统发育密度可提高目标捕获基因座的回收率。mega353文件及相关脚本可在以下网址获取:https://github.com/chrisjackson-pellicle/NewTargets。

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/137a/8312740/9d572de6eb36/APS3-9--g003.jpg

文献AI研究员

20分钟写一篇综述,助力文献阅读效率提升50倍。

立即体验

用中文搜PubMed

大模型驱动的PubMed中文搜索引擎

马上搜索

文档翻译

学术文献翻译模型,支持多种主流文档格式。

立即体验