Department of Evolution, Ecology and Behavior, University of Illinois, Urbana, Illinois, USA.
Mol Ecol Resour. 2021 Feb;21(2):363-378. doi: 10.1111/1755-0998.13163. Epub 2020 May 20.
Restriction-site associated DNA sequencing (RADseq) has become a powerful and versatile tool in modern population genomics, enabling large-scale evolutionary and genomic analyses in otherwise inaccessible biological systems. With its widespread use, different variants of the protocol have been developed to suit specific experimental needs. Researchers face the challenge of choosing the optimal molecular and sequencing protocols for their reduced representation experimental design, an often complicated process. Strategic errors can lead to biased data with reduced power to answer biological questions. Here, we present RADinitio, simulation software for the selection and optimization of RADseq experiments via the generation of sequencing data that behave similarly to empirical sources. RADinitio provides an evolutionary simulation of populations, implementation of various RADseq protocols with customizable parameters, and thorough assessment of missing data. We test the efficacy of the software using different RAD protocols across several organisms, highlighting the importance of protocol selection for the magnitude and quality of data acquired. Additionally, we test the effects of RAD library preparation and sequencing on allelic dropout, observing that library preparation and sequencing often contribute more to missing alleles than population-level variation.
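The population-level source of allelic dropout mentioned above can be illustrated with a minimal Python sketch. This is not RADinitio's implementation or API. It assumes a single-enzyme digest with SbfI (recognition sequence CCTGCAGG) and haplotypes that differ from the reference only by substitutions, so coordinates stay aligned. The function names `find_cut_sites` and `dropped_loci` and the toy sequences are hypothetical, chosen only for illustration.

```python
import re

# SbfI recognition sequence. It is palindromic (equal to its own reverse
# complement), so scanning the forward strand finds every site.
SBFI_SITE = "CCTGCAGG"


def find_cut_sites(sequence, site=SBFI_SITE):
    """Return the 0-based start position of every occurrence of `site`."""
    # A lookahead pattern is used so that overlapping occurrences are reported.
    pattern = re.compile(f"(?={re.escape(site.upper())})")
    return [match.start() for match in pattern.finditer(sequence.upper())]


def dropped_loci(reference, haplotype, site=SBFI_SITE):
    """Return reference cut-site positions that are no longer intact in `haplotype`.

    Assumes `haplotype` has the same length as `reference` (substitutions only).
    A locus whose cut site is disrupted would not be digested, so the alleles
    it carries would be missing from the library (allelic dropout).
    """
    hap = haplotype.upper()
    target = site.upper()
    return [
        pos
        for pos in find_cut_sites(reference, site)
        if hap[pos:pos + len(target)] != target
    ]


# Toy reference with two SbfI sites (hypothetical sequence).
reference = "ATGC" + "CCTGCAGG" + "TTACGGA" + "CCTGCAGG" + "GCTA"

# A haplotype carrying one substitution inside the second cut site.
mutated = reference[:21] + "A" + reference[22:]

print(find_cut_sites(reference))
print(dropped_loci(reference, mutated))
```

In this toy case, the substitution disrupts the second recognition site, so that locus would be reported as dropped for the mutated haplotype. Dropout introduced during library preparation and sequencing, which the abstract reports as often the larger contributor, is not modeled here.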