Gil Juanita, Andrade-Martínez Juan Sebastian, Duitama Jorge
Systems and Computing Engineering Department, Universidad de Los Andes, Bogotá, Colombia.
Research Group on Computational Biology and Microbial Ecology, Department of Biological Sciences, Universidad de Los Andes, Bogotá, Colombia.
Front Genet. 2021 Feb 3;12:624513. doi: 10.3389/fgene.2021.624513. eCollection 2021.
TILLING (Targeting Induced Local Lesions IN Genomes) is a powerful reverse genetics method used in plant functional genomics and breeding to identify mutagenized individuals with an improved phenotype for a trait of interest. Pooled high-throughput sequencing (HTS) of the targeted genes allows efficient identification and sample assignment of variants within genes of interest in hundreds of individuals. Although TILLING has been used successfully in different crops and even applied to natural populations, one of the main obstacles to a successful TILLING experiment is that most currently available bioinformatics tools for variant detection are not designed to identify low-frequency mutations in pooled samples or to perform sample identification from variants detected in overlapping pools. Our research group maintains the Next Generation Sequencing Experience Platform (NGSEP), an open-source solution for the analysis of HTS data. In this manuscript, we present three novel components within NGSEP to facilitate the design and analysis of TILLING experiments: a pooled variants detector, a sample identifier that works from variants detected in overlapping pools, and a simulator of TILLING experiments. A new implementation of the NGSEP calling model for variant detection allows accurate detection of low-frequency mutations within pools. The sample identifier triangulates the mutations called within overlapping pools in order to assign mutations to single individuals whenever possible. Finally, we developed a complete simulator of TILLING experiments to enable benchmarking of different tools and to facilitate the design of experimental alternatives that vary the number of pools and the number of individuals per pool. Simulation experiments based on genes from the common bean genome indicate that NGSEP provides similar accuracy and better efficiency than other tools for pooled variant detection. To the best of our knowledge, NGSEP is currently the only tool that generates individual assignments of the mutations discovered from the pooled data. We expect that this development will be of great use to groups implementing TILLING as an alternative for plant breeding, and also to research groups performing pooled sequencing for other applications.
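The triangulation idea described above can be illustrated with a minimal Python sketch. It is not NGSEP's implementation or interface. It assumes a hypothetical two-dimensional design in which each individual is placed in exactly one row pool and one column pool; the function names (`build_grid_pools`, `triangulate`), the pool labels (`R0`, `C0`, ...) and the variant identifiers are invented for illustration. A variant is assigned to an individual only when every pool containing that individual reported the variant and no other individual satisfies the same condition.

```python
from collections import defaultdict
from itertools import product


def build_grid_pools(n_rows, n_cols):
    """Hypothetical 2D design: individual (r, c) goes into row pool R{r} and column pool C{c}."""
    membership = defaultdict(set)  # individual -> names of the pools that contain it
    for r, c in product(range(n_rows), range(n_cols)):
        membership[(r, c)].update({f"R{r}", f"C{c}"})
    return dict(membership)


def triangulate(variant_calls, membership):
    """Return, for each variant, the sorted list of individuals consistent with the pool calls.

    variant_calls: dict mapping a variant id to the set of pool names where it was called.
    An individual is a candidate carrier only if all of its pools report the variant.
    """
    candidates = {}
    for variant, called_in in variant_calls.items():
        candidates[variant] = sorted(
            individual
            for individual, pools in membership.items()
            if pools <= called_in  # subset test: every pool of this individual has the call
        )
    return candidates


# Illustrative (made-up) calls for a 3 x 4 grid of 12 individuals.
membership = build_grid_pools(3, 4)
calls = {
    "chr1:100:G>A": {"R1", "C2"},              # one row and one column: unique carrier
    "chr1:250:C>T": {"R0", "R2", "C0", "C3"},  # two rows and two columns: ambiguous
    "chr2:42:G>A": {"R1"},                     # missed in every column pool: unassigned
}
result = triangulate(calls, membership)
for variant, individuals in result.items():
    status = "unique" if len(individuals) == 1 else "ambiguous or unassigned"
    print(variant, individuals, status)
```

With this design, a mutation carried by a single individual appears in one row pool and one column pool and is resolved to the individual at their intersection. Mutations shared by several individuals, or missed in one of the pools, cannot be resolved uniquely, which is consistent with the abstract's statement that assignment to single individuals is made whenever possible.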