Martin Darren P, Varsani Arvind, Roumagnac Philippe, Botha Gerrit, Maslamoney Suresh, Schwab Tiana, Kelz Zena, Kumar Venkatesh, Murrell Ben
Department of Integrative Biomedical Sciences, Computational Biology Group, Institute of Infectious Disease and Molecular Medicine, University of Cape Town, Anzio Road Observatory, Cape Town 7549, South Africa.
The Biodesign Center for Fundamental and Applied Microbiomics, Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ 85287-5001, USA.
Virus Evol. 2020 Apr 12;7(1):veaa087. doi: 10.1093/ve/veaa087. eCollection 2021 Jan.
For the past 20 years, the recombination detection program (RDP) project has focused on the development of a fast, flexible, and easy-to-use Windows-based recombination analysis tool. Whereas previous versions of this tool have relied on considerable user-mediated verification of detected recombination events, the latest iteration, RDP5, is automated enough that it can be integrated within analysis pipelines and run without any user input. The main innovation enabling this degree of automation is the implementation of statistical tests to identify recombination signals that could be attributable to evolutionary processes other than recombination. The additional analysis time required for these tests has been offset by algorithmic improvements throughout the program such that, relative to RDP4, RDP5 will still run up to five times faster and be capable of analyzing alignments containing twice as many sequences (up to 5000) that are five times longer (up to 50 million sites). For users wanting to remove signals of recombination from their datasets before using them for downstream phylogenetics-based molecular evolution analyses, RDP5 can disassemble detected recombinant sequences into their constituent parts and output a variety of different recombination-free datasets in an array of different alignment formats. For users who are interested in exploring the recombination history of their datasets, all the manual verification, data management, and data visualization components of RDP5 have been extensively updated to minimize the amount of time needed by users to individually verify and refine the program's interpretation of each of the individual recombination events that it detects.