Ge Ruiquan, Mai Guoqin, Zhang Ruochi, Wu Xundong, Wu Qing, Zhou Fengfeng.
J Integr Bioinform. 2017 Aug 10;14(3):20170029. doi: 10.1515/jib-2017-0029.
Background: A miniature inverted repeat transposable element (MITE) is a short transposable element that carries no protein-coding regions. However, its high proliferation rate and sequence-specific insertion preference make it a good genetic tool for both natural evolution and experimental insertion mutagenesis. Recently active MITE copies are those that show clear signals of Terminal Inverted Repeats (TIRs) and Direct Repeats (DRs) and were recently translocated into their current sites. Their proliferation ability makes them good candidates for the investigation of genomic evolution.

Results: This study optimizes the C++ code and running pipeline of the MITE Uncovering SysTem (MUST) so that users need no prior knowledge of MITEs. The current version, MUSTv2, shows significantly increased detection accuracy for recently active MITEs compared with similar programs. Its running speed is also significantly increased compared with MUSTv1. We prepared a benchmark dataset, a simulated genome with 150 MITE copies, for researchers who may be interested.

Conclusions: MUSTv2 is an accurate detection program for recently active MITE copies and is complementary to the existing template-based MITE mapping programs. We believe that the release of MUSTv2 will greatly facilitate genome annotation and structural analysis for researchers working with bioOMIC big data.
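The structural signature the abstract attributes to recently active MITE copies, a pair of Terminal Inverted Repeats (TIRs) flanked by Direct Repeats (DRs), can be illustrated with a minimal Python sketch. This is not the MUSTv2 algorithm, which is implemented in C++ and is not described in the abstract. The function names, the TIR length of 10, the DR length of 2, and the single allowed TIR mismatch are all assumptions chosen for illustration.

```python
# Illustrative sketch only: NOT the MUSTv2 implementation.
# All names and default thresholds below are assumptions for demonstration.

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (unknown bases become N)."""
    return "".join(_COMPLEMENT.get(base, "N") for base in reversed(seq.upper()))


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def looks_like_mite(genome: str, start: int, end: int,
                    tir_len: int = 10, dr_len: int = 2,
                    max_tir_mismatch: int = 1) -> bool:
    """Check whether genome[start:end] carries a TIR pair flanked by DRs.

    The candidate element occupies genome[start:end]. The left TIR must match
    the reverse complement of the right TIR (up to max_tir_mismatch), and the
    dr_len bases immediately outside each end must be identical.
    """
    # Reject candidates that are too short or lack room for flanking DRs.
    if start - dr_len < 0 or end + dr_len > len(genome):
        return False
    if end - start < 2 * tir_len:
        return False

    element = genome[start:end].upper()
    left_tir = element[:tir_len]
    right_tir = element[-tir_len:]
    tir_ok = hamming(left_tir, reverse_complement(right_tir)) <= max_tir_mismatch

    left_dr = genome[start - dr_len:start].upper()
    right_dr = genome[end:end + dr_len].upper()
    return tir_ok and left_dr == right_dr
```

A candidate can be built for testing by joining a left TIR, an arbitrary internal sequence, and `reverse_complement` of the left TIR, then placing the same short DR on both sides. A real de novo detector would also have to scan the genome for candidate boundaries and filter false positives, which this sketch does not attempt.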