Goubert Clement, Craig Rory J, Bilat Agustin F, Peona Valentina, Vogan Aaron A, Protasio Anna V
Canadian Center for Computational Genomics, McGill University, Montreal, Québec, Canada.
Department of Human Genetics, McGill University, Montreal, Québec, Canada.
Mob DNA. 2022 Mar 30;13(1):7. doi: 10.1186/s13100-021-00259-7.
In the study of transposable elements (TEs), generating a high-confidence set of consensus sequences that represents the diversity of TEs found in a given genome is a key step toward investigating these fascinating genomic elements. Many algorithms and pipelines are available to automatically identify putative TE families present in a genome. Despite the availability of these valuable resources, producing a library of high-quality, full-length TE consensus sequences largely remains a process of manual curation. This know-how is often passed on from mentor to mentee within research groups, making it difficult for those outside the field to access this highly specialised skill.
Our manuscript attempts to fill this gap by providing a set of detailed computer protocols, software recommendations and video tutorials for those aiming to manually curate TEs. Detailed step-by-step protocols, aimed at the complete beginner, are presented in the Supplementary Methods.
The set of programs and tools presented here will make the process of manual curation achievable and accessible to all researchers, and in particular to those new to the field of TEs.