Lin Yunzhi, Ye Chen, Li Xingzhu, Chen Qinyao, Wu Ying, Zhang Feng, Pan Rui, Zhang Sijia, Chen Shuxia, Wang Xu, Cao Shuo, Wang Yingzhen, Yue Yi, Liu Yongsheng, Yue Junyang
College of Life Science, Sichuan University, Chengdu, Sichuan 610064, China.
School of Horticulture, Anhui Agricultural University, Hefei, Anhui 230036, China.
Hortic Res. 2023 Jun 13;10(8):uhad127. doi: 10.1093/hr/uhad127. eCollection 2023 Aug.
A high-quality genome is the basis for functional, evolutionary, and comparative genomics. With the arrival of the 'telomere-to-telomere (T2T) assembly' era, most attention has focused on resolving complex chromosome structures and highly repetitive sequences. However, bioinformatic tools for the automatic construction and/or characterization of T2T genomes remain limited. Here, we developed a user-friendly web toolkit, quarTeT, which currently includes four modules: AssemblyMapper, GapFiller, TeloExplorer, and CentroMiner. First, AssemblyMapper assembles phased contigs into a chromosome-level genome by referring to a closely related genome. Then, GapFiller attempts to fill all unclosed gaps in a given genome with the aid of additional ultra-long sequences. Finally, TeloExplorer and CentroMiner identify candidate telomeres and centromeres and their locations on each chromosome. These four modules can be used alone or in combination for T2T genome assembly and characterization. As a case study, using all modules of quarTeT, we obtained a genome assembly of a quality comparable to the reported genome Hongyang v4.0, which was assembled with additional manual handling. Further evaluation of CentroMiner on two other genomes showed that quarTeT is capable of identifying all the centromeric regions previously detected by experimental methods. Collectively, quarTeT is an efficient toolkit for studies of large-scale T2T genomes and can be accessed at http://www.atcgn.com:8080/quarTeT/home.html without registration.
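To make the telomere-identification step described for TeloExplorer more concrete, the following is a minimal Python sketch of one common way to flag candidate telomeres: counting tandem copies of a telomeric repeat near each end of a chromosome sequence. It is not quarTeT's implementation, which the abstract does not describe. The function names, the `window` and `min_copies` thresholds, and the default motif `TTTAGGG` (the canonical plant-type repeat) are all assumptions made for illustration.

```python
import re


def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def longest_tandem_run(window, motif):
    """Return the largest number of back-to-back copies of motif in window."""
    runs = re.findall(f"(?:{re.escape(motif)})+", window)
    return max((len(run) // len(motif) for run in runs), default=0)


def find_telomeres(seq, motif="TTTAGGG", window=1000, min_copies=4):
    """Report the longest tandem run of the motif at each end of seq.

    Hypothetical helper: the motif, window size, and copy threshold are
    illustrative assumptions, not parameters taken from quarTeT.
    Both the motif and its reverse complement are checked at each end.
    A value of 0 means no candidate telomere passed the threshold.
    """
    seq = seq.upper()
    rc_motif = revcomp(motif)
    result = {}
    for name, end in (("left", seq[:window]), ("right", seq[-window:])):
        copies = max(
            longest_tandem_run(end, motif),
            longest_tandem_run(end, rc_motif),
        )
        result[name] = copies if copies >= min_copies else 0
    return result


if __name__ == "__main__":
    # Toy chromosome: 10 reverse-complement repeats, filler, 6 forward repeats.
    toy = "CCCTAAA" * 10 + "ACGT" * 50 + "TTTAGGG" * 6
    print(find_telomeres(toy, window=100))
```

A chromosome whose two ends both return a nonzero count would be a candidate for being complete from telomere to telomere under these assumed thresholds; real data would need tolerance for imperfect repeats, which this sketch omits.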