Genome Immunobiology RIKEN Hakubi Research Team, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan.
Methods Mol Biol. 2022;2509:353-360. doi: 10.1007/978-1-0716-2380-0_21.
Transposable elements (TEs) are a major source of PIWI-interacting RNAs (piRNAs); therefore, properly assigning piRNA library sequencing reads to the TEs from which they were derived is important for accurate assessment of piRNA biology. When calculating the abundance of small RNA-seq reads mapping to various TEs, a non-overlapping TE annotation is preferable because reads mapping to more than one genomic feature are often excluded during read counting. However, most unmodified TE annotations contain some degree of overlap between TE features. Here, I outline the principle and provide all scripts needed to resolve such overlapping regions of TE annotations to a single best TE annotation, leveraging a computationally efficient tree algorithm. Non-overlapping annotations generated by this method can be used directly in common read-counting software.
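The overlap resolution described above can be illustrated with a minimal sketch. This is not the chapter's own script: the abstract specifies neither the criterion for the "best" annotation nor the tree structure used, so the sketch assumes that each TE feature carries a numeric `score` (hypothetical, higher is better, for example an alignment score) and substitutes a sweep over feature boundaries with a binary heap for the tree algorithm. The names `TE` and `resolve_overlaps` are illustrative only, and coordinates are assumed to be 0-based and half-open, as in BED.

```python
import heapq
from typing import Dict, List, NamedTuple


class TE(NamedTuple):
    chrom: str
    start: int    # 0-based, inclusive (assumed BED-like coordinates)
    end: int      # 0-based, exclusive
    name: str
    score: float  # hypothetical priority: higher score wins an overlap


def resolve_overlaps(features: List[TE]) -> List[TE]:
    """Return non-overlapping segments, each assigned to the best-scoring
    TE covering it. Ties go to the feature that starts first."""
    by_chrom: Dict[str, List[TE]] = {}
    for f in features:
        by_chrom.setdefault(f.chrom, []).append(f)

    out: List[TE] = []
    for chrom in sorted(by_chrom):
        feats = sorted(by_chrom[chrom], key=lambda f: f.start)
        # Every start and end is a breakpoint; the set of covering
        # features cannot change between two consecutive breakpoints.
        points = sorted({p for f in feats for p in (f.start, f.end)})
        heap = []        # entries: (-score, start, index, feature)
        i = 0            # next feature not yet pushed onto the heap
        last_idx = None  # feature index of the last emitted segment
        for left, right in zip(points, points[1:]):
            # Add features that have started by this breakpoint.
            while i < len(feats) and feats[i].start <= left:
                heapq.heappush(heap, (-feats[i].score, feats[i].start, i, feats[i]))
                i += 1
            # Lazily discard top entries that ended at or before it.
            while heap and heap[0][3].end <= left:
                heapq.heappop(heap)
            if not heap:
                last_idx = None  # gap with no TE coverage
                continue
            _, _, idx, best = heap[0]
            if idx == last_idx and out[-1].end == left:
                # Same source feature continues: extend the segment.
                out[-1] = out[-1]._replace(end=right)
            else:
                out.append(TE(chrom, left, right, best.name, best.score))
            last_idx = idx
    return out
```

Under these assumptions, a lower-scoring TE that is partly covered by a higher-scoring one is trimmed, and a TE with a higher-scoring insertion nested inside it is split into two flanking segments. The resulting segments could then be written out as BED or GTF records for a read counter; a real pipeline would need to replace `score` with whatever criterion the chapter actually uses.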