Sexton Corinne E, Han Mira V
School of Life Sciences, University of Nevada, Las Vegas, NV 89154 USA.
Nevada Institute of Personalized Medicine, Las Vegas, NV 89154 USA.
Mob DNA. 2019 Jul 10;10:29. doi: 10.1186/s13100-019-0172-5. eCollection 2019.
Though transposable elements make up around half of the human genome, the repetitive nature of their sequences makes it difficult to accurately align conventional sequencing reads to them. However, advances in sequencing technology, such as increased read length and paired-end libraries, are making these repetitive regions easier to align to. This study investigates the mappability of transposable elements with 50 bp, 76 bp and 100 bp paired-end read libraries. At those read lengths, and allowing for 3 mismatches during alignment, over 68%, 85%, and 88% of all transposable elements in the RepeatMasker database, respectively, are uniquely mappable, suggesting that accurate locus-specific mapping of older transposable elements is well within reach.
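To illustrate why longer reads tend to raise the mappability of repetitive elements, the following is a minimal sketch, not the study's pipeline. It makes several simplifying assumptions: reads are single-end rather than paired-end, they must match exactly rather than with up to 3 mismatches, and only the forward strand is considered. The toy genome, the element coordinates, and the function name `unique_kmer_fraction` are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy genome: two copies of a repeat that differ at one base,
# separated and flanked by simple filler sequence.
COPY_1 = "GATTACAGGCT"
COPY_2 = "GATTACCGGCT"  # same as COPY_1 except for a single A -> C change
GENOME = "CCCCC" + COPY_1 + "AAAAA" + COPY_2 + "TTTTT"

# Half-open coordinates [start, end) of the first repeat copy in GENOME.
ELEMENT_START = 5
ELEMENT_END = ELEMENT_START + len(COPY_1)


def unique_kmer_fraction(genome: str, start: int, end: int, k: int) -> float:
    """Fraction of k-mers lying fully inside genome[start:end] that occur
    exactly once in the whole genome (exact match, forward strand only)."""
    # Count every k-mer in the genome once up front.
    counts = Counter(genome[i:i + k] for i in range(len(genome) - k + 1))
    # Start positions whose k-mer is fully contained in the element.
    starts = range(start, end - k + 1)
    if not starts:
        return 0.0
    unique = sum(1 for i in starts if counts[genome[i:i + k]] == 1)
    return unique / len(starts)


if __name__ == "__main__":
    for k in (3, 5, 7):
        frac = unique_kmer_fraction(GENOME, ELEMENT_START, ELEMENT_END, k)
        print(f"k={k}: {frac:.2f} of element k-mers are unique")
```

In this toy case the unique fraction rises as `k` grows, because a longer read is more likely to span the single base that distinguishes the two copies. Real mappability estimates also depend on mismatch tolerance, pairing, and the aligner used, none of which this sketch models.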