Transposable element subfamily annotation has a reproducibility problem.

Author information

Carey Kaitlin M, Patterson Gilia, Wheeler Travis J

Affiliations

Department of Computer Science, University of Montana, 32 Campus Drive, Missoula, MT, USA.

Institute of Ecology and Evolution, University of Oregon, 272 Onyx Bridge, Eugene, OR, USA.

Publication information

Mob DNA. 2021 Jan 23;12(1):4. doi: 10.1186/s13100-021-00232-4.

Abstract

BACKGROUND

Transposable element (TE) sequences are classified into families based on the reconstructed history of replication, and into subfamilies based on more fine-grained features that are often intended to capture family history. We evaluate the reliability of annotation with common subfamilies by assessing the extent to which subfamily annotation is reproducible in replicate copies created by segmental duplications in the human genome, and in homologous copies shared by human and chimpanzee.

RESULTS

We find that standard methods annotate over 10% of replicates as belonging to different subfamilies, despite the fact that they are expected to be annotated as belonging to the same subfamily. Point mutations and homologous recombination appear to be responsible for some of this discordant annotation (particularly in the young Alu family), but are unlikely to fully explain the annotation unreliability.

CONCLUSIONS

The surprisingly high level of disagreement in subfamily annotation of homologous sequences highlights a need for further research into definition of TE subfamilies, methods for representing subfamily annotation confidence of TE instances, and approaches to better utilizing such nuanced annotation data in downstream analysis.
