T3E: a tool for characterising the epigenetic profile of transposable elements using ChIP-seq data.

Author information

Almeida da Paz Michelle, Taher Leila

Affiliation

Institute of Biomedical Informatics, Graz University of Technology, Graz, Austria.

Publication information

Mob DNA. 2022 Nov 30;13(1):29. doi: 10.1186/s13100-022-00285-z.

Abstract

BACKGROUND

Although Chromatin Immunoprecipitation Sequencing (ChIP-seq) has revolutionised our understanding of the mammalian genome's regulatory landscape, many challenges remain. In particular, because transposable elements (TEs) are repetitive, the sequencing reads derived from them pose a real bioinformatics challenge, to the point that standard analysis pipelines typically ignore reads whose genomic origin cannot be unambiguously ascertained.

RESULTS

We show that discarding ambiguously mapping reads may lead to a systematic underestimation of the number of reads associated with young TE families/subfamilies. We also provide evidence suggesting that randomly permuting the locations of the read mappings (or of the TEs), a strategy often used to compute the background for enrichment calculations at TE families/subfamilies, can result in both false-positive and false-negative enrichments. To address these problems, we present the Transposable Element Enrichment Estimator (T3E), a tool that uses ChIP-seq data to characterise the epigenetic profile of associated TE families/subfamilies. T3E weights the number of read mappings assigned to the individual TE copies of a family/subfamily by the overall number of genomic loci to which the corresponding reads map, and it does so at the single-nucleotide level. In addition, T3E computes ChIP-seq enrichment relative to a background estimated from the distribution of the read mappings in the input control DNA. We demonstrated the capabilities of T3E on 23 different ChIP-seq libraries. T3E identified enrichments that were consistent with previous studies. Furthermore, T3E detected context-specific enrichments that are likely to pinpoint unexplored TE families/subfamilies with individual TE copies that have been frequently exapted as cis-regulatory elements during the evolution of mammalian regulatory networks.
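The weighting idea described above can be illustrated with a minimal Python sketch. It is not taken from the T3E code base: the function names, the tuple-based input format, and the pseudocount are all assumptions made for illustration. Each read mapping contributes 1/(number of loci the read maps to) per overlapping nucleotide of a TE copy, and the per-family totals are then compared between a ChIP sample and an input control. The plain log2 ratio used here is only a stand-in for T3E's background estimation, which the abstract says is derived from the distribution of read mappings in the input control, and it assumes both libraries have already been scaled to comparable depth.

```python
from collections import defaultdict
import math


def weighted_family_signal(mappings, te_copies):
    """Sum multi-mapping-weighted nucleotide coverage per TE family.

    mappings:  iterable of (read_id, chrom, start, end), half-open coordinates
    te_copies: iterable of (chrom, start, end, family), half-open coordinates
    (Both input formats are hypothetical, chosen to keep the sketch small.)
    """
    mappings = list(mappings)

    # Count the genomic loci each read maps to.
    loci_per_read = defaultdict(int)
    for read_id, _chrom, _start, _end in mappings:
        loci_per_read[read_id] += 1

    signal = defaultdict(float)
    for read_id, chrom, start, end in mappings:
        weight = 1.0 / loci_per_read[read_id]
        for te_chrom, te_start, te_end, family in te_copies:
            if te_chrom != chrom:
                continue
            # Number of nucleotides shared by the mapping and the TE copy.
            overlap = min(end, te_end) - max(start, te_start)
            if overlap > 0:
                signal[family] += weight * overlap
    return dict(signal)


def log2_enrichment(chip_signal, input_signal, pseudocount=1.0):
    """Log2 ratio of ChIP to input signal per family (simplified stand-in)."""
    families = set(chip_signal) | set(input_signal)
    return {
        family: math.log2(
            (chip_signal.get(family, 0.0) + pseudocount)
            / (input_signal.get(family, 0.0) + pseudocount)
        )
        for family in families
    }


if __name__ == "__main__":
    # Toy data: read r1 maps to two loci, read r2 maps uniquely.
    example_mappings = [
        ("r1", "chr1", 100, 150),
        ("r1", "chr2", 200, 250),
        ("r2", "chr1", 120, 170),
    ]
    example_tes = [
        ("chr1", 90, 140, "L1"),
        ("chr2", 200, 250, "L1"),
        ("chr1", 150, 200, "Alu"),
    ]
    print(weighted_family_signal(example_mappings, example_tes))
```

The nested loop over TE copies is quadratic and only suitable for toy inputs; a real analysis would need an interval index or a sorted sweep over the annotation.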

CONCLUSIONS

T3E is a novel open-source computational tool (available at https://github.com/michelleapaz/T3E) that overcomes some of the pitfalls associated with the analysis of ChIP-seq data arising from the repetitive mammalian genome and provides a framework to shed light on the epigenetics of entire TE families/subfamilies.
