A histone arginine methylation localizes to nucleosomes in satellite II and III DNA sequences in the human genome.

Affiliation

Department of Bioengineering and Therapeutic Sciences, San Francisco, CA, USA.

Publication information

BMC Genomics. 2012 Nov 15;13:630. doi: 10.1186/1471-2164-13-630.

Abstract

BACKGROUND

Applying supervised learning/classification techniques to epigenomic data may reveal properties that differentiate histone modifications. Previous analyses sought to classify nucleosomes containing histone H2A/H4 arginine 3 symmetric dimethylation (H2A/H4R3me2s) or H2A.Z using human CD4+ T-cell chromatin immunoprecipitation sequencing (ChIP-Seq) data. However, these efforts achieved only modest accuracy and yielded limited biological interpretation. Here, we investigate the impact of using appropriate data pre-processing (deduplication, normalization, and position (peak) finding to identify stable nucleosome positions) in conjunction with advanced classification algorithms, notably discriminatory motif feature selection and random forests. Performance assessments are based on accuracy and interpretative yield.
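As a rough illustration of two of the pre-processing steps named above, the following minimal sketch deduplicates reads and normalizes binned read counts to reads per million. The abstract does not describe the authors' actual pipeline, so the read representation (`(chrom, start, strand)` tuples), the 200 bp bin size, and the function names `deduplicate` and `reads_per_million` are all assumptions made for this example.

```python
from collections import Counter


def deduplicate(reads):
    """Keep one read per (chrom, start, strand) tuple, preserving first-seen order.

    Assumption: reads are plain tuples; real pipelines work on BAM/BED records.
    """
    seen = set()
    unique = []
    for read in reads:
        if read not in seen:
            seen.add(read)
            unique.append(read)
    return unique


def reads_per_million(reads, bin_size=200):
    """Count reads in fixed-width bins and scale the counts to reads per million.

    Assumption: a 200 bp bin is used purely for illustration.
    """
    total = len(reads)
    if total == 0:
        return {}
    counts = Counter((chrom, start // bin_size) for chrom, start, _ in reads)
    return {key: n * 1_000_000 / total for key, n in counts.items()}


# Toy input: the second read is an exact duplicate of the first.
reads = [
    ("chr1", 100, "+"),
    ("chr1", 100, "+"),
    ("chr1", 150, "-"),
    ("chr1", 450, "+"),
]
unique_reads = deduplicate(reads)
rpm = reads_per_million(unique_reads)
```

Scaling by library size in this way makes counts from ChIP-Seq libraries of different sequencing depth comparable before they are used as classifier features.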

RESULTS

We achieved dramatically improved accuracy using histone modification features (99.0%; previous attempts, 68.3%) and DNA sequence features (94.1%; previous attempts, <60%). Furthermore, the algorithms elicited interpretable features that withstand permutation testing, including: the histone modifications H4K20me3 and H3K9me3, which are components of heterochromatin; and the motif TCCATT, which is part of the consensus sequence of satellite II and III DNA. Downstream analysis demonstrates that satellite II and III DNA in the human genome is occupied by stable nucleosomes containing H2A/H4R3me2s, H4K20me3, and/or H3K9me3, but not 18 other histone methylations. These results are consistent with the recent biochemical finding that H4R3me2s provides a binding site for the DNA methyltransferase (Dnmt3a) that methylates satellite II and III DNA.
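To make the idea of a motif feature that "withstands permutation testing" concrete, here is a minimal sketch that counts a motif on both strands and runs a label-permutation test on the difference in mean counts between two classes. It is not the authors' method: the test statistic, the one-sided comparison, the function names, and the toy sequences are assumptions chosen for illustration.

```python
import random

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA sequence (uppercase ACGT only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_motif(seq, motif):
    """Count overlapping occurrences of the motif or its reverse complement."""
    rc = revcomp(motif)
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] in (motif, rc))


def permutation_p_value(pos, neg, motif, n_perm=1000, seed=0):
    """One-sided permutation test on the difference in mean motif counts.

    Class labels are shuffled n_perm times; the p-value is the fraction of
    shuffles whose difference is at least as large as the observed one
    (with a +1 correction so the estimate is never exactly zero).
    """
    rng = random.Random(seed)
    counts = [count_motif(s, motif) for s in pos + neg]
    n_pos = len(pos)
    n_neg = len(neg)

    def mean_diff(values):
        return sum(values[:n_pos]) / n_pos - sum(values[n_pos:]) / n_neg

    observed = mean_diff(counts)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(counts)  # shuffling counts is equivalent to shuffling labels
        if mean_diff(counts) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy sequences; real input would be nucleosome-length DNA.
positive = ["TCCATTCCATTG", "AATGGAATGGAA"]
negative = ["ACGTACGTACGT", "GGGGCCCCAAAA"]
observed, p_value = permutation_p_value(positive, negative, "TCCATT")
```

With only four toy sequences the p-value cannot be small; a meaningful test needs many nucleosome sequences per class.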

CONCLUSIONS

Classification algorithms applied to appropriately pre-processed ChIP-Seq data can accurately discriminate between histone modifications. Algorithms that facilitate interpretation, such as discriminatory motif feature selection, have the added potential to impart information about underlying biological mechanism.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/02b9/3559892/eaf17663780a/1471-2164-13-630-1.jpg
