Prediction of regulatory elements in mammalian genomes using chromatin signatures.

Authors

Won Kyoung-Jae, Chepelev Iouri, Ren Bing, Wang Wei

Affiliation

Dept of Chemistry & Biochemistry, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0359, USA.

Publication

BMC Bioinformatics. 2008 Dec 18;9:547. doi: 10.1186/1471-2105-9-547.

Abstract

BACKGROUND

Recent genome-scale surveys of epigenetic states in mammalian genomes have shown that promoters and enhancers are correlated with distinct chromatin signatures, providing a pragmatic way to map these regulatory elements systematically across the genome. With the rapid accumulation of chromatin modification profiles for various organisms and cell types, this chromatin-based approach promises to uncover many new regulatory elements, but computational methods for effectively extracting information from these datasets are still limited.

RESULTS

We present a supervised learning method that predicts promoters and enhancers from their characteristic chromatin modification signatures. We trained hidden Markov models (HMMs) on histone modification data for known promoters and enhancers, and then used the trained HMMs to identify promoter-like or enhancer-like sequences in the human genome. Using a simulated annealing (SA) procedure, we searched for the most informative combination of histone marks and the optimal window size.
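The SA search over mark combinations and window sizes can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not taken from the paper: the mark names, the candidate window sizes, the cooling schedule, and above all `toy_score`, which stands in for whatever cross-validated accuracy the trained HMMs would actually report.

```python
import math
import random

# Hypothetical search space; the paper's actual marks and windows may differ.
MARKS = ["H3K4me1", "H3K4me3", "H3K27ac", "H3K9me3", "H3K36me3"]
WINDOWS = [1000, 2000, 4000, 6000, 8000, 10000]  # window sizes in bp


def toy_score(marks, window):
    """Placeholder objective (assumption). A real run would train HMMs on
    the chosen marks and window and return a cross-validation score."""
    informative = {"H3K4me1", "H3K4me3", "H3K27ac"}
    hits = len(marks & informative)
    noise = len(marks - informative)
    return hits - 0.5 * noise - abs(window - 4000) / 4000.0


def neighbor(marks, window, rng):
    """Propose a nearby state: toggle one mark or shift the window one step."""
    if rng.random() < 0.5:
        toggled = set(marks) ^ {rng.choice(MARKS)}
        if not toggled:  # never allow an empty mark set
            toggled = set(marks)
        return frozenset(toggled), window
    i = WINDOWS.index(window)
    j = min(max(i + rng.choice([-1, 1]), 0), len(WINDOWS) - 1)
    return marks, WINDOWS[j]


def anneal(score, steps=2000, t_start=1.0, t_end=0.01, seed=0):
    """Maximize `score` with simulated annealing and geometric cooling."""
    rng = random.Random(seed)
    state = (frozenset([rng.choice(MARKS)]), rng.choice(WINDOWS))
    current = score(*state)
    best_state, best = state, current
    for k in range(steps):
        t = t_start * (t_end / t_start) ** (k / max(steps - 1, 1))
        cand = neighbor(state[0], state[1], rng)
        cand_score = score(*cand)
        delta = cand_score - current
        # Always accept improvements; accept worse moves with prob exp(delta/t).
        if delta >= 0 or rng.random() < math.exp(delta / t):
            state, current = cand, cand_score
            if current > best:
                best_state, best = state, current
    return best_state, best


if __name__ == "__main__":
    (marks, window), value = anneal(toy_score)
    print(sorted(marks), window, value)
```

Accepting some score-lowering moves early on, while the temperature is high, is what lets the search leave a poor combination of marks instead of stalling there; as the temperature falls the procedure behaves more and more like greedy hill climbing.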

CONCLUSION

Compared with previous methods, the HMM method can capture complex patterns of histone modifications, particularly from weak signals. Cross-validation and scans of the ENCODE regions showed that our method outperforms the previous profile-based method in mapping promoters and enhancers. We also showed that including more histone marks further improves performance, which suggests that the HMM is robust and capable of integrating information from multiple histone marks. To further demonstrate the usefulness of our method, we applied it to genome-wide ChIP-Seq data from three mouse cell lines and correctly predicted active and inactive promoters with positive predictive values of more than 80%. The software is available at http://nash.ucsd.edu/chromatin.tar.gz.
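For readers unfamiliar with the metric, positive predictive value (PPV) is the fraction of predictions that are correct, TP / (TP + FP). The sketch below shows one way it could be computed for predicted promoter positions on a single chromosome. The matching rule (a prediction counts as correct if an annotated site lies within `tolerance` bp) and the default tolerance are assumptions for illustration; the abstract does not state the criterion the authors used.

```python
import bisect


def positive_predictive_value(predicted, known, tolerance=2500):
    """Fraction of predicted positions lying within `tolerance` bp of any
    known site. `tolerance` is a hypothetical matching distance."""
    if not predicted:
        return float("nan")
    known_sorted = sorted(known)
    true_pos = 0
    for pos in predicted:
        # First known site at or to the right of (pos - tolerance).
        i = bisect.bisect_left(known_sorted, pos - tolerance)
        if i < len(known_sorted) and known_sorted[i] <= pos + tolerance:
            true_pos += 1
    return true_pos / len(predicted)


if __name__ == "__main__":
    # Made-up coordinates, purely to exercise the function.
    predicted = [1000, 5000, 20000, 40000, 90000]
    known = [1200, 21000, 39000, 60000]
    print(positive_predictive_value(predicted, known))
```

Note that PPV says nothing about annotated sites that were missed; that is measured separately by sensitivity.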
