Moser Carlee, Gupta Mayetri
Boston University.
Stat Appl Genet Mol Biol. 2012 Jan 6;11(2):/j/sagmb.2012.11.issue-2/1544-6115.1707/1544-6115.1707.xml. doi: 10.2202/1544-6115.1707.
Chromatin structure, in terms of the positioning of nucleosomes and nucleosome-free regions along the DNA, has been found to have a major impact on various cell functions and processes, ranging from transcriptional regulation to growth and development. Although numerous experimental and computational approaches have been developed in the past few years to determine the intrinsic relationship between chromatin structure (nucleosome positioning) and DNA sequence features, there is as yet no universally accurate approach for predicting nucleosome positioning from the underlying DNA sequence alone. Here we propose an alternative approach to predicting nucleosome positioning from sequence that makes use of characteristic sequence differences and of inherent dependencies in overlapping sequence features. Our nucleosome positioning prediction algorithm, based on the idea of generalized hierarchical hidden Markov models (HGHMMs), was used to predict nucleosomal state from the DNA sequence of yeast chromosome III and was compared with two existing methods. The HGHMM method performed favorably among the three models in terms of specificity and sensitivity, and its estimates were largely consistent with predictions from the method of Yuan and Liu (2008). However, all three methods still give higher than desirable misclassification rates, indicating that sequence-based features may provide only limited information about the positioning of nucleosomes. The method is implemented in the open-source statistical software R and is freely available from the authors' website.
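The comparison above is stated in terms of sensitivity, specificity, and misclassification rate for per-position state calls. As a minimal sketch of how such figures could be computed, assuming each base is labeled either nucleosomal or linker, the Python snippet below scores a predicted labeling against a reference one. The function name, the single-character label encoding ("N" for nucleosome, "L" for linker), and the toy strings are illustrative assumptions; they are not taken from the authors' R implementation or data.

```python
def sensitivity_specificity(truth, predicted, positive="N"):
    """Score per-position state calls against a reference labeling.

    `truth` and `predicted` are equal-length sequences of labels.
    The label encoding ("N" = nucleosome, "L" = linker) is a
    hypothetical convention chosen for this sketch.
    """
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")

    tp = fn = tn = fp = 0
    for t, p in zip(truth, predicted):
        if t == positive:
            if p == positive:
                tp += 1  # nucleosomal position called nucleosomal
            else:
                fn += 1  # nucleosomal position missed
        else:
            if p == positive:
                fp += 1  # linker position wrongly called nucleosomal
            else:
                tn += 1  # linker position called linker

    # Return NaN when a class is absent, so the ratio is undefined.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy labels, invented for illustration only.
truth = "NNNNLLLLNN"
predicted = "NNNLLLNLNN"

sens, spec = sensitivity_specificity(truth, predicted)
misclassification = sum(t != p for t, p in zip(truth, predicted)) / len(truth)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} "
      f"misclassification={misclassification:.3f}")
```

In this toy case, 5 of the 6 nucleosomal positions and 3 of the 4 linker positions are recovered, and 2 of the 10 positions are misclassified.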