Department of Statistics and Probability Theory, Vienna University of Technology, Vienna, Austria.
PLoS One. 2012;7(12):e51480. doi: 10.1371/journal.pone.0051480. Epub 2012 Dec 19.
MicroRNAs (miRs) are known to play an important role in mRNA regulation, often by binding to complementary sequences in "target" mRNAs. Recently, several methods have been developed that combine existing sequence-based target predictions with miR and mRNA expression data to infer true miR-mRNA targeting relationships. It has been shown that combining these two approaches gives more reliable results than either alone. While a few such algorithms give excellent results, none fully addresses expression data sets with a natural ordering of the samples. If the samples in an experiment can be ordered or partially ordered by their expected similarity to one another, as in time series or in studies of developmental processes, stages, or types (e.g. cell type, disease, growth, aging), there are unique opportunities to infer miR-mRNA interactions that may be specific to the underlying processes, and existing methods do not exploit them. We propose an algorithm that specifically addresses (partially) ordered expression data and takes advantage of sample similarities based on the ordering structure. This is done within a Bayesian framework, which yields posterior distributions, and therefore statistical significance, for each model parameter and latent variable. We apply our model to a previously published expression data set of paired miR and mRNA arrays in five partially ordered conditions, with biological replicates, related to multiple myeloma, and we show how considering potential orderings can improve the inference of miR-mRNA interactions, as measured by existing knowledge about the involved transcripts.
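To make the idea of "sample similarities based on the ordering structure" concrete, the following is a minimal sketch of one way a partial order over conditions could be turned into a similarity matrix. It is not the model proposed in the paper: the condition labels, the edge list, the `decay` parameter, and the choice of hop-count distance are all assumptions made purely for illustration.

```python
from collections import deque


def order_distances(conditions, edges):
    """Hop counts between conditions in the ordering graph.

    `edges` lists pairs (a, b) meaning "a directly precedes b" in the
    partial order. Direction is ignored here, so distance only reflects
    how many ordering steps separate two conditions (an assumption).
    """
    neighbors = {c: set() for c in conditions}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    dist = {}
    for start in conditions:
        seen = {start: 0}
        queue = deque([start])
        while queue:  # breadth-first search from `start`
            cur = queue.popleft()
            for nxt in neighbors[cur]:
                if nxt not in seen:
                    seen[nxt] = seen[cur] + 1
                    queue.append(nxt)
        dist[start] = seen
    return dist


def similarity_matrix(conditions, edges, decay=0.5):
    """Similarity decay**hops; 0.0 for conditions not connected at all."""
    dist = order_distances(conditions, edges)
    return {
        a: {b: (decay ** dist[a][b] if b in dist[a] else 0.0) for b in conditions}
        for a in conditions
    }


# Hypothetical partial order over five conditions: A -> B -> {C, D} -> E.
conditions = ["A", "B", "C", "D", "E"]
edges = [("A", "B"), ("B", "C"), ("B", "D"), ("C", "E"), ("D", "E")]
sim = similarity_matrix(conditions, edges)
```

In this sketch, neighboring conditions receive a higher similarity than distant ones, so a downstream model could, for example, share more information between adjacent stages than between the first and last stage. How the published method actually encodes the ordering is not specified in the abstract.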