Department of Bioengineering, Krasnow Institute for Advanced Study, George Mason University, 4400 University Dr, MS2A1, Fairfax, 22030, VA, USA.
BMC Genomics. 2018 Sep 24;19(Suppl 7):668. doi: 10.1186/s12864-018-5025-y.
In silico studies that integrate multiple datasets need methods with higher statistical power to reveal secondary findings that were hidden from the initial analyses. We present a novel method for the network analysis of messenger RNA post-transcriptional regulation by microRNA molecules. The method integrates expression data and sequence-based binding predictions through a set of sound machine learning techniques and combines all results into an ensemble graph of regulations.
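As a rough illustration of how an ensemble graph of regulations might be assembled, the sketch below keeps a microRNA-to-mRNA edge only when enough evidence sources propose it. The voting rule, the function name `ensemble_graph`, and all molecule names are assumptions made for this example; the abstract does not specify how the results are combined.

```python
from collections import Counter

def ensemble_graph(edge_sets, min_votes):
    """Keep each (miRNA, mRNA) edge proposed by at least `min_votes` sources."""
    votes = Counter(edge for edges in edge_sets for edge in set(edges))
    return {edge for edge, count in votes.items() if count >= min_votes}

# Hypothetical evidence sources (placeholder names, not real results):
# edges suggested by expression data, by sequence binding predictions,
# and by one additional learner.
expression_edges = {("miR-A", "GENE1"), ("miR-A", "GENE2"), ("miR-B", "GENE3")}
sequence_edges = {("miR-A", "GENE1"), ("miR-B", "GENE3"), ("miR-C", "GENE4")}
learner_edges = {("miR-A", "GENE1"), ("miR-A", "GENE2")}
sources = [expression_edges, sequence_edges, learner_edges]

# Lowering the vote threshold admits more edges, giving a nested
# sequence of graphs from the strictest to the most permissive.
pool = [ensemble_graph(sources, k) for k in range(len(sources), 0, -1)]
for k, graph in zip(range(len(sources), 0, -1), pool):
    print(f"min_votes={k}: {len(graph)} edges")
```

Under this assumed rule, each graph in `pool` contains the previous one, which is one simple way to obtain candidate structures of increasing complexity.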
Bayesian network classifiers are induced based on a pool of ensemble graphs with ascending order of complexity. Individual goodness-of-fit and classification performances are evaluated for each learned model. As a testbed, four Alzheimer's disease datasets are integrated using the new approach, achieving top values of 0.9794 ± 0.01 for the area under the receiver operating characteristic curve and 0.9439 ± 0.0234 for the prediction accuracy.
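The two reported metrics can be sketched in plain Python. The example assumes that the "±" values are standard deviations over repeated evaluation folds, which the abstract does not state; the labels and scores are made-up placeholders, and the helper names `roc_auc` and `accuracy` are hypothetical.

```python
from statistics import mean, stdev

def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as half (Mann-Whitney formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predicted, labels)) / len(labels)

# Hypothetical folds: (true labels, classifier scores for the positive class).
folds = [
    ([1, 1, 0, 0], [0.9, 0.6, 0.4, 0.2]),
    ([1, 0, 1, 0], [0.8, 0.7, 0.3, 0.1]),
]
aucs = [roc_auc(y, s) for y, s in folds]
accs = [accuracy(y, s) for y, s in folds]
print(f"AUC      {mean(aucs):.4f} ± {stdev(aucs):.4f}")
print(f"Accuracy {mean(accs):.4f} ± {stdev(accs):.4f}")
```

The pairwise AUC is quadratic in the number of samples, which is acceptable for a sketch; a rank-based implementation would be preferable for large datasets.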
Post-transcriptional regulations found by the optimal network classifier agree with findings in the previous literature. Furthermore, additional network structures suggest regulations that have not previously been reported in Alzheimer's disease research. The quantitative performance, together with the sound biological findings, provides confidence in the ensemble approach and encourages similar integrative analyses for other conditions.