Developmental gene regulatory network connections predicted by machine learning from gene expression data alone.

Affiliations

Department of Computer Science, Virginia Tech, Blacksburg, VA, United States of America.

Department of Biology, Canisius College, Buffalo, NY, United States of America.

Publication information

PLoS One. 2021 Dec 28;16(12):e0261926. doi: 10.1371/journal.pone.0261926. eCollection 2021.

Abstract

Gene regulatory network (GRN) inference can now take advantage of powerful machine learning algorithms to complement traditional experimental methods in building gene networks. However, the dynamical nature of embryonic development, which involves time-dependent interactions between thousands of transcription factors, signaling molecules, and effector genes, makes it one of the most challenging arenas for GRN prediction. In this work, we show that successful GRN predictions for a developmental network can be obtained from gene expression data alone with the Priors Enriched Absent Knowledge (PEAK) network inference algorithm. PEAK is a noise-robust method that models gene expression dynamics via ordinary differential equations and selects the best network based on information-theoretic criteria coupled with the machine learning algorithm Elastic Net. We test our GRN prediction methodology using two gene expression datasets for the purple sea urchin, Strongylocentrotus purpuratus, and cross-check our results against existing GRN models that have been constructed and validated by over 30 years of experimental results. We find a high degree of sensitivity in identifying known gene interactions in the network (maximum 81.58%). We also generate novel predictions for interactions that have not yet been described, which give researchers a resource for further completing the sea urchin GRN. Published ChIP-seq data and spatial co-expression analysis further support a subset of the top novel predictions. We conclude that GRN predictions that match known gene interactions can be produced using gene expression data alone from developmental time series experiments.
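The sensitivity figure quoted in the abstract measures how many known interactions a prediction recovers. As a minimal sketch of that metric, assuming interactions are represented as directed (regulator, target) pairs, sensitivity is the number of known edges that appear among the predicted edges divided by the total number of known edges. The function name `sensitivity` and the gene names below are hypothetical placeholders, not data or code from the paper.

```python
def sensitivity(predicted_edges, known_edges):
    """Return the fraction of known (regulator, target) edges that were predicted."""
    known = set(known_edges)
    if not known:
        raise ValueError("known_edges must not be empty")
    # True positives: known interactions that also appear in the prediction.
    true_positives = known & set(predicted_edges)
    return len(true_positives) / len(known)


# Hypothetical toy network; these edges are illustrative only.
known = [
    ("geneA", "geneB"),
    ("geneA", "geneC"),
    ("geneB", "geneD"),
    ("geneC", "geneD"),
]
predicted = [
    ("geneA", "geneB"),
    ("geneB", "geneD"),
    ("geneC", "geneD"),
    ("geneD", "geneA"),  # not in the known set: a candidate novel prediction
]

score = sensitivity(predicted, known)
print(f"{score:.2%}")  # prints "75.00%"
```

Sensitivity alone does not penalize extra predicted edges such as the last one above, so a predicted edge missing from the reference model may be either a false positive or an interaction that has not yet been described.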


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e58/8714117/8889a6da448d/pone.0261926.g001.jpg
