Schiffthaler Bastian, van Zalen Elena, Serrano Alonso R, Street Nathaniel R, Delhomme Nicolas
Department of Plant Physiology, Umeå Plant Science Center, Umeå University, Umeå, Sweden.
Department of Plant Physiology, Umeå Plant Science Center, Swedish University of Agricultural Sciences, Umeå, Sweden.
Heliyon. 2023 May 31;9(6):e16811. doi: 10.1016/j.heliyon.2023.e16811. eCollection 2023 Jun.
Gene regulatory and gene co-expression networks are powerful research tools for identifying biological signal within high-dimensional gene expression data. In recent years, research has focused on addressing the shortcomings of published methods: low signal-to-noise ratio, difficulty capturing non-linear interactions, and dataset-dependent biases. Furthermore, it has been shown that aggregating networks from multiple methods provides improved results. Despite this, few usable and scalable software tools have been implemented to perform such best-practice analyses. Here, we present Seidr (stylized Seiðr), a software toolkit designed to assist scientists in gene regulatory and gene co-expression network inference. Seidr creates community networks to reduce algorithmic bias and uses noise-corrected network backboning to prune noisy edges in the networks. Using benchmarks in real-world conditions across three eukaryotic model organisms, we show that individual algorithms are biased toward functional evidence for certain gene-gene interactions. We further demonstrate that the community network is less biased, providing robust performance across different standards and comparisons for the model organisms. Finally, we apply Seidr to a network of drought stress in Norway spruce (Picea abies (L.) H. Karst.) as an example application in a non-model species. We demonstrate the use of a network inferred using Seidr for identifying key components and communities, and for suggesting gene function for non-annotated genes.
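To make the idea of a community network concrete, the following is a minimal sketch of one common aggregation scheme: each method's edge scores are converted to normalized ranks, and the ranks are averaged per edge. It is an illustration under stated assumptions, not Seidr's implementation; the function names (`normalized_ranks`, `aggregate`), the toy scores, and the choice to give an edge rank 0 in any method that did not report it are all hypothetical.

```python
def normalized_ranks(scores):
    """Convert {edge: score} to {edge: rank / n}, where the strongest edge gets 1.0.

    Tied scores share the average of the ranks they span.
    """
    ordered = sorted(scores.items(), key=lambda kv: kv[1])  # weakest first
    n = len(ordered)
    ranks = {}
    i = 0
    while i < n:
        j = i
        # Extend j to cover every edge tied with the edge at position i.
        while j + 1 < n and ordered[j + 1][1] == ordered[i][1]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[ordered[k][0]] = average_rank / n
        i = j + 1
    return ranks


def aggregate(networks):
    """Average normalized ranks across methods (hypothetical mean-rank scheme).

    Assumption: an edge missing from a method's output contributes rank 0.
    """
    ranked = [normalized_ranks(net) for net in networks]
    edges = set().union(*networks)  # union of all edges seen by any method
    return {
        edge: sum(r.get(edge, 0.0) for r in ranked) / len(ranked)
        for edge in edges
    }


# Toy edge scores from two hypothetical inference methods (higher = stronger).
method_a = {("g1", "g2"): 0.9, ("g1", "g3"): 0.2, ("g2", "g3"): 0.5}
method_b = {("g1", "g2"): 1.4, ("g1", "g3"): 0.6, ("g2", "g3"): 0.3}

community = aggregate([method_a, method_b])
for edge, score in sorted(community.items(), key=lambda kv: -kv[1]):
    print(edge, round(score, 3))
```

Working on ranks rather than raw scores keeps methods with different score scales (for example, correlation coefficients versus mutual information) comparable before they are combined. Pruning noisy edges from the aggregated network would be a separate step and is not shown here.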