Seçilmiş Deniz, Nelander Sven, Sonnhammer Erik L L
Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Solna, Sweden.
Science for Life Laboratory, Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden.
Front Genet. 2022 Jul 13;13:855770. doi: 10.3389/fgene.2022.855770. eCollection 2022.
Accurate inference of gene regulatory networks (GRNs) is important for unraveling unknown regulatory mechanisms and processes, which can lead to the identification of treatment targets for genetic diseases. A variety of GRN inference methods have been proposed that, under suitable data conditions, perform well in benchmarks that consider the entire spectrum of false positives and false negatives. However, it is very challenging to predict which single network sparsity gives the most accurate GRN. In the absence of criteria for sparsity selection, a simplistic solution is to pick the GRN with a certain number of links per gene, a number that is guessed to be reasonable. This does not guarantee finding the GRN that has the correct sparsity or is the most accurate one. In this study, we provide a general approach for identifying the most accurate and sparsity-wise relevant GRN within the entire space of possible GRNs. The algorithm, called SPA, applies a "GRN information criterion" (GRNIC) that is inspired by two commonly used model selection criteria, the Akaike and Bayesian information criteria (AIC and BIC), but is adapted to GRN inference. The results show that the approach can, in most cases, find a GRN whose sparsity is close to the true sparsity and whose accuracy is close to the best achievable with the given GRN inference method and data. The datasets and source code can be found at https://bitbucket.org/sonnhammergrni/spa/.
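To illustrate the general idea of choosing a sparsity level with an information criterion, the following is a minimal sketch that uses the standard BIC on a toy linear model. It is not the GRNIC or the SPA algorithm, whose definition is not given in the abstract. The data model (`Y ≈ X @ A`), the magnitude-thresholding path, and all names (`bic`, `sparsity_path`, `A_true`, `best_k`) are assumptions made for illustration only.

```python
import numpy as np


def bic(X, Y, A):
    """Standard BIC for the linear model Y ~ X @ A (Gaussian noise assumed).

    The number of free parameters is taken to be the number of nonzero links.
    """
    n = Y.size
    rss = float(np.sum((Y - X @ A) ** 2))
    k = int(np.count_nonzero(A))
    return n * np.log(rss / n) + k * np.log(n)


def sparsity_path(A_full):
    """Yield (k, A_k), where A_k keeps only the k largest-magnitude links."""
    order = np.argsort(-np.abs(A_full), axis=None)
    for k in range(1, A_full.size + 1):
        A = np.zeros_like(A_full)
        idx = np.unravel_index(order[:k], A_full.shape)
        A[idx] = A_full[idx]
        yield k, A


# Toy data (hypothetical): 10 genes, 50 samples, 20 true links.
rng = np.random.default_rng(0)
n_genes, n_samples, n_links = 10, 50, 20

A_true = np.zeros((n_genes, n_genes))
links = rng.choice(A_true.size, size=n_links, replace=False)
A_true.flat[links] = rng.uniform(0.5, 1.0, n_links) * rng.choice([-1.0, 1.0], n_links)

X = rng.normal(size=(n_samples, n_genes))
Y = X @ A_true + 0.1 * rng.normal(size=(n_samples, n_genes))

# Dense least-squares estimate, then scan every sparsity level.
A_full = np.linalg.lstsq(X, Y, rcond=None)[0]
best_k, A_best = min(sparsity_path(A_full), key=lambda item: bic(X, Y, item[1]))

print(f"true links: {n_links}, selected links: {best_k}")
```

The fixed links-per-gene heuristic described above corresponds to picking one `k` in advance. Scanning the whole path and scoring each candidate lets the data determine the trade-off between fit and number of links.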