A novel gene network inference algorithm using predictive minimum description length approach.

Authors

Chaitankar Vijender, Ghosh Preetam, Perkins Edward J, Gong Ping, Deng Youping, Zhang Chaoyang

Affiliation

School of Computing, University of Southern Mississippi, MS 39402, USA.

Publication

BMC Syst Biol. 2010 May 28;4 Suppl 1(Suppl 1):S7. doi: 10.1186/1752-0509-4-S1-S7.

DOI:10.1186/1752-0509-4-S1-S7
PMID:20522257
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2880413/
Abstract

BACKGROUND

Reverse engineering of gene regulatory networks using information theory models has received much attention because of its simplicity, low computational cost, and ability to infer large networks. One of the major problems with information theory models is determining the threshold that defines the regulatory relationships between genes. The minimum description length (MDL) principle has been implemented to overcome this problem. The description length of the MDL principle is the sum of the model length and the data encoding length. A user-specified fine tuning parameter is used as a control mechanism between model and data encoding, but it is difficult to find the optimal parameter. In this work, we propose a new inference algorithm that incorporates mutual information (MI), conditional mutual information (CMI), and the predictive minimum description length (PMDL) principle to infer gene regulatory networks from DNA microarray data. In this algorithm, the information theoretic quantities MI and CMI determine the regulatory relationships between genes, and the PMDL principle attempts to determine the best MI threshold without the need for a user-specified fine tuning parameter.
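The two information theoretic quantities named above can be illustrated with a minimal sketch. It assumes expression profiles that have already been discretized to 0/1, estimates entropies from simple counts, and uses a fixed MI threshold purely for illustration. The profiles, the gene names, and the threshold value are hypothetical; this is not the authors' implementation, and the PMDL step that selects the threshold is not shown.

```python
from collections import Counter
from itertools import combinations
from math import log2

def entropy(*columns):
    """Joint Shannon entropy (in bits) of one or more discrete sequences."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in Counter(rows).values())

def mutual_information(x, y):
    # I(X;Y) = H(X) + H(Y) - H(X,Y)
    return entropy(x) + entropy(y) - entropy(x, y)

def conditional_mutual_information(x, y, z):
    # I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(Z) - H(X,Y,Z)
    return entropy(x, z) + entropy(y, z) - entropy(z) - entropy(x, y, z)

# Hypothetical binarized expression profiles (not data from the paper).
profiles = {
    "g1": [0, 0, 1, 1, 0, 0, 1, 1],
    "g2": [0, 0, 1, 1, 0, 0, 1, 1],  # identical to g1
    "g3": [0, 1, 0, 1, 0, 1, 0, 1],  # statistically independent of g1
}

# Hypothetical fixed threshold; the paper chooses it with the PMDL principle.
mi_threshold = 0.5

# Keep a candidate edge whenever the pairwise MI exceeds the threshold.
edges = [
    (a, b)
    for a, b in combinations(profiles, 2)
    if mutual_information(profiles[a], profiles[b]) > mi_threshold
]
print(edges)
```

In this toy case only the pair with identical profiles exceeds the threshold. CMI can then be used to check whether an apparent dependence between two genes persists once a third gene is taken into account.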

RESULTS

The performance of the proposed algorithm was evaluated using both synthetic time series data sets and a biological time series data set for the yeast Saccharomyces cerevisiae. The benchmark quantities precision and recall were used as performance measures. The results show that the proposed algorithm produced fewer false edges and significantly improved precision compared with the existing algorithm. For further analysis, the performance of the algorithms was examined across data sets of different sizes.
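As a small illustration of the two benchmark quantities, the sketch below computes precision and recall for an inferred edge set against a known reference network. The edge lists are hypothetical, and edges are treated as directed (regulator, target) pairs, which is an assumption rather than something stated in the abstract.

```python
def precision_recall(inferred_edges, true_edges):
    """Return (precision, recall) of inferred edges against a reference network."""
    inferred, true = set(inferred_edges), set(true_edges)
    true_positives = len(inferred & true)
    # Precision: fraction of inferred edges that are real (fewer false edges -> higher).
    precision = true_positives / len(inferred) if inferred else 0.0
    # Recall: fraction of real edges that were recovered.
    recall = true_positives / len(true) if true else 0.0
    return precision, recall

# Hypothetical reference and inferred networks (not results from the paper).
true_edges = {("g1", "g2"), ("g2", "g3"), ("g3", "g4")}
inferred_edges = {("g1", "g2"), ("g2", "g3"), ("g1", "g4")}

precision, recall = precision_recall(inferred_edges, true_edges)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here one of the three inferred edges is false and one of the three true edges is missed, so both measures come out at two thirds.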

CONCLUSIONS

We have proposed a new algorithm that implements the PMDL principle to infer gene regulatory networks from time series DNA microarray data and eliminates the need for a fine tuning parameter. The evaluation results obtained from both synthetic and actual biological data sets show that the PMDL principle is effective in determining the MI threshold and that the developed algorithm improves the precision of gene regulatory network inference. Based on the sensitivity analysis of all tested cases, an optimal CMI threshold value has been identified. Finally, the performance of the algorithms was observed to saturate beyond a certain data size.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a88c/2880413/e33c97a2982f/1752-0509-4-S1-S7-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a88c/2880413/1863dfc94fea/1752-0509-4-S1-S7-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a88c/2880413/509af9101e75/1752-0509-4-S1-S7-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a88c/2880413/8ede3f015647/1752-0509-4-S1-S7-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a88c/2880413/9609b49ec855/1752-0509-4-S1-S7-5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a88c/2880413/a2f9124c190b/1752-0509-4-S1-S7-6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a88c/2880413/c8a6b6444894/1752-0509-4-S1-S7-7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a88c/2880413/49b5e7a363ee/1752-0509-4-S1-S7-8.jpg

Similar articles

1
A novel gene network inference algorithm using predictive minimum description length approach.
BMC Syst Biol. 2010 May 28;4 Suppl 1(Suppl 1):S7. doi: 10.1186/1752-0509-4-S1-S7.
2
Predictive minimum description length principle approach to inferring gene regulatory networks.
Adv Exp Med Biol. 2011;696:37-43. doi: 10.1007/978-1-4419-7046-6_4.
3
MICRAT: a novel algorithm for inferring gene regulatory networks using time series gene expression data.
BMC Syst Biol. 2018 Dec 14;12(Suppl 7):115. doi: 10.1186/s12918-018-0635-1.
4
Time lagged information theoretic approaches to the reverse engineering of gene regulatory networks.
BMC Bioinformatics. 2010 Oct 7;11 Suppl 6(Suppl 6):S19. doi: 10.1186/1471-2105-11-S6-S19.
5
TimeDelay-ARACNE: Reverse engineering of gene networks from time-course data by an information theoretic approach.
BMC Bioinformatics. 2010 Mar 25;11:154. doi: 10.1186/1471-2105-11-154.
6
Using the minimum description length principle to reduce the rate of false positives of best-fit algorithms.
EURASIP J Bioinform Syst Biol. 2014 Jul 3;2014:13. doi: 10.1186/s13637-014-0013-2. eCollection 2014 Dec.
7
CN: a consensus algorithm for inferring gene regulatory networks using the SORDER algorithm and conditional mutual information test.
Mol Biosyst. 2015 Mar;11(3):942-9. doi: 10.1039/c4mb00413b. Epub 2015 Jan 21.
8
Inference of Gene Regulatory Network Based on Local Bayesian Networks.
PLoS Comput Biol. 2016 Aug 1;12(8):e1005024. doi: 10.1371/journal.pcbi.1005024. eCollection 2016 Aug.
9
Inferring connectivity of genetic regulatory networks using information-theoretic criteria.
IEEE/ACM Trans Comput Biol Bioinform. 2008 Apr-Jun;5(2):262-74. doi: 10.1109/TCBB.2007.1067.
10
HSCVFNT: Inference of Time-Delayed Gene Regulatory Network Based on Complex-Valued Flexible Neural Tree Model.
Int J Mol Sci. 2018 Oct 15;19(10):3178. doi: 10.3390/ijms19103178.

Cited by

1
GramSeq-DTA: A Grammar-Based Drug-Target Affinity Prediction Approach Fusing Gene Expression Information.
Biomolecules. 2025 Mar 12;15(3):405. doi: 10.3390/biom15030405.
2
Heterogeneous Clustering of Multiomics Data for Breast Cancer Subgroup Classification and Detection.
Int J Mol Sci. 2025 Feb 17;26(4):1707. doi: 10.3390/ijms26041707.
3
CORTADO: Hill Climbing Optimization for Cell-Type Specific Marker Gene Discovery.
bioRxiv. 2024 Dec 23:2024.12.23.630040. doi: 10.1101/2024.12.23.630040.
4
COFFEE: consensus single cell-type specific inference for gene regulatory networks.
Brief Bioinform. 2024 Sep 23;25(6). doi: 10.1093/bib/bbae457.
5
CHAI: consensus clustering through similarity matrix integration for cell-type identification.
Brief Bioinform. 2024 Jul 25;25(5). doi: 10.1093/bib/bbae411.
6
COFFEE: Consensus Single Cell-Type Specific Inference for Gene Regulatory Networks.
bioRxiv. 2024 Jan 8:2024.01.05.574445. doi: 10.1101/2024.01.05.574445.
7
Information Theory in Computational Biology: Where We Stand Today.
Entropy (Basel). 2020 Jun 6;22(6):627. doi: 10.3390/e22060627.
8
Co-Expression Networks for Causal Gene Identification Based on RNA-Seq Data of .
Genes (Basel). 2020 Jul 14;11(7):794. doi: 10.3390/genes11070794.
9
MICRAT: a novel algorithm for inferring gene regulatory networks using time series gene expression data.
BMC Syst Biol. 2018 Dec 14;12(Suppl 7):115. doi: 10.1186/s12918-018-0635-1.
10
Using the minimum description length principle to reduce the rate of false positives of best-fit algorithms.
EURASIP J Bioinform Syst Biol. 2014 Jul 3;2014:13. doi: 10.1186/s13637-014-0013-2. eCollection 2014 Dec.

References

1
Gene regulatory network inference: data integration in dynamic models-a review.
Biosystems. 2009 Apr;96(1):86-103. doi: 10.1016/j.biosystems.2008.12.004. Epub 2008 Dec 27.
2
Inferring connectivity of genetic regulatory networks using information-theoretic criteria.
IEEE/ACM Trans Comput Biol Bioinform. 2008 Apr-Jun;5(2):262-74. doi: 10.1109/TCBB.2007.1067.
3
Inference of gene regulatory networks based on a universal minimum description length.
EURASIP J Bioinform Syst Biol. 2008;2008(1):482090. doi: 10.1155/2008/482090.
4
KEGG for linking genomes to life and the environment.
Nucleic Acids Res. 2008 Jan;36(Database issue):D480-4. doi: 10.1093/nar/gkm882. Epub 2007 Dec 12.
5
Inferring gene regulatory networks from time series data using the minimum description length principle.
Bioinformatics. 2006 Sep 1;22(17):2129-35. doi: 10.1093/bioinformatics/btl364. Epub 2006 Jul 15.
6
ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context.
BMC Bioinformatics. 2006 Mar 20;7 Suppl 1(Suppl 1):S7. doi: 10.1186/1471-2105-7-S1-S7.
7
From genomics to chemical genomics: new developments in KEGG.
Nucleic Acids Res. 2006 Jan 1;34(Database issue):D354-7. doi: 10.1093/nar/gkj102.
8
Probabilistic Boolean Networks: a rule-based uncertainty model for gene regulatory networks.
Bioinformatics. 2002 Feb;18(2):261-74. doi: 10.1093/bioinformatics/18.2.261.
9
Algorithms for inferring qualitative models of biological networks.
Pac Symp Biocomput. 2000:293-304. doi: 10.1142/9789814447331_0028.
10
KEGG: kyoto encyclopedia of genes and genomes.
Nucleic Acids Res. 2000 Jan 1;28(1):27-30. doi: 10.1093/nar/28.1.27.