Inferring meaningful communities from topology-constrained correlation networks.

Author information

Hleap Jose Sergio, Blouin Christian

Affiliations

Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada.

Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada; Department of Computer Science, Dalhousie University, Halifax, Nova Scotia, Canada.

Publication information

PLoS One. 2014 Nov 19;9(11):e113438. doi: 10.1371/journal.pone.0113438. eCollection 2014.

Abstract

Community structure detection is an important tool in graph analysis. One way to perform it is to solve for the partition that optimizes the modularity score Q. Here it is shown that topological constraints in correlation graphs induce over-fragmentation of community structures. A refinement step for this optimization, based on Linear Discriminant Analysis (LDA) and a statistical test for significance, is proposed. In structured simulations constrained by topology, this novel approach performs better than the optimization of modularity alone. The method was also tested with two empirical datasets: the roll-call voting in the 110th US Senate constrained by geographic adjacency, and a biological dataset of 135 protein structures constrained by inter-residue contacts. The former dataset showed sub-structures in the communities that revealed a regional bias in the votes that transcends party affiliations. This is an interesting pattern given that the 110th Legislature was assumed to be a highly polarized government. The α-amylase catalytic domain dataset (the biological dataset) was analyzed with and without topological constraints (inter-residue contacts). The results without topological constraints differed from the topology-constrained results, but the LDA filtering did not change the outcome of the latter. This suggests that the LDA filtering is a robust way to resolve over-fragmentation when it is present, and that the method does not affect the results where there is no evidence of over-fragmentation.
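To make the modularity score mentioned in the abstract concrete, the following is a minimal sketch of the standard Newman-Girvan modularity Q for a fixed partition. It is not the authors' implementation: the function name `modularity`, the toy graph, and the use of an unweighted graph (rather than the weighted correlation graphs the paper analyzes) are assumptions made purely for illustration, and the LDA refinement step is not shown.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman-Girvan modularity Q of a partition.

    Assumes an undirected, unweighted graph given as a list of (u, v)
    edges, and a dict mapping each node to its community label.
    Q = sum over communities c of  L_c / m - (d_c / (2 m))**2,
    where m is the number of edges, L_c the number of edges inside c,
    and d_c the summed degree of the nodes in c.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    internal = defaultdict(int)    # L_c: edges with both ends in c
    degree_sum = defaultdict(int)  # d_c: total degree of nodes in c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Hypothetical toy graph: two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

two_blocks = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_block = {node: "all" for node in range(6)}
singletons = {node: node for node in range(6)}  # fully fragmented

q_two = modularity(edges, two_blocks)
q_one = modularity(edges, one_block)
q_frag = modularity(edges, singletons)
print(q_two, q_one, q_frag)
```

On this toy graph the two-triangle partition scores higher than both the single community and the fully fragmented partition, which is the kind of comparison a modularity optimizer makes when it searches over partitions.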

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/498f/4237410/d7e67693e830/pone.0113438.g001.jpg
