IEEE/ACM Trans Comput Biol Bioinform. 2022 Jan-Feb;19(1):284-294. doi: 10.1109/TCBB.2020.3003018. Epub 2022 Feb 3.
Motif discovery and network clustering in complex networks have received considerable attention in recent years, yet they remain challenging tasks in bioinformatics, big data analytics, and data mining. Motif discovery in big data networks has important applications in domains such as engineering, bioinformatics, cheminformatics, genomics, sociology, and ecology, where it is used to reveal hidden frequent structures and functional building blocks and to support knowledge discovery. In this paper, we present a motif localization method based on a novel clustering algorithm for complex networks. For each complex network, the method generates a novel structure called an Augmented Multiresolution Network (AMN), which is then adaptively partitioned into several clusters and their corresponding subnets. The top-ranked subnets are then chosen for network motif discovery. We show that the proposed method provides an efficient solution for clustering and motif discovery; it speeds up current motif discovery algorithms by pruning non-promising regions of complex networks. Experimental results show that our algorithm deals efficiently with complex networks representing large, high-dimensional datasets such as big scientific data. Our method also motivates future studies in big data and complex networks.
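The partition, rank, and search pipeline outlined in the abstract can be illustrated with a minimal sketch. This is not the AMN construction or the paper's clustering algorithm, neither of which is specified in the abstract. As labeled assumptions, the sketch substitutes connected components for the adaptive partitioning, edge density for the subnet ranking score, and triangle counting for motif discovery. All function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_adjacency(edges):
    """Build undirected adjacency sets from an edge list, skipping self-loops."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def partition(adj):
    """Stand-in clustering: connected components (an assumption, not the paper's method)."""
    seen, clusters = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbor in adj[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        clusters.append(component)
    return clusters


def density(adj, nodes):
    """Edge density of the induced subnet (assumed ranking score)."""
    n = len(nodes)
    if n < 2:
        return 0.0
    # Each internal edge is seen from both endpoints, so halve the total.
    m = sum(len(adj[u] & nodes) for u in nodes) // 2
    return m / (n * (n - 1) / 2)


def count_triangles(adj, nodes):
    """Count 3-node cliques inside the induced subnet (stand-in motif)."""
    return sum(
        1
        for a, b, c in combinations(sorted(nodes), 3)
        if b in adj[a] and c in adj[a] and c in adj[b]
    )


def localized_motif_count(edges, top_k=1):
    """Rank subnets by density and search for motifs only in the top_k of them."""
    adj = build_adjacency(edges)
    ranked = sorted(partition(adj), key=lambda c: density(adj, c), reverse=True)
    return [(sorted(c), count_triangles(adj, c)) for c in ranked[:top_k]]
```

The point of the sketch is the pruning step: the expensive motif search runs only on the `top_k` highest-ranked subnets, so low-ranked regions are never enumerated. On a toy graph made of a 4-node clique and a separate 4-node path, only the clique is searched when `top_k=1`.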