Röttjers Lisa, Faust Karoline
Laboratory of Molecular Bacteriology (Rega Institute), Department of Microbiology, Immunology and Transplantation, KU Leuven, Leuven, Belgium.
mSystems. 2020 Feb 18;5(1):e00903-19. doi: 10.1128/mSystems.00903-19.
Microbial network inference and analysis have become successful approaches to extract biological hypotheses from microbial sequencing data. Network clustering is a crucial step in this analysis. Here, we present a novel heuristic network clustering algorithm, manta, which clusters nodes in weighted networks. In contrast to existing algorithms, manta exploits negative edges while differentiating between weak and strong cluster assignments. For this reason, manta can tackle gradients and is able to avoid clustering problematic nodes. In addition, manta assesses the robustness of cluster assignment, which makes it more robust to noisy data than most existing tools. On noise-free synthetic data, manta equals or outperforms existing algorithms, while it identifies biologically relevant subcompositions in real-world data sets. On a cheese rind data set, manta identifies groups of taxa that correspond to intermediate moisture content in the rinds, while on an ocean data set, the algorithm identifies a cluster of organisms that were reduced in abundance during a transition period but did not correlate strongly to biochemical parameters that changed during the transition period. These case studies demonstrate the power of manta as a tool that identifies biologically informative groups within microbial networks. manta comes with unique strengths, such as the abilities to identify nodes that represent an intermediate between clusters, to exploit negative edges, and to assess the robustness of cluster membership. manta does not require parameter tuning, is straightforward to install and run, and can be easily combined with existing microbial network inference tools.
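The abstract does not spell out how manta works internally, so the following is only a toy sketch of the two ideas it highlights: letting negative edge weights push nodes into opposite clusters, and leaving a node unassigned when the evidence for either cluster is weak. It is not the manta procedure. The function name `signed_diffusion_split`, the seed nodes, the `margin` threshold, and the example network are all hypothetical choices made for illustration.

```python
def signed_diffusion_split(edges, seed_a, seed_b, margin=0.5, rounds=100):
    """Toy sign-aware two-way split of a weighted network (NOT manta itself)."""
    # Build a symmetric adjacency map: node -> {neighbor: signed weight}.
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w

    # Seeds are clamped to +1 / -1; every other node starts undecided at 0.
    score = {node: 0.0 for node in adj}
    score[seed_a], score[seed_b] = 1.0, -1.0

    for _ in range(rounds):
        updated = {}
        for node, nbrs in adj.items():
            # A positive edge pulls a node toward its neighbor's score,
            # a negative edge pushes it toward the opposite sign.
            total = sum(w * score[nbr] for nbr, w in nbrs.items())
            norm = sum(abs(w) for w in nbrs.values())
            updated[node] = total / norm if norm else 0.0
        updated[seed_a], updated[seed_b] = 1.0, -1.0
        score = updated

    # Nodes whose score stays close to 0 receive no cluster ("weak").
    labels = {}
    for node, s in score.items():
        if abs(s) < margin:
            labels[node] = "weak"
        else:
            labels[node] = "A" if s > 0 else "B"
    return labels, score


# Hypothetical network: two positively connected triangles joined by
# negative edges, plus node "g" tied equally to both groups.
EDGES = [
    ("a", "b", 0.9), ("a", "c", 0.9), ("b", "c", 0.9),
    ("d", "e", 0.9), ("d", "f", 0.9), ("e", "f", 0.9),
    ("a", "d", -0.8), ("b", "e", -0.7),
    ("g", "c", 0.5), ("g", "f", 0.5),
]

labels, scores = signed_diffusion_split(EDGES, seed_a="a", seed_b="d")
```

In this toy network the two triangles end up in clusters A and B, while `g`, which is linked equally to both groups, keeps a score near zero and is labeled `weak` instead of being forced into either cluster. Real analyses would use the manta tool itself on a network produced by a microbial network inference tool.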