Brunner James D, Robinson Aaron J, Chain Patrick S G
Biosciences Division, Los Alamos National Laboratory, Los Alamos, NM, USA.
Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, NM, USA.
ISME Commun. 2024 Apr 19;4(1):ycae057. doi: 10.1093/ismeco/ycae057. eCollection 2024 Jan.
Microbial communities are diverse biological systems that include taxa from across multiple kingdoms of life. Notably, interactions between bacteria and fungi play a significant role in determining community structure. However, with standard network inference techniques, statistical associations across kingdoms are more difficult to infer than intra-kingdom associations because of the nature of the data involved. We quantify the challenges of cross-kingdom network inference from both theoretical and practical points of view using synthetic and real-world microbiome data. We detail the theoretical issue that arises when combining compositional data sets drawn from the same environment, e.g. 16S and ITS sequencing of a single set of samples, and we survey common network inference techniques for their ability to handle the resulting error. We then test the accuracy and usefulness of the intra- and inter-kingdom associations these techniques produce by inferring networks from a set of simulated samples for which a ground-truth set of associations is known. We show that while two of the surveyed methods mitigate the error of cross-kingdom inference, there is little difference between techniques for key practical applications, including identification of strong correlations and identification of possible keystone taxa (i.e. hub nodes in the network). Furthermore, we identify a signature of the error caused by cross-kingdom network inference and demonstrate that it appears in networks constructed using real-world environmental microbiome data.
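The compositional issue described above can be illustrated with a small toy simulation. The sketch below is not the simulation used in the paper; it assumes hypothetical, mutually independent log-normal absolute abundances for five "bacterial" and five "fungal" taxa, normalizes each block to relative abundances separately (as happens when 16S and ITS data sets are processed independently), and then computes plain Pearson correlations on the joined table. All names and parameter values are illustrative assumptions.

```python
import numpy as np

def closure(counts):
    """Convert each sample (row) to relative abundances that sum to 1."""
    return counts / counts.sum(axis=1, keepdims=True)

def mean_offdiag(mat):
    """Mean of the off-diagonal entries of a square matrix."""
    mask = ~np.eye(mat.shape[0], dtype=bool)
    return mat[mask].mean()

rng = np.random.default_rng(0)
n_samples, n_bact, n_fungi = 500, 5, 5  # illustrative sizes, not from the paper

# Hypothetical absolute abundances: every taxon is independent of every other,
# so the ground truth contains no associations at all.
bact_abs = rng.lognormal(mean=0.0, sigma=1.0, size=(n_samples, n_bact))
fungi_abs = rng.lognormal(mean=0.0, sigma=1.0, size=(n_samples, n_fungi))

# Each data set is normalized on its own, then the two are joined column-wise.
combined = np.hstack([closure(bact_abs), closure(fungi_abs)])
corr = np.corrcoef(combined, rowvar=False)

within_bact = mean_offdiag(corr[:n_bact, :n_bact])
within_fungi = mean_offdiag(corr[n_bact:, n_bact:])
cross = corr[:n_bact, n_bact:].mean()

print(f"mean within-bacteria correlation: {within_bact:.3f}")
print(f"mean within-fungi correlation:    {within_fungi:.3f}")
print(f"mean cross-kingdom correlation:   {cross:.3f}")
```

Because each block is forced to sum to one, its taxa are pushed toward negative correlation with one another even though none exist in the underlying abundances, while the cross-block correlations are not subject to the same constraint. Under these toy assumptions, intra- and inter-kingdom correlations are therefore biased differently when the blocks are normalized separately, which is one simple way to see why combining such data sets needs care.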