J. Craig Venter Institute, La Jolla, USA.
Applied Sciences, Durban University of Technology, Durban, South Africa.
Environ Microbiol. 2020 Aug;22(8):3020-3038. doi: 10.1111/1462-2920.15091. Epub 2020 Jun 22.
Next-generation sequencing technologies have generated, and continue to generate, an increasingly large corpus of biological data. These data are inherently compositional: they convey only relative information, which depends on the capacity of the instrument, the experimental design and technical bias. Network analysis, which studies the interactions between components within a system, can extract considerable information from such data. Network theory methods applied to compositional data are powerful approaches for quantifying relationships between biological components and their relevance to phenotype, environmental conditions or other external variables. However, many of the statistical methods used for network analysis rest on assumptions that do not hold for compositional data, and this can bias downstream results. In this mini-review, we illustrate the utility of network theory in biological systems and investigate modern techniques while introducing researchers to frameworks for implementation. We give an overview of (1) compositional data analysis, (2) data transformations and (3) network theory, along with insight into a battery of network types including static, temporal, sample-specific and differential networks. The intention of this mini-review is not to provide a comprehensive overview of network methods, but rather to introduce microbiology researchers to (semi-)unsupervised, data-driven approaches for inferring latent structures that may give insight into biological phenomena or the abstract mechanics of complex systems.
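To make the statement that sequencing data "convey only relative information" concrete, the following is a minimal sketch of one commonly used compositional transformation, the centered log-ratio (CLR). The abstract does not name a specific transformation, so the choice of CLR, the function name `clr` and the toy count vectors are all illustrative assumptions rather than the authors' method.

```python
import numpy as np

def clr(counts):
    """Centered log-ratio transform of a (samples x components) matrix.

    Illustrative helper, not taken from the review. It assumes strictly
    positive values; real count tables with zeros need a zero-handling
    step (for example a pseudocount) before the transform.
    """
    x = np.asarray(counts, dtype=float)
    if np.any(x <= 0):
        raise ValueError("CLR requires strictly positive values")
    log_x = np.log(x)
    # Subtract each sample's mean log value, i.e. the log of its geometric mean
    return log_x - log_x.mean(axis=1, keepdims=True)

# Hypothetical toy data: the same community sequenced at two depths
shallow = np.array([[10, 20, 70]])
deep = shallow * 100  # 100x more reads, identical relative abundances

# Sequencing depth cancels out, so both samples map to the same CLR vector
same_after_clr = np.allclose(clr(shallow), clr(deep))
```

Because the transform depends only on ratios between components, the two samples become indistinguishable after CLR even though their raw counts differ by a factor of 100. This is the sense in which total read depth carries no biological information in a compositional data set.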