Cao Yang-Yang, Yomo Tetsuya, Ying Bei-Wen
Software Engineering Institute, East China Normal University, 3663 Zhong Shan Road (N), Shanghai 200062, China.
School of Life Science, East China Normal University, 3663 Zhong Shan Road (N), Shanghai 200062, China.
Microorganisms. 2020 Feb 26;8(3):331. doi: 10.3390/microorganisms8030331.
Bacterial growth curves, which represent population dynamics, are still poorly understood. They are commonly analyzed by model-based theoretical fitting, which is limited to typical S-shaped curves and does not capture the dynamics in their entirety. Thus, whether a given growth condition produces a particular growth curve pattern remains unclear. To address this question, up-to-date data mining techniques were applied to bacterial growth analysis for the first time. Dynamic time warping (DTW) and derivative DTW (DDTW) were used to compare the similarity among 1015 growth curves of 28 strains growing in three different media. In the similarity evaluation, agglomerative hierarchical clustering, assessed with four statistical benchmarks, successfully categorized the growth curves into three clusters, roughly corresponding to the three media. Furthermore, a simple, newly proposed benchmark provided greatly improved accuracy (~99%) in clustering the growth curves according to growth medium. The biologically reasonable categorization of growth curves suggests that DTW and DDTW are applicable to bacterial growth analysis. The bottom-up clustering results indicate that the growth media determine some specific patterns of population dynamics, regardless of genomic variation, and thus take priority over the genomes in shaping the growth curves.
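The pipeline described in the abstract (pairwise DTW or DDTW distances followed by bottom-up clustering) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the study: the synthetic logistic curves and their parameters, the absolute-difference local cost, the Keogh-Pazzani derivative estimate used for DDTW, the average-linkage merge rule, and all function names. The four statistical benchmarks and the newly proposed benchmark are not reproduced here, because the abstract does not specify them.

```python
import math


def dtw_distance(a, b):
    """Unconstrained DTW distance with |x - y| as the local cost (assumed choice)."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def derivative(q):
    """Keogh-Pazzani slope estimate (assumed); needs at least 3 points.
    The two endpoints copy the value of their nearest interior neighbour."""
    inner = [((q[i] - q[i - 1]) + (q[i + 1] - q[i - 1]) / 2.0) / 2.0
             for i in range(1, len(q) - 1)]
    return [inner[0]] + inner + [inner[-1]]


def ddtw_distance(a, b):
    """Derivative DTW: ordinary DTW applied to the estimated slopes."""
    return dtw_distance(derivative(a), derivative(b))


def agglomerative(series, n_clusters, distance):
    """Bottom-up clustering with average linkage (assumed linkage rule).
    Returns clusters as lists of indices into `series`."""
    n = len(series)
    dist = [[distance(series[i], series[j]) for j in range(n)] for i in range(n)]
    clusters = [[i] for i in range(n)]
    while len(clusters) > n_clusters:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                total = sum(dist[i][j] for i in clusters[x] for j in clusters[y])
                avg = total / (len(clusters[x]) * len(clusters[y]))
                if best is None or avg < best[0]:
                    best = (avg, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]  # merge the closest pair
        del clusters[y]
    return clusters


def logistic_curve(rate, capacity, lag, n_points=40):
    """Synthetic S-shaped growth curve; a stand-in for measured data."""
    return [capacity / (1.0 + math.exp(-rate * (t - lag))) for t in range(n_points)]


# Hypothetical setup: three "media" set growth rate and capacity,
# three "strains" differ only in lag time.
media = [(0.5, 1.0), (0.3, 0.5), (0.2, 0.2)]  # (rate, capacity), made-up values
lags = [10, 12, 14]
curves = [logistic_curve(r, k, lag) for (r, k) in media for lag in lags]

dtw_clusters = agglomerative(curves, 3, dtw_distance)
ddtw_clusters = agglomerative(curves, 3, ddtw_distance)
print(sorted(sorted(c) for c in dtw_clusters))
print(sorted(sorted(c) for c in ddtw_clusters))
```

In this toy setup the curves are ordered by medium, so indices 0 to 2, 3 to 5, and 6 to 8 share a medium. Because DTW tolerates shifts along the time axis, curves that differ only in lag stay close to one another, while curves with different plateaus or slopes remain far apart. The nested-loop clustering is written for readability and would be slow on roughly a thousand curves; a library routine would be a more practical choice at that scale.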