Xu Shuangbin, Zhan Li, Tang Wenli, Wang Qianwen, Dai Zehan, Zhou Lang, Feng Tingze, Chen Meijun, Wu Tianzhi, Hu Erqiang, Yu Guangchuang
Division of Laboratory Medicine, Microbiome Medicine Center, Zhujiang Hospital, Southern Medical University, Guangzhou 510515, China.
Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China.
Innovation (Camb). 2023 Feb 2;4(2):100388. doi: 10.1016/j.xinn.2023.100388. eCollection 2023 Mar 13.
The data output from microbiome research is growing at an accelerating rate, yet mining these data quickly and efficiently remains difficult. The field still lacks both an effective data structure to represent and manage the data and flexible, composable analysis methods. To address these two issues, we designed and developed an R package. It provides a comprehensive data structure that integrates the primary and intermediate data, which improves the integration and exploration of downstream data. Around this data structure, the downstream analysis tasks are decomposed, and a set of functions is designed under a tidy framework. Each function independently performs a simple task, and the functions can be combined to perform complex tasks. This lets users explore data, conduct personalized analyses, and develop analysis workflows. Moreover, the package can interoperate with other packages in the R community, which further expands its analytical capabilities. This article demonstrates the package for analyzing microbiome data as well as other ecological data through several examples. It connects upstream data, provides flexible downstream analysis components, and provides visualization methods to assist in presenting and interpreting results.