Navas-Molina Jose A, Hyde Embriette R, Sanders Jon, Knight Rob
Department of Computer Science and Engineering, the University of California San Diego, 9300 Gilman Drive MC0736, La Jolla, CA 92093 USA.
Department of Pediatrics, the University of California San Diego, 9300 Gilman Drive MC0763, La Jolla, CA 92093 USA.
Curr Opin Syst Biol. 2017 Aug;4:92-96. doi: 10.1016/j.coisb.2017.07.003. Epub 2017 Jul 11.
Microbiome datasets have expanded rapidly in recent years. Advances in DNA sequencing, as well as the rise of shotgun metagenomics and metabolomics, are producing datasets that exceed the ability of researchers to analyze them on their personal computers. Here we describe what Big Data is in the context of microbiome research, how this data can be transformed into knowledge about microbes and their functions in their environments, and how the knowledge can be applied to move microbiome research forward. In particular, the development of new high-resolution tools to assess strain-level variability (moving away from OTUs), the advent of cloud computing and centralized analysis resources such as Qiita (for sequences) and GNPS (for mass spectrometry), and better methods for curating and describing "metadata" (contextual information about the sequence or chemical information) are rapidly assisting the use of microbiome data in fields ranging from human health to environmental studies.