Structural & Molecular Biology, Faculty of Life Sciences, UCL, London, UK.
Biochem Mol Biol Educ. 2022 Sep;50(5):446-449. doi: 10.1002/bmb.21646. Epub 2022 Aug 16.
The final year of a biochemistry degree is usually a time to experience research. However, laboratory-based research projects were not possible during COVID-19. Instead, we used open datasets to provide computational research projects in metagenomics to biochemistry undergraduates (80 students with limited computing experience). We aimed to give the students a chance to explore any dataset, rather than restrict them to a small number of artificial datasets (about 60 published datasets were used). To achieve this, we used Google Colaboratory (Colab), a virtual computing environment. Colab served as a framework to retrieve raw sequencing data (analyzed with QIIME2) and to generate visualizations. Setting up the environment requires no prior experience; all students have the same drive structure, and notebooks can be shared (for synchronous sessions). We also used the platform to combine multiple datasets, perform a meta-analysis, and allow the students to analyze large datasets with thousands of subjects and factors. Projects that required more computational resources were integrated with Google Cloud Compute. In future, all research projects could include some aspect of reanalyzing public data, providing students with data science experience. Colab is also an excellent environment in which to develop data skills in multiple languages (e.g., Perl, Python, Julia).
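The step of combining multiple datasets for a meta-analysis can be illustrated with a minimal Python sketch of the kind that could run in a Colab notebook. It is not the authors' pipeline and does not use the QIIME2 API: the function names, the nested-dictionary layout, and the toy counts are all assumptions made for illustration. It converts per-sample taxon counts to relative abundances and merges samples from several studies into one table with a shared set of columns.

```python
def to_relative_abundance(counts):
    """Convert raw taxon counts for one sample into fractions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        return {taxon: 0.0 for taxon in counts}
    return {taxon: n / total for taxon, n in counts.items()}


def combine_studies(studies):
    """Merge per-study count tables into one relative-abundance table.

    studies: {study_id: {sample_id: {taxon: count}}}  (hypothetical layout)
    Returns (all_taxa, merged), where merged maps "study_id:sample_id"
    to a row holding every taxon; taxa absent from a sample are set to 0.0.
    """
    all_taxa = sorted(
        {
            taxon
            for samples in studies.values()
            for counts in samples.values()
            for taxon in counts
        }
    )
    merged = {}
    for study_id, samples in studies.items():
        for sample_id, counts in samples.items():
            rel = to_relative_abundance(counts)
            # Prefix the sample with its study so identical sample IDs do not collide.
            merged[f"{study_id}:{sample_id}"] = {t: rel.get(t, 0.0) for t in all_taxa}
    return all_taxa, merged


# Toy data invented for this example; not taken from any published dataset.
toy_studies = {
    "study_A": {"s1": {"Bacteroides": 30, "Prevotella": 10}},
    "study_B": {"s1": {"Bacteroides": 5, "Escherichia": 15}},
}
all_taxa, merged = combine_studies(toy_studies)
for row_id, row in merged.items():
    print(row_id, row)
```

A real meta-analysis would also need the studies to be processed consistently (same sequencing region, taxonomy database, and filtering) before their tables are merged; the sketch covers only the bookkeeping of aligning columns and keeping track of which study each sample came from.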