Neely Benjamin A
Chemical Sciences Division, National Institute of Standards and Technology, Charleston, South Carolina 29412, United States.
J Proteome Res. 2021 Apr 2;20(4):2076-2082. doi: 10.1021/acs.jproteome.0c00920. Epub 2021 Jan 29.
Cloud-hosted environments offer known benefits when computational needs outstrip affordable local workstations, enabling high-performance computation without a physical cluster. What has been less apparent, especially to novice users, is the transformative potential of cloud-hosted environments to bridge the digital divide between poorly funded and well-resourced laboratories, and to empower modern research groups with remote personnel and trainees. Using cloud-based proteomic bioinformatic pipelines is not predicated on analyzing thousands of files; they can also improve accessibility during remote work, during extreme weather, or when working with under-resourced remote trainees. The general benefits of cloud-hosted environments also allow for scalability and encourage reproducibility. Since one possible hurdle to adoption is awareness, this paper is written with the nonexpert in mind. The benefits and possibilities of using a cloud-hosted environment are emphasized by describing how to set up an example workflow to analyze a previously published label-free data-dependent acquisition mass spectrometry data set of mammalian urine. Cost and time of analysis are compared across different computational tiers, and important practical considerations are described. Overall, cloud-hosted environments offer the potential to solve large computational problems, but more importantly they can enable and accelerate research in smaller research groups with inadequate infrastructure and suboptimal local computational resources.