Scalable in-memory processing of omics workflows.

Author information

Elisseev Vadim, Gardiner Laura-Jayne, Krishna Ritesh

Affiliations

IBM Research Europe, Hartree Centre, Daresbury Laboratory, Keckwick Lane, Warrington WA4 4AD, Cheshire, UK.

Wrexham Glyndwr University, Mold Rd, Wrexham LL11 2AW, Wales, UK.

Publication information

Comput Struct Biotechnol J. 2022 Apr 20;20:1914-1924. doi: 10.1016/j.csbj.2022.04.014. eCollection 2022.

Abstract

We present a proof-of-concept implementation of the in-memory computing paradigm that we use to facilitate the analysis of metagenomic sequencing reads. In doing so, we compare the performance of POSIX™ file systems and key-value storage for omics data, and we show the potential for integrating high-performance computing (HPC) and cloud-native technologies. We show that in-memory key-value storage offers possibilities for improved handling of omics data through more flexible and faster data processing. We envision fully containerized workflows and their deployment in portable micro-pipelines, with multiple instances working concurrently with the same distributed in-memory storage. To highlight the potential usage of this technology for event-driven and real-time data processing, we use a biological case study focused on the growing threat of antimicrobial resistance (AMR). We develop a workflow encompassing bioinformatics and explainable machine learning (ML) to predict the life expectancy of a population based on the microbiome of its sewage, while providing a description of the AMR contribution to the prediction. We propose that, in future, performing such analyses in 'real-time' would allow us to assess the potential risk to the population based on changes in the AMR profile of the community.
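The contrast between file-based and key-value access to sequencing reads can be illustrated with a minimal sketch. This is not the authors' implementation: a plain Python `dict` stands in for an in-memory key-value store, the FASTQ records are made-up toy data, and the helper names (`parse_fastq`, `load_into_kv`, `fetch_from_file`) are hypothetical.

```python
import os
import tempfile

# Hypothetical toy reads in FASTQ format (four lines per record).
FASTQ = (
    "@read1\nACGTACGT\n+\nIIIIIIII\n"
    "@read2\nTTGACCGA\n+\nIIIIIIII\n"
)


def parse_fastq(text):
    """Yield (read_id, sequence, quality) tuples from FASTQ text."""
    lines = text.strip().split("\n")
    for i in range(0, len(lines), 4):
        header, seq, _plus, qual = lines[i:i + 4]
        yield header[1:], seq, qual  # drop the leading '@' from the header


def load_into_kv(text):
    """Store each read under its ID; the dict is a stand-in for a key-value store."""
    return {rid: (seq, qual) for rid, seq, qual in parse_fastq(text)}


def fetch_from_file(path, read_id):
    """File-based lookup: re-read and scan the file until the read is found."""
    with open(path) as handle:
        for rid, seq, qual in parse_fastq(handle.read()):
            if rid == read_id:
                return seq, qual
    return None


# Key-value path: parse once, then look reads up directly by ID.
kv_store = load_into_kv(FASTQ)
from_kv = kv_store.get("read2")

# File path: write the reads to disk, then scan the file for the same read.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "reads.fastq")
    with open(path, "w") as handle:
        handle.write(FASTQ)
    from_file = fetch_from_file(path, "read2")
```

Both paths return the same record; the difference is that the key-value lookup addresses a read directly by its ID, whereas the file-based lookup has to open and scan the file on every request. A distributed in-memory store would replace the `dict` in a real deployment, which this sketch does not attempt to model.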


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f025/9052061/62b60adc18b3/ga1.jpg
