Schmitt Charles P, Stingone Jeanette A, Rajasekar Arcot, Cui Yuxia, Du Xiuxia, Duncan Chris, Heacock Michelle, Hu Hui, Gonzalez Juan R, Juarez Paul D, Smirnov Alex I
Office of Data Science, National Institute of Environmental Health Sciences, Durham, NC, USA.
Department of Epidemiology, Mailman School of Public Health, Columbia University, New York, New York, USA.
Exposome. 2023;3(1). doi: 10.1093/exposome/osad010. Epub 2023 Nov 14.
The scale of the human exposome, which covers all environmental exposures encountered from conception to death, presents major challenges in managing, sharing, and integrating the many relevant data types and available data sets for the benefit of exposomics research and public health. By addressing these challenges, the exposomics research community will be able to greatly expand its ability to aggregate study data for new discoveries, construct and update novel exposomics data sets for building artificial intelligence and machine learning-based models, rapidly survey emerging issues, and advance the application of data-driven science. The diversity of the field, which spans multiple scientific disciplines and subfields as well as different environmental contexts, necessitates adopting data federation approaches to bridge numerous geographically and administratively separated data resources that vary in their usage, privacy, access, analysis, and discoverability capabilities and constraints. This paper presents use cases, challenges, opportunities, and recommendations for the exposomics community to establish and mature a federated exposomics data ecosystem.