Environmental and Molecular Toxicology, Oregon State University, Corvallis, Oregon, USA.
Ronin Institute for Independent Scholarship, Montclair, New Jersey, USA.
Environ Health Perspect. 2020 Dec;128(12):125002. doi: 10.1289/EHP7215. Epub 2020 Dec 28.
A critical challenge in genomic medicine is identifying the genetic and environmental risk factors for disease. Currently, the available data link a majority of known human coding genes to phenotypes, but the environmental component of human disease is extremely underrepresented in these linked data sets. Without environmental exposure information, our ability to realize precision health is limited, even with the promise of modern genomics. Achieving integration of gene, phenotype, and environment will require extensive translation of data into a standard, computable form and the extension of the existing gene/phenotype data model. The data standards and models needed to achieve this integration do not currently exist.
Our objective is to foster the development of community-driven data-reporting standards and a computational model that will facilitate the inclusion of exposure data in computational analysis of human disease. To this end, we present a preliminary semantic data model, together with use cases and competency questions, as a basis for further community-driven model development and refinement.
The exposure science, epidemiology, and toxicology communities have a real desire to use informatics approaches to improve their research workflows, gain new insights, and increase data reuse. Critical to success is the development of a community-driven data model for describing environmental exposures and linking them to existing models of human disease. https://doi.org/10.1289/EHP7215.