Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA.
Department of Epidemiology, College of Public Health and Health Professions and College of Medicine, University of Florida, Gainesville, FL, USA.
Environ Res. 2021 Jun;197:111185. doi: 10.1016/j.envres.2021.111185. Epub 2021 Apr 24.
An individual's health and conditions are associated with a complex interplay between the individual's genetics and his or her exposures to both internal and external environments. Much attention has been placed on characterizing the genome in the past; nevertheless, genetics accounts for only about 10% of an individual's health conditions, while the remainder appears to be determined by environmental factors and gene-environment interactions. To comprehensively understand the causes of diseases and prevent them, environmental exposures, especially the external exposome, need to be systematically explored. However, the heterogeneity of external exposome data sources (e.g., the same exposure variable may be named differently in different data sources, or conversely, two variables may have the same or a similar name but measure different exposures) increases the difficulty of analyzing and understanding the associations between environmental exposures and health outcomes. To solve this issue, the development of semantic standards using an ontology-driven approach is inevitable because ontologies can (1) provide an unambiguous and consistent understanding of the variables in heterogeneous data sources, and (2) explicitly express and model the context of the variables and the relationships between those variables. We conducted a review of existing ontologies for the external exposome and found only four relevant ontologies. Further, the four existing ontologies are limited: they (1) often ignore the spatiotemporal characteristics of external exposome data, and (2) were developed in isolation from other conceptual frameworks (e.g., the socioecological model and the social determinants of health). Moving forward, combining multi-domain and multi-scale data (i.e., genome, phenome, and exposome at different granularities) with different conceptual frameworks will be the basis of health outcomes research.