Institute of Systems Analysis and Computer Science "Antonio Ruberti" (IASI), National Research Council (CNR), Via dei Taurini 19, 00185, Rome, Italy.
Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, 9500 Euclid Avenue, 44195, Cleveland, Ohio, USA.
BMC Med Inform Decis Mak. 2023 Aug 8;23(1):153. doi: 10.1186/s12911-023-02211-6.
Recent advances in biotechnology and computer science have led to an ever-increasing availability of public biomedical data distributed across large databases worldwide. However, these data collections are far from being "standardized" enough to be harmonized, let alone integrated, which makes it impossible to fully exploit the latest machine learning technologies for analyzing them. Handling this huge flow of biomedical data is therefore a challenging task for researchers and clinicians, owing to its complexity and high heterogeneity. This is the case for neurodegenerative diseases, and Alzheimer's Disease (AD) in particular, for which specialized data collections such as that of the Alzheimer's Disease Neuroimaging Initiative (ADNI) are maintained.
Ontologies are controlled vocabularies that represent the semantics of data and their relationships in a given domain. They are often used to aid knowledge and data management in healthcare research. Computational ontologies result from combining data management systems with traditional ontologies. Our approach is i) to define a computational ontology representing a logic-based formal conceptual model of the ADNI data collection and ii) to provide a means for populating the ontology with the actual ADNI data. Together, these two components make it possible to query the ADNI database semantically, supporting data extraction in a more intuitive manner.
We developed i) a detailed computational ontology for clinical multimodal datasets from the ADNI repository, in order to simplify access to these data, and ii) a means for populating this ontology with the actual ADNI data. This computational ontology facilitates complex queries over the ADNI files, making it possible to obtain new diagnostic knowledge about Alzheimer's disease.
The proposed ontology will improve access to the ADNI dataset by allowing queries that extract multivariate datasets for multidimensional and longitudinal statistical analyses. Moreover, it is a candidate for supporting the design and implementation of new information systems for collecting and managing AD data and metadata, and for serving as a reference point for harmonizing or integrating data residing in different sources.
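The two components described above, a formal conceptual model and a step that populates it with data, can be illustrated with a minimal sketch. It is not the authors' ontology or tooling: the class names, property names, and sample rows below are hypothetical, and a plain in-memory set of triples stands in for a real ontology language and reasoner.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical conceptual model: these class names are illustrative only.
SCHEMA: Set[Triple] = {
    ("Diagnosis", "subClassOf", "ClinicalAssessment"),
    ("CognitiveTest", "subClassOf", "ClinicalAssessment"),
}

# Hypothetical flat rows standing in for an exported data file.
ROWS: List[Dict[str, str]] = [
    {"subject": "S001", "visit": "bl", "kind": "Diagnosis", "value": "MCI"},
    {"subject": "S001", "visit": "m12", "kind": "Diagnosis", "value": "AD"},
    {"subject": "S002", "visit": "bl", "kind": "CognitiveTest", "value": "27"},
]


def populate(rows: Iterable[Dict[str, str]]) -> Set[Triple]:
    """Turn each flat row into triples about one assessment node."""
    triples: Set[Triple] = set()
    for i, row in enumerate(rows):
        node = f"assessment{i}"
        triples.add((node, "type", row["kind"]))
        triples.add((node, "ofSubject", row["subject"]))
        triples.add((node, "atVisit", row["visit"]))
        triples.add((node, "hasValue", row["value"]))
    return triples


def match(triples: Iterable[Triple], s: Optional[str] = None,
          p: Optional[str] = None, o: Optional[str] = None) -> List[Triple]:
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def instances_of(triples: Set[Triple], schema: Set[Triple], cls: str) -> Set[str]:
    """Instances of cls, including instances of its (transitive) subclasses."""
    classes, frontier = {cls}, [cls]
    while frontier:
        current = frontier.pop()
        for sub, _, _ in match(schema, p="subClassOf", o=current):
            if sub not in classes:
                classes.add(sub)
                frontier.append(sub)
    return {s for c in classes for s, _, _ in match(triples, p="type", o=c)}


def diagnoses_of(triples: Set[Triple], subject: str) -> List[str]:
    """Sorted diagnosis values recorded for one subject across visits."""
    nodes = {s for s, _, _ in match(triples, p="ofSubject", o=subject)}
    diagnoses = nodes & {s for s, _, _ in match(triples, p="type", o="Diagnosis")}
    return sorted(o for n in diagnoses for _, _, o in match(triples, s=n, p="hasValue"))


DATA = populate(ROWS)
```

Asking for `instances_of(DATA, SCHEMA, "ClinicalAssessment")` returns all three assessment nodes even though no row is labeled with that class, which is the kind of model-driven retrieval a flat file lookup does not give directly.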