Semantic Integration of Heterogeneous Data Sources Using Ontology-Based Domain Knowledge Modeling for Early Detection of COVID-19.

Authors

Thirumahal R, Sudha Sadasivam G, Shruti P

Affiliation

Department of Computer Science and Engineering, P.S.G College of Technology, Coimbatore, India.

Publication

SN Comput Sci. 2022;3(6):428. doi: 10.1007/s42979-022-01298-4. Epub 2022 Aug 6.

Abstract

The rapid growth of biomedical knowledge, the drive to reduce computation and processing costs, and the widespread availability of internet connectivity have produced a vast amount of electronic data. Such data are stored across the globe in data sources that differ semantically, structurally and syntactically. This decentralized nature of biomedical data makes it difficult to obtain a unified view of the data. Data integration plays a crucial role in improving access to heterogeneous data, making retrieval easier and faster. A variety of solutions based on ontologies, machine learning, deep learning and fuzzy logic are being developed for heterogeneous data integration. The proposed model focuses on an automatic ontology-based data integration method that can be effectively deployed and used in the healthcare domain. The model is divided into three phases. The first phase automatically maps the data and generates local ontologies across the heterogeneous data sources; the second phase combines the local ontology models developed in the first phase to create a root global schema mapping; and the third phase queries the diverse databases to retrieve semantically analogous records. The model is built on the medical records, chest X-ray details and COVID-19 symptom questionnaire data of various patients, distributed across three data sources (SQL, MongoDB and Excel). Based on these data, the patients at moderate or higher risk of developing serious illness from COVID-19 are retrieved.
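The three phases described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the in-memory lists merely stand in for the SQL, MongoDB and Excel sources, and every field name, concept name (such as `Patient.dyspnea`) and the scoring rule in `risk_level` is a hypothetical assumption chosen for illustration. A real system would presumably express the mappings as ontologies rather than plain dictionaries.

```python
from collections import defaultdict

# Hypothetical stand-ins for the three heterogeneous sources.
# Each source uses its own field names for the same patients.
SOURCES = {
    "sql": [  # medical records
        {"patient_id": 1, "age": 67, "has_diabetes": True},
        {"patient_id": 2, "age": 29, "has_diabetes": False},
        {"patient_id": 3, "age": 62, "has_diabetes": False},
    ],
    "mongo": [  # chest X-ray details
        {"pid": 1, "xray_finding": "opacity"},
        {"pid": 2, "xray_finding": "normal"},
        {"pid": 3, "xray_finding": "normal"},
    ],
    "excel": [  # symptom questionnaire
        {"PatientNo": 1, "BreathingDifficulty": "yes"},
        {"PatientNo": 2, "BreathingDifficulty": "no"},
        {"PatientNo": 3, "BreathingDifficulty": "yes"},
    ],
}

# Phase 1 (assumed form): a local mapping per source,
# from source-specific field names to shared concept names.
LOCAL_MAPPINGS = {
    "sql": {
        "patient_id": "Patient.id",
        "age": "Patient.age",
        "has_diabetes": "Patient.comorbidity",
    },
    "mongo": {"pid": "Patient.id", "xray_finding": "Patient.xrayFinding"},
    "excel": {"PatientNo": "Patient.id", "BreathingDifficulty": "Patient.dyspnea"},
}


def to_concepts(source, record):
    """Rename a record's fields to the shared concept names."""
    mapping = LOCAL_MAPPINGS[source]
    return {mapping[f]: v for f, v in record.items() if f in mapping}


def build_global_view(sources):
    """Phase 2: merge the locally mapped records into one view per patient."""
    merged = defaultdict(dict)
    for name, records in sources.items():
        for record in records:
            concepts = to_concepts(name, record)
            merged[concepts["Patient.id"]].update(concepts)
    return dict(merged)


def risk_level(patient):
    """Illustrative scoring rule only; not a clinical criterion."""
    score = 0
    score += int(patient.get("Patient.age", 0) >= 60)
    score += int(bool(patient.get("Patient.comorbidity")))
    score += int(patient.get("Patient.xrayFinding") == "opacity")
    score += int(patient.get("Patient.dyspnea") == "yes")
    if score >= 3:
        return "high"
    if score == 2:
        return "moderate"
    return "low"


def at_risk_patients(global_view):
    """Phase 3: query the unified view for moderate or high risk patients."""
    return sorted(
        pid for pid, p in global_view.items()
        if risk_level(p) in ("moderate", "high")
    )


if __name__ == "__main__":
    print(at_risk_patients(build_global_view(SOURCES)))  # prints [1, 3]
```

Keeping the per-source mappings separate from the merge step mirrors the split between local and global models in the abstract: a new source would only need its own mapping entry, while the query logic stays unchanged.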

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e88b/9362348/21f24c56f3f1/42979_2022_1298_Fig1_HTML.jpg
