Department of Medical Informatics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, Jiangsu, China.
State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China.
BMC Med Inform Decis Mak. 2024 Jan 19;24(1):18. doi: 10.1186/s12911-023-02405-y.
To develop a Chinese Diabetes Mellitus Ontology (CDMO) and explore methods for constructing high-quality Chinese biomedical ontologies.
We built the CDMO from multiple data sources, including Chinese clinical practice guidelines, expert consensus statements, literature, and hospital information system database schemas. We combined top-down and bottom-up strategies and integrated text mining with cross-lingual ontology mapping. The ontology was validated by clinical experts and with ontology development tools, and its application was validated through clinical decision support and Chinese natural language medical question answering.
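As an illustration of what cross-lingual ontology mapping can involve, the following is a minimal sketch, not the authors' pipeline. It assumes a small hand-written Chinese-to-English translation table and a toy English ontology; the identifiers (`EX:0001` and so on), the `map_label` helper, and the 0.85 threshold are all hypothetical, and a plain string-similarity score stands in for whatever matching method was actually used.

```python
from difflib import SequenceMatcher

# Hypothetical toy data: none of these entries are taken from the CDMO itself.
ZH_TO_EN = {
    "2型糖尿病": "type 2 diabetes mellitus",
    "糖尿病肾病": "diabetic nephropathy",
    "胰岛素": "insulin",
}
ENGLISH_ONTOLOGY = {
    "EX:0001": "Type 2 Diabetes Mellitus",
    "EX:0002": "Diabetic Nephropathy",
    "EX:0003": "Insulin",
    "EX:0004": "Hypertension",
}


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def map_label(zh_label, threshold=0.85):
    """Return (term_id, score) for the best English match, or None."""
    translated = ZH_TO_EN.get(zh_label)
    if translated is None:
        return None  # no translation available, so no mapping is proposed
    best_id, best_score = None, 0.0
    for term_id, en_label in ENGLISH_ONTOLOGY.items():
        score = similarity(translated, en_label)
        if score > best_score:
            best_id, best_score = term_id, score
    return (best_id, best_score) if best_score >= threshold else None


# Candidate mappings that an expert would still need to review.
mappings = {zh: map_label(zh) for zh in ZH_TO_EN}
```

In a setup like this, the threshold trades recall for precision: a lower value proposes more candidate mappings for expert review, while a higher value keeps only near-exact label matches.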
The current CDMO consists of 3,752 classes, 182 fine-grained object properties with hierarchical relationships, 108 annotation properties, and over 12,000 mappings to other well-known English-language medical ontologies. Based on the CDMO and clinical practice guidelines, we developed 200 rules for diabetes diagnosis, treatment, diet, and medication recommendations using the Semantic Web Rule Language. Injecting CDMO knowledge into the T5 model improved its performance on a real-world Chinese medical question answering dataset related to diabetes.
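To make the rule-based component more concrete, here is a minimal Python sketch of how a single SWRL-style rule could behave. It is not one of the 200 CDMO rules: the class and property names (`Patient`, `fasting_plasma_glucose_mmol_l`, `MeetsFastingGlucoseCriterion`) are hypothetical, and the rule simply encodes the widely used fasting plasma glucose threshold of 7.0 mmol/L as an example of an antecedent-consequent pattern.

```python
from dataclasses import dataclass, field

# A hypothetical SWRL-style rule, written out for comparison:
#   Patient(?p) ^ hasFastingPlasmaGlucose(?p, ?v)
#     ^ swrlb:greaterThanOrEqual(?v, 7.0)
#     -> MeetsFastingGlucoseCriterion(?p)


@dataclass
class Patient:
    name: str
    fasting_plasma_glucose_mmol_l: float
    inferred: set = field(default_factory=set)  # classes inferred by rules


def rule_fasting_glucose(patient):
    """Antecedent: FPG >= 7.0 mmol/L. Consequent: add an inferred class."""
    if patient.fasting_plasma_glucose_mmol_l >= 7.0:
        patient.inferred.add("MeetsFastingGlucoseCriterion")


# Illustrative rule set; a real system would hold many such rules.
RULES = [rule_fasting_glucose]


def apply_rules(patient):
    """Run every rule once and return the set of inferred classes."""
    for rule in RULES:
        rule(patient)
    return patient.inferred
```

For example, `apply_rules(Patient("case-1", 7.4))` yields a set containing `MeetsFastingGlucoseCriterion`, while a patient with a value of 5.2 yields an empty set. In practice such rules would be evaluated by an OWL/SWRL reasoner over ontology individuals rather than by hand-written Python functions.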
CDMO has fine-grained semantic relationships and extensive annotation information, providing a foundation for medical artificial intelligence applications in Chinese contexts, including the construction of medical knowledge graphs, clinical decision support systems, and automated medical question answering. Furthermore, the development process incorporated natural language processing and cross-lingual ontology mapping to improve both the quality of the ontology and the efficiency of its development. This workflow offers a methodological reference for the efficient development of other high-quality medical ontologies in Chinese and other non-English languages.