Kreuzthaler Markus, Brochhausen Mathias, Zayas Cilia, Blobel Bernd, Schulz Stefan
Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, Graz, Austria.
Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR, United States.
Front Med (Lausanne). 2023 Mar 15;10:1073313. doi: 10.3389/fmed.2023.1073313. eCollection 2023.
This paper provides an overview of the current linguistic and ontological challenges that must be met to fully support the transformation of health ecosystems toward precision medicine (5 PM) standards. It highlights standardization and interoperability aspects of formal, controlled representations of clinical and research data, as well as the requirements for smart support in producing and encoding content so that both humans and machines can understand and process it. Starting from the current text-centered communication practices in healthcare and biomedical research, it addresses the state of the art in information extraction using natural language processing (NLP). An important aspect of the language-centered perspective on managing health data is the integration of heterogeneous data sources that employ different natural languages and different terminologies. This is where biomedical ontologies, in the sense of formal, interchangeable representations of types of domain entities, come into play. The paper discusses the state of the art of biomedical ontologies, addresses their importance for standardization and interoperability, and sheds light on current misconceptions and shortcomings. Finally, the paper points out next steps and possible synergies between the field of NLP and the area of Applied Ontology and the Semantic Web to foster data interoperability for 5 PM.