Stevens Robert, Jupp Simon, Klein Julie, Schanstra Joost
School of Computer Science, University of Manchester, Oxford Road, Manchester, United Kingdom.
Annu Int Conf IEEE Eng Med Biol Soc. 2011;2011:3708-11. doi: 10.1109/IEMBS.2011.6090629.
Data in biomedicine are characterised by their complexity, volatility and heterogeneity. It is these characteristics, rather than the size of the data, that make managing them for analysis difficult. Any significant data analysis task requires gathering data from many places, organising the relationships between the data's entities and, so that this organisation can take place, overcoming the problem of recognising the nature of each entity. It is the inter-relationship of these data and the semantic confusion inherent in them that make the data complex. On top of this, volatility in the domain's data, knowledge and experimental techniques makes processing data from the domain a distinct challenge, even before those data are organised. In this article we describe these challenges with respect to a project that is using data mining techniques to analyse data from the kidney and urinary pathway (KUP) domain. We are using Semantic Web technologies to manage the complexity and change in our data and we report on our experiences in this project.