Duda Stephany N, Cushman Clint, Masys Daniel R
Vanderbilt University, Nashville, USA.
Stud Health Technol Inform. 2007;129(Pt 1):449-53.
Pre-existing clinical research data sets exchanged in international epidemiology research often lack the elements needed to assess their suitability for use in multi-region meta-analyses or other clinical studies. While the missing information is generally known to local investigators, it is not contained in the files exchanged between sites. Instead, such content must be solicited by the study coordinating center through a series of lengthy phone and electronic communications: an informal process whose reproducibility and accuracy decay over time. This report describes a set of supplemental information needed to assess whether clinical research data from diverse research sites are truly comparable, and what metadata ("data about the data") should be preserved when a data set is archived for future use. We propose a structured Extensible Markup Language (XML) model that captures this information. The authors hope this model will be a first step towards preserving the metadata associated with clinical research data sets, thereby improving the quality of international data exchange, data archiving, and merged-data research using data collected in many different countries, languages and care settings.
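To make the idea of a structured XML metadata record concrete, the following is a minimal sketch only. The abstract does not list the elements of the proposed model, so every element name here (`datasetMetadata`, `source`, `careSetting`, `variables`, and so on), the helper `build_metadata`, and all sample values are hypothetical placeholders chosen for illustration, not the authors' schema.

```python
import xml.etree.ElementTree as ET


def build_metadata(site, country, language, care_setting, variables):
    """Build a hypothetical metadata record for one exchanged data set.

    All element names are illustrative assumptions; they are not taken
    from the model described in the paper.
    """
    root = ET.Element("datasetMetadata")

    # Context that local investigators know but that is usually absent
    # from the exchanged data files themselves.
    source = ET.SubElement(root, "source")
    ET.SubElement(source, "site").text = site
    ET.SubElement(source, "country").text = country
    ET.SubElement(source, "language").text = language
    ET.SubElement(source, "careSetting").text = care_setting

    # Per-variable documentation needed to judge cross-site comparability.
    variables_el = ET.SubElement(root, "variables")
    for var in variables:
        var_el = ET.SubElement(variables_el, "variable", name=var["name"])
        ET.SubElement(var_el, "definition").text = var["definition"]
        ET.SubElement(var_el, "units").text = var["units"]

    return root


# Placeholder values, invented purely for this example.
record = build_metadata(
    site="Example Clinic",
    country="Exampleland",
    language="en",
    care_setting="outpatient",
    variables=[
        {
            "name": "cd4_count",
            "definition": "CD4 cell count at enrollment",
            "units": "cells/mm3",
        }
    ],
)

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Keeping such a record alongside the data file is one way the site-specific context could travel with the data set instead of being reconstructed later through phone calls and email.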