Gao Qin, Zhang Yan-lei, Xie Zhi-yun, Zhang Qi-peng, Hu Zhang-zhi
Peking University Medical Informatics Center, Beijing 100083, China.
Beijing Da Xue Xue Bao Yi Xue Ban. 2006 Apr 18;38(2):218-21.
A critical factor in the advancement of biomedical research is the ease with which data can be integrated, redistributed, and analyzed both within and across domains. This paper summarizes the biomedical information core infrastructure built by the National Cancer Institute Center for Bioinformatics (NCICB) in the United States. The main product of this infrastructure is caCORE (cancer Common Ontologic Reference Environment), the infrastructure backbone supporting data management and application development at NCICB. The paper explains the structure and function of caCORE's components: (1) Enterprise Vocabulary Services (EVS), which provide controlled vocabulary, dictionary, and thesaurus services and produce the NCI Thesaurus and the NCI Metathesaurus; (2) the Cancer Data Standards Repository (caDSR), which provides a metadata registry for common data elements; and (3) Cancer Bioinformatics Infrastructure Objects (caBIO), which provide Java, Simple Object Access Protocol (SOAP), and HTTP-XML application programming interfaces. The vision for caCORE is to provide a common data management framework that supports the consistency, clarity, and comparability of biomedical research data and information. In addition to providing facilities for data management and redistribution, caCORE helps solve problems of data integration. All NCICB-developed caCORE components are distributed under open-source licenses that support unrestricted usage by both non-profit and commercial entities, and caCORE has laid the foundation for a number of scientific and clinical applications. On this basis, the paper briefly describes caCORE-based applications in several NCI projects, including CMAP (Cancer Molecular Analysis Project) and caBIG (Cancer Biomedical Informatics Grid). Finally, the paper discusses the prospects for caCORE: although it was born out of the needs of the cancer research community, it is intended to serve as a general resource, and cancer research has historically contributed to many areas beyond tumor biology. The paper also offers suggestions for current biomedical informatics research in China.