Amin Waqas, Singh Harpreet, Dzubinski Lynda Ann, Schoen Robert E, Parwani Anil V
Department of Biomedical Informatics, University of Pittsburgh Medical Center, Pittsburgh, PA, USA.
J Pathol Inform. 2010 Oct 1;1:22. doi: 10.4103/2153-3539.70831.
The Early Detection Research Network (EDRN) colorectal and pancreatic neoplasm virtual biorepository is a bioinformatics-driven system that provides high-quality, clinicopathologically rich annotation for clinical biospecimens. This NCI-sponsored EDRN resource supports translational cancer research. The information model of this biorepository is based on three components: (a) common data elements (CDEs), (b) a robust data entry tool and (c) comprehensive data query tools.
The aim of the EDRN initiative is to develop and sustain a virtual biorepository in support of translational research. High-quality biospecimens were accrued and annotated with pertinent clinical, epidemiologic, molecular and genomic information. A user-friendly annotation tool and a query tool were developed for this purpose. The system has three components. The CDEs were developed from the College of American Pathologists (CAP) Cancer Checklists and North American Association of Central Cancer Registries (NAACCR) standards; they provide semantic and syntactic interoperability of the data sets by describing them in the form of metadata, or data descriptors. The data entry tool is a portable, flexible, Oracle-based web application that is easily mastered. The data query tool lets investigators search deidentified information within the warehouse through a "point and click" interface, so that only the selected data elements are copied from the warehouse's relational structure into a data mart with a dimensionally modeled structure.
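The copy of selected data elements from a relational warehouse into a data mart can be sketched as follows. This is a minimal illustration, not the EDRN implementation: it uses SQLite in place of Oracle, builds a single flat mart table and does not attempt the dimensional model, and every table name, column name and function (`patient_case`, `tumor_accession`, `ALLOWED_ELEMENTS`, `build_data_mart`, `demo`) is hypothetical.

```python
import sqlite3

# Hypothetical whitelist of deidentified data elements a user may select.
# Identifiers such as a medical record number are deliberately absent.
ALLOWED_ELEMENTS = {"age_group", "sex", "tumor_site", "histologic_type", "tumor_grade"}


def build_data_mart(conn, selected):
    """Copy only the selected data elements from the warehouse into a mart table."""
    unknown = [name for name in selected if name not in ALLOWED_ELEMENTS]
    if unknown:
        raise ValueError(f"not a queryable data element: {unknown}")
    # Column names come from the whitelist above, never from free user text.
    columns = ", ".join(selected)
    conn.execute("DROP TABLE IF EXISTS data_mart")
    conn.execute(
        f"CREATE TABLE data_mart AS SELECT {columns} "
        "FROM patient_case JOIN tumor_accession USING (case_id)"
    )


def demo():
    """Build a toy two-table warehouse and extract two data elements from it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE patient_case (
            case_id INTEGER PRIMARY KEY, mrn TEXT, age_group TEXT, sex TEXT);
        CREATE TABLE tumor_accession (
            accession_id INTEGER PRIMARY KEY,
            case_id INTEGER REFERENCES patient_case(case_id),
            tumor_site TEXT, histologic_type TEXT, tumor_grade TEXT);
        INSERT INTO patient_case VALUES
            (1, 'MRN-001', '60-69', 'F'), (2, 'MRN-002', '50-59', 'M');
        INSERT INTO tumor_accession VALUES
            (10, 1, 'colon', 'adenocarcinoma', 'G2'),
            (11, 2, 'pancreas', 'adenocarcinoma', 'G3');
        """
    )
    build_data_mart(conn, ["tumor_site", "tumor_grade"])
    return conn.execute("SELECT * FROM data_mart ORDER BY tumor_site").fetchall()
```

Restricting the query to a fixed whitelist of column names is one simple way to keep identifying fields out of the mart; a production system would likely drive this list from the CDE metadata instead of a hard-coded set.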
The EDRN Colorectal and Pancreatic Neoplasm Virtual Biorepository database contains multimodal datasets that are available to investigators via a web-based query tool. At present, the database holds 2,405 cases and 2,068 tumor accessions. Data disclosure is strictly regulated by each user's authorization. The high-quality, well-characterized biospecimens have been used in a range of translational research projects and in epidemiologic and genomic studies.
The EDRN Colorectal and Pancreatic Neoplasm Virtual Biorepository, with its translational biomedical informatics infrastructure, facilitates translational research. The data query tool acts as a central source and provides a mechanism for researchers to efficiently query clinically annotated datasets and biospecimens that are pertinent to their research areas. The tool protects patient health information by disclosing only deidentified data, in accordance with Institutional Review Board and Health Insurance Portability and Accountability Act protocols.
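Authorization-gated disclosure of deidentified data might look like the sketch below. The abstract does not describe how the tool enforces this, so the role names, field names and the `disclose` function are all assumptions made for illustration only.

```python
# Hypothetical identifying fields that are never disclosed through the query tool.
IDENTIFIER_FIELDS = {"name", "mrn", "date_of_birth"}

# Hypothetical mapping from a user's authorization level to viewable fields.
ROLE_FIELDS = {
    "investigator": {"tumor_site", "histologic_type", "tumor_grade"},
    "honest_broker": {"tumor_site", "histologic_type", "tumor_grade", "age_group", "sex"},
}


def disclose(record, role):
    """Return only the fields this role may see, with identifiers always removed."""
    # An unknown role maps to the empty set, so nothing is disclosed by default.
    allowed = ROLE_FIELDS.get(role, set()) - IDENTIFIER_FIELDS
    return {key: value for key, value in record.items() if key in allowed}


record = {
    "name": "Jane Doe",
    "mrn": "MRN-001",
    "age_group": "60-69",
    "sex": "F",
    "tumor_site": "colon",
    "histologic_type": "adenocarcinoma",
    "tumor_grade": "G2",
}
investigator_view = disclose(record, "investigator")
```

Subtracting `IDENTIFIER_FIELDS` after the role lookup means a misconfigured role cannot expose an identifier, and an unrecognized role receives an empty record.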