Pathak Jyotishman, Kiefer Richard C, Chute Christopher G
Department of Health Sciences Research, Mayo Clinic, Rochester, MN.
AMIA Jt Summits Transl Sci Proc. 2012;2012:10-9. Epub 2012 Mar 19.
The ability to conduct genome-wide association studies (GWAS) has enabled new exploration of how genetic variations contribute to health and disease etiology. A key requirement for performing GWAS is the identification of subject cohorts with accurately classified disease phenotypes. In this work, we study how emerging Semantic Web technologies can be applied to clinical data stored in electronic health records (EHRs) to accurately identify subjects with specific diseases for inclusion in cohort studies. In particular, we demonstrate the use of the Resource Description Framework (RDF) for representing EHR data and for enabling federated querying and inferencing via standardized Web protocols to identify subjects with diabetes mellitus. Our study highlights the potential of Web-scale data federation approaches for executing complex queries.
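The idea of representing EHR data as RDF and joining it with a second source can be illustrated with a toy sketch. This is not the authors' implementation: the two in-memory triple sets merely stand in for separate RDF endpoints, and every URI, predicate name (`hasDiagnosis`, `subClassOf`), code value, and helper function (`match`, `find_cohort`) is a hypothetical chosen for illustration. In a real deployment the join would more likely be expressed as a SPARQL query against remote endpoints.

```python
# Toy illustration only: two in-memory triple sets stand in for separate
# RDF endpoints. All URIs, predicates, and code values are hypothetical.
EX = "http://example.org/"

# "Endpoint" A: diagnoses recorded per patient (made-up data).
diagnoses = {
    (EX + "patient/1", EX + "hasDiagnosis", EX + "code/250.00"),
    (EX + "patient/2", EX + "hasDiagnosis", EX + "code/401.9"),
    (EX + "patient/3", EX + "hasDiagnosis", EX + "code/250.00"),
}

# "Endpoint" B: a terminology source linking codes to disease classes.
terminology = {
    (EX + "code/250.00", EX + "subClassOf", EX + "DiabetesMellitus"),
    (EX + "code/401.9", EX + "subClassOf", EX + "Hypertension"),
}


def match(store, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def find_cohort(disease):
    """Join both sources: find codes under `disease`, then patients with them."""
    codes = {s for s, _, _ in match(terminology, p=EX + "subClassOf", o=disease)}
    patients = {
        s for s, _, o in match(diagnoses, p=EX + "hasDiagnosis") if o in codes
    }
    return sorted(patients)


cohort = find_cohort(EX + "DiabetesMellitus")
print(cohort)
```

The one-step `subClassOf` lookup is a very small stand-in for inferencing: patients are selected by the disease class their diagnosis code belongs to, not by a hard-coded list of codes.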