Crawford Dana C, Goodloe Robert, Farber-Eger Eric, Boston Jonathan, Pendergrass Sarah A, Haines Jonathan L, Ritchie Marylyn D, Bush William S
Department of Epidemiology and Biostatistics, Institute for Computational Biology, Case Western Reserve University, Cleveland, Ohio, USA.
Hum Hered. 2015;79(3-4):137-46. doi: 10.1159/000381805. Epub 2015 Jul 28.
BACKGROUND/AIMS: Present-day limited resources demand DNA and phenotyping alternatives to the traditional prospective population-based epidemiologic collections.
To accelerate genomic discovery with an emphasis on diverse populations, we accessed, as part of the Epidemiologic Architecture for Genes Linked to Environment (EAGLE) study, all non-European American samples (n = 15,863) available in BioVU, the Vanderbilt University biorepository linked to de-identified electronic medical records, for genomic studies within the larger Population Architecture using Genomics and Epidemiology (PAGE) I study. Given that previous studies have cautioned against the secondary use of clinically collected data compared with epidemiologically collected data, we present here a characterization of EAGLE BioVU, including the billing and diagnostic (ICD-9) code distributions for adult and pediatric patients as well as comparisons made for select health metrics (body mass index, glucose, HbA1c, HDL-C, LDL-C, and triglycerides) with the population-based National Health and Nutrition Examination Surveys (NHANES) linked to DNA samples (NHANES III, n = 7,159; NHANES 1999-2002, n = 7,839).
Overall, the distributions of billing and diagnostic codes suggest this clinical sample is a mixture of healthy and sick patients, similar to that expected for a contemporary American population.
Little bias is observed among health metrics, suggesting this clinical collection is suitable for genomic studies along with traditional epidemiologic cohorts.