Weighting health-related estimates in the GCAT cohort and the general population of Catalonia.

Author information

Blay Natalia, Carrasco-Ribelles Lucía A, Farré Xavier, Iraola-Guzmán Susana, Danés-Castells Marc, Violán Concepción, de Cid Rafael

Affiliations

Genomes for Life-GCAT Lab, CORE Program, Germans Trias i Pujol Research Institute (IGTP), Badalona, Spain.

Universitat de Barcelona (UB), Barcelona, Spain.

Publication information

Sci Rep. 2025 May 16;15(1):16984. doi: 10.1038/s41598-025-01284-9.

Abstract

Population-based cohorts play a key role in personalized medicine. However, cohorts are known to be affected by "healthy volunteer bias": participants are generally healthier than the broader population, which compromises the cohort's representativeness. Here, we assess this bias in the GCAT cohort, which comprises 20,000 adult participants from Catalonia, identify key indicators of its representativeness, and generate raked survey weights to improve the cohort's comparability. To assess and correct the bias, we compare multiple variables across the sociodemographic, lifestyle, disease and medication domains. Electronic health records of Catalonia (SIDIAP), the Health Survey of Catalonia (ESCA) and registers from the statistics institutes of Catalonia (IDESCAT) and Spain (INE) were used for the comparisons. We observed that the GCAT cohort is enriched in women and younger individuals, people with higher socioeconomic status, and more health-conscious and healthier individuals in terms of mortality and chronic disease prevalence. Raked survey weighting identified sex, birth year, rurality, education level, civil status, occupation status, smoking habit, household size, self-perceived health status and number of primary care visits as key weighting variables. On average, raked weights reduced the differences by 70% for the compared variables and by 26% for disease prevalence estimates. We conclude that applying raked weights enhanced the cohort's representativeness, improved comparability, and yielded more precise estimates when analysing GCAT data.
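The abstract does not describe the software or the exact procedure used to compute the raked weights. As a rough illustration of the general technique, the sketch below implements raking (iterative proportional fitting) on two categorical variables. The function names `rake` and `weighted_share`, the toy sample and the target proportions are all hypothetical and are not GCAT or population values.

```python
import numpy as np

def weighted_share(codes, weights, n_levels):
    """Weighted proportion of the sample falling in each level of a variable."""
    totals = np.bincount(codes, weights=weights, minlength=n_levels)
    return totals / weights.sum()

def rake(categories, targets, max_iter=100, tol=1e-8):
    """Raking (iterative proportional fitting) on categorical margins.

    categories: dict of variable name -> integer-coded level per participant
    targets:    dict of variable name -> population proportion per level
    Assumes every level appears at least once in the sample.
    """
    n = len(next(iter(categories.values())))
    w = np.ones(n)
    for _ in range(max_iter):
        max_gap = 0.0
        for name, codes in categories.items():
            target = np.asarray(targets[name], dtype=float)
            current = weighted_share(codes, w, len(target))
            max_gap = max(max_gap, np.abs(current - target).max())
            # Rescale each participant's weight so this margin matches its target.
            w = w * (target / current)[codes]
        if max_gap < tol:
            break
    # Normalise so the weights sum to the sample size.
    return w * n / w.sum()

# Hypothetical toy sample of 100 participants, over-representing women and
# younger people (illustrative numbers only).
sex = np.array([1] * 60 + [0] * 40)                       # 0 = men, 1 = women
age = np.array([0] * 45 + [1] * 15 + [0] * 25 + [1] * 15)  # 0 = younger, 1 = older

# Hypothetical population margins.
targets = {"sex": [0.49, 0.51], "age": [0.50, 0.50]}

weights = rake({"sex": sex, "age": age}, targets)
print(weighted_share(sex, weights, 2))
print(weighted_share(age, weights, 2))
```

After raking, the weighted sample margins agree with the target margins, and a weighted prevalence for any binary outcome can be computed with `np.average(outcome, weights=weights)`. The study rakes on ten variables rather than two; the loop above extends to that case by adding entries to the two dictionaries.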


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/da6b/12081912/8d44142f267a/41598_2025_1284_Fig1_HTML.jpg
