Department of Preventive Medicine, Seoul National University College of Medicine, Seoul, Korea.
Cancer Research Institute, Seoul National University College of Medicine, Seoul, Korea.
J Prev Med Public Health. 2022 Sep;55(5):464-474. doi: 10.3961/jpmph.22.299. Epub 2022 Sep 12.
We introduced the cohort studies included in the Korea Cohort Consortium (KCC), focusing on large-scale cohort studies established in Korea with a prolonged follow-up period. We also provided projections of the follow-up period and estimates of the sample size necessary for big-data analyses based on pooling established cohort studies, including population-based genomic studies.
We mainly focused on the characteristics of individual cohort studies from the KCC. We developed "PROFAN", a Shiny application for projecting the follow-up period to achieve a certain number of cases when pooling established cohort studies. As examples, we projected the follow-up periods for 5000 cases of gastric cancer, 2500 cases of prostate and breast cancer, and 500 cases of non-Hodgkin lymphoma. The sample sizes for sequencing-based analyses based on a 1:1 case-control study were also calculated.
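The projection idea can be illustrated with a minimal sketch. It is not the PROFAN implementation, which is not described here; it assumes a constant annual incidence rate, a fixed cohort size, and no attrition or competing risks. The function name `projected_follow_up_years` and the incidence value in the usage note are hypothetical and chosen only for illustration.

```python
def projected_follow_up_years(target_cases, cohort_size, incidence_per_100k):
    """Years of follow-up needed to accrue `target_cases` incident cases.

    Simplifying assumptions (not taken from the source): the incidence
    rate is constant over time, and the cohort size stays fixed (no
    attrition, no removal of participants once they become cases).
    """
    if target_cases <= 0 or cohort_size <= 0 or incidence_per_100k <= 0:
        raise ValueError("all inputs must be positive")
    # Expected number of new cases per year in the pooled cohort.
    expected_cases_per_year = cohort_size * incidence_per_100k / 100_000
    return target_cases / expected_cases_per_year


if __name__ == "__main__":
    # Hypothetical rate of 100 cases per 100 000 person-years.
    years = projected_follow_up_years(5000, 500_000, 100)
    print(f"Projected follow-up: {years:.1f} years")
```

Under these assumptions, a pooled cohort of 0.5 million participants with a hypothetical incidence of 100 per 100 000 person-years would need 10 years to accrue 5000 cases. A lower incidence or a smaller cohort lengthens the projection proportionally.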
The KCC consisted of 8 individual cohort studies, of which 3 were community-based and 5 were health screening-based cohorts. The population-based cohort studies were mainly organized by Korean government agencies and research institutes. At least 10 years of follow-up were projected to be needed to reach 5000 cases in a cohort of 0.5 million participants. The mean sample sizes required for sequencing analyses ranged from a minimum of 5917 to a maximum of 72 102.
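How a sample size for a 1:1 case-control analysis might be obtained can be sketched as follows. This uses the standard two-proportion formula without continuity correction. It is one common approach and is not confirmed to be the method used in the study. The function name `cases_needed`, the default significance level and power, and the carrier frequencies and odds ratios in the usage lines are all assumptions.

```python
import math
from statistics import NormalDist


def cases_needed(control_carrier_freq, odds_ratio, alpha=0.05, power=0.80):
    """Number of cases (equal to the number of controls) for a 1:1 design.

    Assumed setup: a binary exposure (for example, carrying a variant)
    with frequency `control_carrier_freq` among controls and a target
    odds ratio `odds_ratio`; two-sided test at level `alpha`.
    """
    p0 = control_carrier_freq
    if not 0 < p0 < 1 or odds_ratio <= 0 or odds_ratio == 1:
        raise ValueError("need 0 < frequency < 1 and an odds ratio other than 1")
    # Exposure frequency among cases implied by the odds ratio.
    p1 = odds_ratio * p0 / (1 + p0 * (odds_ratio - 1))
    p_bar = (p0 + p1) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p0 * (1 - p0) + p1 * (1 - p1))
    ) ** 2
    return math.ceil(numerator / (p1 - p0) ** 2)


if __name__ == "__main__":
    # Hypothetical inputs: rarer variants and weaker effects need more cases.
    for freq, oratio in [(0.05, 2.0), (0.05, 1.5), (0.01, 1.5)]:
        n = cases_needed(freq, oratio)
        print(f"freq={freq}, OR={oratio}: {n} cases and {n} controls")
```

The required number grows quickly as the variant becomes rarer or the odds ratio approaches 1, which is why the minimum and maximum sample sizes for sequencing analyses can differ by more than an order of magnitude.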
We propose an approach to establish a large-scale consortium based on the standardization and harmonization of existing cohort studies to obtain adequate statistical power with a sufficient sample size to analyze high-risk groups or rare cancer subtypes.