Xu Hanfei, Schwander Karen, Brown Michael R, Wang Wenyi, Waken R J, Boerwinkle Eric, Cupples L Adrienne, de las Fuentes Lisa, van Heemst Diana, Osazuwa-Peters Oyomoare, de Vries Paul S, van Dijk Ko Willems, Sung Yun Ju, Zhang Xiaoyu, Morrison Alanna C, Rao D C, Noordam Raymond, Liu Ching-Ti
Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA.
Division of Statistical Genomics, Department of Genetics, Washington University School of Medicine, St. Louis, MO, USA.
Eur J Hum Genet. 2021 May;29(5):839-850. doi: 10.1038/s41431-021-00808-x. Epub 2021 Jan 26.
Recent studies use a lifestyle risk score (LRS), an aggregation of multiple lifestyle exposures, to identify gene-lifestyle interactions associated with disease traits. However, not all cohorts have data on all lifestyle factors, which increases heterogeneity in the environmental exposure in collaborative meta-analyses. We compared and evaluated four approaches (Naïve, Safe, Complete and Moderator Approaches) for handling this missingness in LRS-stratified meta-analyses under various scenarios. Compared to "benchmark" results with all lifestyle factors available for all cohorts, the Complete Approach, which included only cohorts with all lifestyle components, was underpowered due to its smaller sample size, and the Naïve Approach, which used all available data and ignored the missingness, produced slightly inflated results. The Safe Approach, which used all data in the LRS-exposed group but included only cohorts with all lifestyle factors available in the LRS-unexposed group, and the Moderator Approach, which handled missingness via moderator meta-regression, were both slightly conservative and yielded almost identical p values. We also evaluated the performance of the Safe Approach under different scenarios and observed that the larger the proportion of included cohorts without missingness, the closer the results were to the "benchmark" results. In conclusion, we generally recommend the Safe Approach, a straightforward and non-inflated approach, for handling heterogeneity among cohorts in LRS-based genome-wide interaction meta-analyses.
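The cohort-selection rules that distinguish the Naïve, Complete and Safe Approaches can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the `CohortResult` fields, the function names, the example numbers, and the use of a fixed-effect inverse-variance estimator are all assumptions made here, since the abstract does not specify the pooling method. The Moderator Approach (meta-regression) is not shown.

```python
import math
from dataclasses import dataclass


@dataclass
class CohortResult:
    """Stratum-specific summary statistic from one cohort (hypothetical layout)."""
    name: str
    complete_lrs: bool  # True if every lifestyle component was measured
    beta: float         # effect estimate in the stratum (made-up value)
    se: float           # standard error of beta (made-up value)


def select_cohorts(results, approach, stratum):
    """Return the cohorts that enter the meta-analysis for one LRS stratum."""
    if approach == "naive":
        # Use everything and ignore missing lifestyle components.
        return list(results)
    if approach == "complete":
        # Keep only cohorts with all lifestyle components, in both strata.
        return [r for r in results if r.complete_lrs]
    if approach == "safe":
        # All cohorts in the exposed stratum; complete cohorts only in the
        # unexposed stratum.
        if stratum == "exposed":
            return list(results)
        return [r for r in results if r.complete_lrs]
    raise ValueError(f"unknown approach: {approach}")


def inverse_variance_meta(results):
    """Fixed-effect inverse-variance pooling (assumed estimator)."""
    weights = [1.0 / r.se ** 2 for r in results]
    total = sum(weights)
    beta = sum(w * r.beta for w, r in zip(weights, results)) / total
    return beta, math.sqrt(1.0 / total)


# Made-up example: cohort C lacks at least one lifestyle component.
cohorts = [
    CohortResult("A", True, 0.10, 0.05),
    CohortResult("B", True, 0.20, 0.05),
    CohortResult("C", False, 0.30, 0.10),
]

safe_unexposed = select_cohorts(cohorts, "safe", "unexposed")
safe_exposed = select_cohorts(cohorts, "safe", "exposed")
beta_unexposed, se_unexposed = inverse_variance_meta(safe_unexposed)
beta_exposed, se_exposed = inverse_variance_meta(safe_exposed)
```

In this toy setup the Safe Approach pools all three cohorts in the exposed stratum but drops cohort C from the unexposed stratum, whereas the Complete Approach would drop C from both strata and the Naïve Approach would keep it in both.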